Architectures of multiprocessor systems

Personal computers are adequate for many computing tasks. Some problems, however, require much higher performance. To obtain it with the existing component base, multiprocessor architectures are used, in which processing is parallelized and carried out simultaneously on several devices. There are three main architectures of multiprocessor systems: common-bus (simple multiprocessor), backbone, and matrix architectures.


The architecture of simple multiprocessor systems is based on a common bus. Two or more processors and one or more memory modules are attached to this bus. Before exchanging data with memory, a processor checks whether the bus is free. If it is, the processor occupies it; if it is busy, the processor waits until it is released. As the number of processors grows, system performance becomes limited by the bandwidth of the bus. To ease this bottleneck, each processor is given its own local memory (Fig. 1(a)), which holds the code of the programs it executes and the local variables it processes. Shared memory is used only for shared variables and shared system software. As a result, the load on the common bus is significantly reduced.
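The effect of local memory on bus load can be illustrated with a small model. This is only a sketch under stated assumptions: every shared-memory access occupies the bus for exactly one cycle, and the bus is granted in round-robin order. The text does not specify an arbitration policy, and the function name `simulate_bus` is hypothetical.

```python
def simulate_bus(pending):
    """Model a common bus that serves one processor per cycle.

    pending: number of shared-memory accesses each processor still needs.
    Returns (total_cycles, wait_cycles_per_processor).
    Round-robin arbitration is an assumption made for this sketch.
    """
    remaining = list(pending)
    n = len(remaining)
    waits = [0] * n
    cycles = 0
    next_cpu = 0
    while any(remaining):
        # Find the next processor (in round-robin order) that has a request.
        for offset in range(n):
            cpu = (next_cpu + offset) % n
            if remaining[cpu] > 0:
                break
        # That processor occupies the bus; every other requester waits.
        for other in range(n):
            if other != cpu and remaining[other] > 0:
                waits[other] += 1
        remaining[cpu] -= 1
        next_cpu = (cpu + 1) % n
        cycles += 1
    return cycles, waits


# Four processors, 100 memory accesses each, all going over the bus.
all_shared, _ = simulate_bus([100] * 4)
# Same workload, assuming 80% of accesses are served by local memory.
mostly_local, _ = simulate_bus([20] * 4)
print(all_shared, mostly_local)  # 400 80
```

In this model the bus is busy for as many cycles as there are shared accesses in total, so moving accesses into local memory shortens bus occupancy in direct proportion.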

One of the processors is dedicated to controlling the entire system. It distributes the tasks to be executed among the processors and controls the operation of the common bus.

The peripheral processor services the external devices, handling the input of data into the shared memory and its output from it. It can be of the same type as the other processors, but usually a specialized processor designed for controlling external devices is installed.

The backbone architecture is the most common one in high-performance computing systems. The processor of such a system has several functional processing devices that perform arithmetic and logical operations, and a fast register memory that holds the data being processed. Data read from memory is placed in registers and loaded from them into the processing devices. The results of the calculations are written back to registers and used as input for further calculations. This forms a data transformation pipeline: registers - processing devices - registers - ... The architecture of a backbone supercomputer is shown in Fig. 1(b). Six functional devices are shown there ("Addition", "Multiplication", etc.); in real systems their number may differ. The sequence scheduling device distributes the data stored in the registers to the functional devices and writes the results back to the registers. The final results of the calculations are written to the shared memory.
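The registers - processing devices - registers flow can be sketched in a few lines. The following is a minimal illustration, not a model of any real machine: it assumes only two functional devices, and the instruction format, the register names `r0`..`r4`, and the function `run` are all hypothetical.

```python
import operator

# Two functional devices stand in for the "Addition" and
# "Multiplication" units; a real system has more of them.
FUNCTIONAL_UNITS = {"add": operator.add, "mul": operator.mul}


def run(program, memory):
    """Play the role of the sequence scheduling device.

    Operands move memory -> registers -> functional device -> registers,
    and only final results are stored back to shared memory.
    """
    registers = {}
    for instruction in program:
        kind = instruction[0]
        if kind == "load":
            _, reg, address = instruction
            registers[reg] = memory[address]
        elif kind == "store":
            _, address, reg = instruction
            memory[address] = registers[reg]
        else:
            # Arithmetic: read two registers, write the result to a third.
            _, dst, src_a, src_b = instruction
            unit = FUNCTIONAL_UNITS[kind]
            registers[dst] = unit(registers[src_a], registers[src_b])
    return memory


# Compute y = (a + b) * c; the intermediate sum never leaves the registers.
PROGRAM = [
    ("load", "r0", "a"),
    ("load", "r1", "b"),
    ("load", "r2", "c"),
    ("add", "r3", "r0", "r1"),
    ("mul", "r4", "r3", "r2"),
    ("store", "y", "r4"),
]

result = run(PROGRAM, {"a": 2, "b": 3, "c": 4})
print(result["y"])  # 20
```

The sketch executes instructions one after another; in hardware the functional devices work concurrently on different operands, which is where the speed-up comes from.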

In the matrix architecture, processors are combined into a matrix of processing elements. The processing elements can be either universal processors with their own control device or simple computing units that contain only an ALU and execute the commands of an external control device. Each processing element has a local memory that stores the data it processes, and, if necessary, it can exchange data with its neighbors or with a common memory device. In the first case (universal processors), the programs and data of several tasks, or of independent parts of one task, are loaded into the local memories of the processors and executed in parallel. In the second case (ALU-only elements), all processing elements simultaneously execute the same command, which the command processing device sends to every element, but each applies it to the different data stored in its own local memory.
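The second variant, where one command is applied to different local data, can be sketched as follows. This is an illustrative model only: the class `ProcessingElement`, the function `broadcast`, and the two-command set are hypothetical, and a single value stands in for each element's local memory.

```python
import operator

COMMANDS = {"add": operator.add, "mul": operator.mul}


class ProcessingElement:
    """An ALU-only element: holds local data, obeys external commands."""

    def __init__(self, value):
        self.value = value  # stands in for the element's local memory

    def execute(self, command, operand):
        self.value = COMMANDS[command](self.value, operand)


def broadcast(matrix, command, operand):
    """The control device sends one command to every element."""
    for row in matrix:
        for element in row:
            element.execute(command, operand)


def values(matrix):
    return [[element.value for element in row] for row in matrix]


# A 2 x 2 matrix of elements, each with different local data.
matrix = [[ProcessingElement(v) for v in row] for row in [[1, 2], [3, 4]]]
broadcast(matrix, "mul", 10)
broadcast(matrix, "add", 1)
print(values(matrix))  # [[11, 21], [31, 41]]
```

The Python loop visits the elements one by one, whereas in the hardware described above all elements execute the command at the same time.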
