Direct memory access (DMA) technology has been around for more than 20 years. DMA has been used principally to offload memory accesses (reading and/or writing) from the CPU in order to enable the processor to focus on computational tasks and increase the performance of embedded and other system designs.
Traditionally, many components have included a DMA engine, among them microprocessors (CPUs), disk drive controllers, graphics processors and various endpoints. In all of these devices, the DMA engine transfers data between memory and I/O devices without involving the CPU core.
DMA is also used for intra-chip data transfer in the increasingly popular multi-core processors, especially in multiprocessor system-on-chip applications. As will be shown later, each processing element is equipped with a local memory (often called scratchpad memory), and DMA is used to transfer data between the local memory and the main memory.
DMA is crucial for system input/output (I/O). Without it, the CPU must use programmed input/output (PIO) mode to communicate with peripheral devices, or load/store instructions in the case of multi-core chips. In either case the CPU is typically fully occupied for the entire duration of the read or write operation, and is thus unavailable to perform other, more important computational tasks.
With DMA, however, the CPU initiates the transfer, performs other operations while the transfer is in progress, and receives an interrupt from the DMA controller once the operation completes. This is especially useful in real-time computing applications, in which it is critical that the processor's primary job does not stall behind concurrent operations.
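The initiate, continue, and get-notified pattern described above can be imitated in software. The following is only a rough sketch in Python, assuming a worker thread as a stand-in for the DMA controller and a callback as a stand-in for the completion interrupt. The names SimulatedDmaEngine, start_transfer and run_demo are hypothetical and do not correspond to any real driver API.

```python
import threading


class SimulatedDmaEngine:
    """Toy stand-in for a DMA controller (hypothetical, not a real driver API)."""

    def start_transfer(self, src, dst, on_complete):
        def worker():
            # The "transfer": copy the data without involving the caller's thread.
            dst[:len(src)] = src
            # Stand-in for the completion interrupt raised by the controller.
            on_complete()

        thread = threading.Thread(target=worker)
        thread.start()
        return thread


def run_demo():
    src = bytes(range(256)) * 16
    dst = bytearray(len(src))
    done = threading.Event()

    # 1. The "CPU" initiates the transfer and returns immediately.
    engine = SimulatedDmaEngine()
    worker = engine.start_transfer(src, dst, on_complete=done.set)

    # 2. The "CPU" performs unrelated computation while the copy is in flight.
    checksum = sum(i * i for i in range(10_000))

    # 3. The completion signal arrives. Real hardware would run an interrupt
    #    handler; this sketch simply waits on an event.
    done.wait()
    worker.join()
    return checksum, bytes(dst) == src
```

Calling run_demo() returns the result of the unrelated computation together with a flag that confirms the copy completed correctly. In a real system the main thread would not block on the event, since the interrupt handler would run asynchronously.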
Another related application area is stream processing, in its various forms, where data processing and transfers must proceed in parallel in order to achieve sufficient throughput.
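One common way to overlap transfers with processing is double buffering: while one chunk is being processed, the next is already being fetched. The sketch below illustrates the idea in Python under stated assumptions. A single-worker thread pool stands in for the DMA engine, and fetch_chunk, process_chunk and stream_sum are hypothetical names chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_chunk(source, index, size):
    """Stands in for a DMA transfer of one chunk into a local buffer."""
    return source[index * size:(index + 1) * size]


def process_chunk(chunk):
    """Stands in for the computation applied to each chunk."""
    return sum(chunk)


def stream_sum(source, size):
    n_chunks = (len(source) + size - 1) // size
    if n_chunks == 0:
        return 0

    total = 0
    # One worker thread plays the role of the DMA engine.
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(fetch_chunk, source, 0, size)
        for i in range(n_chunks):
            chunk = pending.result()  # wait for the current buffer to arrive
            if i + 1 < n_chunks:
                # Start the next transfer before processing the current chunk.
                pending = dma.submit(fetch_chunk, source, i + 1, size)
            total += process_chunk(chunk)  # compute while the next transfer runs
    return total
```

For example, stream_sum(list(range(100)), 16) walks through seven chunks and yields the same total as summing the list directly. The point of the structure is that each fetch is issued before the preceding chunk is processed.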
DMA Engines Extend PCI Express Performance
Over the years, DMA has taken a broad range of forms in board- and system-level designs. DMA is used in almost all applications and markets - among them storage, servers, communications, embedded, and industrial.
Because the underlying concept of faster data transfer is simple, all of these applications use DMA engines in their systems. Until now, the DMA engines have resided in the processors, the chipsets, or the endpoints.
Designers have made it clear to PLX and other leading component providers that DMA needs to play a more critical role in embedded-systems' interconnect schemes.
So a new and revolutionary concept for adding performance-enhancing DMA to a system can now be found in PCI Express (PCIe) switching devices featuring built-in DMA engines. DMA engines in a PCIe Gen 2 (5.0 GT/s) switch give embedded-systems designers more options:
1. Some processors do not have DMA, so the DMA in a PCIe switch covers the needs of system designers who use such processors.
2. Some processors have limited DMA implementations: their DMA engines can perform write functions but cannot perform read functions. Because the DMA engine in a PCIe switch performs both write and read functions, it can cover the needs of designers who use such processors.
3. Designers are often forced to use expensive, higher-power processors just for the DMA function. A DMA engine built into a PCIe switch lets them choose a cheaper, lower-power processor and rely on the DMA in the switch without compromising the price and quality of their system.
4. DMA built into a PCIe switch complements the DMA in a processor and/or endpoint, providing higher performance for those designers who wish to differentiate their systems.