Getting aboard the PCI Express

PCI Express has emerged as the heir to PCI bus designs, bringing with it both backward compatibility and allowance for future enhancements.

As popular as it has become for embedded computing design, the PCI bus is faltering. Signal skew and fan-out restrictions limit the bandwidth achievable on a parallel bus and the PCI bus has reached that limit. Future system designs will have to rely on PCI Express (PCI-E), which has arisen to break the bandwidth barrier while maintaining software compatibility with existing PCI hardware.

PCI-E fixes the inability of conventional parallel buses to scale to today's processor speeds. A parallel bus such as PCI must keep many signal lines synchronized in order to reliably send information. Minor variations in signal trace layout and load capacitance from line to line affect signal-propagation times, resulting in skew among the signals. Bus timing has to accommodate this skew by positioning the clock signal within a time window that allows the other signal lines to stabilize at all points along the bus before the clock arrives.

Skew in a typical design is only a few nanoseconds but, as bus-clocking frequencies increased, those nanoseconds became an ever-larger portion of the bus's cycle time, shrinking the clock window. Skew also tends to increase with signal frequency, shrinking the window even further. Careful control of signal trace layout and loading can ease the situation, but this limits the design's ability to accommodate plug-in cards. For the 64-bit PCI bus, load-dependent skew generally limits designs to no more than two or three plug-in cards running at a clock rate under 300MHz.

PCI-E addresses this problem through a point-to-point, switched-serial topology, as shown in Figure 1. The main processor and memory connect to a root/host complex, which in turn provides individual serial links to switches, bridges, or endpoints. The serial links use 8b/10b encoding to provide self-clocking capability and differential signaling to operate at a signaling rate of 2.5Gbps, for a data rate of 250MBps (conventional PCI runs at 132MBps). Nonblocking switches establish the physical connections between communicating units as needed and allow a PCI-E system to accommodate either hard-wired or plug-in endpoints.
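The arithmetic behind those figures can be sketched briefly. This is an illustrative calculation, not from the article; the constant names are made up, but the rates are the ones given above.

```python
# Effective PCI-E lane bandwidth after 8b/10b encoding overhead.
LINE_RATE_GBPS = 2.5          # raw signaling rate per lane, in Gbit/s
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b: every 10 line bits carry 8 data bits

def lane_bandwidth_mbps():
    """Effective one-direction data rate of a single lane, in Mbyte/s."""
    data_bits_per_s = LINE_RATE_GBPS * 1e9 * ENCODING_EFFICIENCY
    return data_bits_per_s / 8 / 1e6  # bits -> bytes -> Mbyte/s

print(lane_bandwidth_mbps())  # 250.0, matching the article's figure
```

The 8b/10b overhead is the price of self-clocking: 20% of the raw bit rate goes to encoding, leaving the 250MBps per direction quoted above.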


The links in PCI-E are organized into 4-wire lanes, with one serial connection for each direction in each lane. This dual-unidirectional linking allows data transfers to take advantage of the full 250MBps serial connection bandwidth for both read and write operations simultaneously. PCI-E also allows the paralleling of lanes into channels, providing bandwidth scalability. Channel widths of x1, x2, x4, x8, x12, x16, and x32 lanes are allowed. A 64-wire PCI-E connection (16 lanes) would provide 8GBps of total bandwidth, far exceeding the limits of the comparable 64-bit PCI bus.
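The channel-width scaling works out as follows. This sketch uses the per-lane rate derived earlier and the width list above; the function names are illustrative, not from any PCI-E API.

```python
# Total (both-direction) channel bandwidth for the allowed PCI-E widths.
LANE_MBPS = 250                          # one direction, after 8b/10b
ALLOWED_WIDTHS = (1, 2, 4, 8, 12, 16, 32)

def channel_bandwidth_gbps(lanes):
    """Total bandwidth of a channel in Gbyte/s, counting both directions."""
    if lanes not in ALLOWED_WIDTHS:
        raise ValueError("PCI-E allows x1/x2/x4/x8/x12/x16/x32 channels only")
    # Each lane carries LANE_MBPS in each direction simultaneously.
    return lanes * LANE_MBPS * 2 / 1000

print(channel_bandwidth_gbps(16))  # 8.0, the article's 16-lane figure
```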

PCI compatibility
Despite its switched-serial topology, a PCI-E system is software compatible with existing PCI applications and drivers. Through the use of an appropriate bridge, PCI-E is able to use legacy PCI hardware as well. The key to this compatibility lies in the lower-layer hardware, which hides the underlying bus structure and mimics PCI bus behavior to the software. This concealment occurs at the Transaction, Link, and Physical layers.

The Transaction layer interacts with the system software to receive PCI memory-mapped read and write commands. It takes those commands and forms them into packets for transfer to the Link layer, as shown in Figure 2. Each packet contains unique identifiers and incorporates protocol information such as transaction type, recipient address, packet transfer size, and the like. Similarly, the Transaction layer receives response packets from the Link layer and uses the identifier to direct the response to the correct software element.
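A minimal sketch of the packet fields the article names (identifier, transaction type, recipient address, transfer size) might look like the following. The field names and layout are assumptions for illustration only; they are not the actual TLP header format.

```python
from dataclasses import dataclass

@dataclass
class TransactionPacket:
    # Field names are illustrative, not the real TLP header layout.
    tag: int          # unique identifier used to match responses to requests
    txn_type: str     # e.g. "mem_read" or "mem_write"
    address: int      # recipient address (32- or 64-bit memory space)
    length: int       # transfer size in bytes
    payload: bytes = b""

# A memory-mapped write from system software becomes a packet for the Link layer:
pkt = TransactionPacket(tag=7, txn_type="mem_write",
                        address=0x1000_0000, length=4,
                        payload=b"\xde\xad\xbe\xef")
```

When the response comes back, the Transaction layer uses `tag` to route it to the software element that issued the original request.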


The Transaction layer prioritizes the various transactions into traffic classes and maps them into Virtual Channels (VCs). The mapping may be to a single VC or to multiple VCs, depending on the system requirements and the endpoints involved. This mapping allows the Transaction layer to provide deterministic latencies, setting the stage for future Quality of Service controls in data handling. Each VC has a dedicated FIFO buffer and control logic in the Transaction layer hardware, allowing VCs to operate independently.
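A traffic-class-to-VC mapping can be sketched as a simple table. The class names here are made up; the point is that several classes may share one VC or map one-to-one, and each VC gets its own dedicated FIFO.

```python
from collections import deque

# Illustrative mapping: best-effort classes share VC0, latency-sensitive
# traffic gets its own VC (and therefore its own FIFO and control logic).
TC_TO_VC = {
    "bulk":        0,
    "control":     0,
    "low_latency": 1,
}

# One dedicated FIFO buffer per Virtual Channel, as in the Transaction layer.
vc_fifos = {vc: deque() for vc in set(TC_TO_VC.values())}

def enqueue(traffic_class, packet):
    vc_fifos[TC_TO_VC[traffic_class]].append(packet)

enqueue("low_latency", "pkt-A")  # lands in VC1's FIFO, independent of VC0
```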

In its interface to the system software, the Transaction layer supports all three PCI address spaces—memory, I/O, and configuration space—and handles both 32-bit and extended 64-bit memory addressing. That means it's able to fully mimic the load/store architecture and flat memory space of PCI. The Transaction layer also includes a Message Space, which PCI-E uses to handle all the sideband signals of the PCI bus. Sideband signals include interrupts, power-management requests, and reset commands. The Message space, in essence, provides “virtual wires” to replace these signals.

The Transaction layer is where the PCI-E node types (root complex, bridge, switch, and endpoint) manifest their differences. The root complex handles transactions on behalf of the processor. Bridges connect other bus structures to the PCI-E fabric. Forward bridges, for instance, connect to legacy PCI subsystems while reverse bridges make PCI-E a subsystem to bus masters, such as a PCI host bridge.

Switches in PCI-E provide the arbitration needed when multiple transactions contend for the same resources. Both port and VC arbitration reside with the switch. Port arbitration handles the situation of packets on different ports simultaneously entering the switch and may be based on plain, weighted, or time-based round-robin schemes. VC arbitration handles competing packets in VC buffers attached to the switch and may be based on round-robin, weighted round-robin, or strict-priority schemes.
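One of the VC arbitration schemes mentioned above, weighted round-robin, can be sketched in a few lines. The weights and VC numbers are illustrative.

```python
from itertools import cycle

def weighted_round_robin(weights):
    """Yield VC ids in proportion to their weights, e.g. {0: 1, 1: 2}."""
    schedule = [vc for vc, w in sorted(weights.items()) for _ in range(w)]
    return cycle(schedule)

# VC1 carries higher-priority traffic, so it gets two grants per VC0 grant.
arb = weighted_round_robin({0: 1, 1: 2})
order = [next(arb) for _ in range(6)]
print(order)  # [0, 1, 1, 0, 1, 1]
```

Plain round-robin is the special case where every weight is 1; strict priority always drains the highest-priority non-empty VC first.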

Common structure
Although Transaction layers differ, the Link and Physical layers for PCI-E interfaces are essentially identical. The Link layer serves to ensure reliable delivery of data packets across PCI-E channels; it contains a data-link control and management state machine. The Link layer hardware starts a transaction by attaching a packet sequence number and a cyclic redundancy check (CRC) value to the data packet. The CRC allows the receiving Link layer to detect transmission errors. In the event that a data error occurs, the transmitting Link layer's state machine can automatically resend packets. The sequence number allows the hardware on the receiving end to properly reassemble data blocks even if they arrive out of order because of such resending.
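The add-sequence-and-CRC step can be sketched as follows. This is a hedged illustration: real PCI-E defines its own specific CRC, and `zlib.crc32` here is just a stand-in to show the mechanism.

```python
import struct
import zlib

def frame(seq, payload):
    """Attach a 16-bit sequence number and a 32-bit CRC to a packet."""
    body = struct.pack(">H", seq) + payload
    return body + struct.pack(">I", zlib.crc32(body))

def receive(framed):
    """Return (seq, payload) if the CRC checks out, else None (ask resend)."""
    body, crc = framed[:-4], struct.unpack(">I", framed[-4:])[0]
    if zlib.crc32(body) != crc:
        return None           # transmission error detected: trigger a resend
    return struct.unpack(">H", body[:2])[0], body[2:]
```

On a CRC mismatch the receiver discards the packet and the transmitter's state machine resends it; the recovered sequence number lets the receiver put resent packets back in order.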

The Link layer also implements an interesting credit-based flow-control protocol. This protocol ensures that packets get transmitted only when the corresponding VC buffer at the receiving end is able to handle the packet. In the event the buffer is not ready, the Link layer is free to send a lower-priority packet to another destination.
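The credit mechanism can be sketched in a few lines. All names here are illustrative: the receiver's VC buffer advertises credits, and the transmitter sends only while credits remain.

```python
class CreditLink:
    """Toy model of credit-based flow control on one Virtual Channel."""

    def __init__(self, credits):
        self.credits = credits      # advertised by the receiving VC buffer

    def try_send(self, packet):
        if self.credits == 0:
            return False            # buffer full: send elsewhere instead
        self.credits -= 1           # consume one credit per packet sent
        return True

    def credit_returned(self):
        self.credits += 1           # receiver drained a packet from its FIFO
```

When `try_send` fails, the Link layer is free to service a lower-priority packet bound for a different destination rather than stall, which is exactly the behavior described above.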

In addition to handling the packets moving to and from the Transaction layer, the Link layer generates packets of its own. These Link layer packets handle link management functions such as flow-control information, Transaction layer packet acknowledgements, and power management. The Link layer packets are generated and consumed at the Link layer, and don't affect higher layer operations.

The Physical layer is where PCI-E components interact electrically. It has three sub-layers: a media-access controller (MAC), a physical coding sub-layer (PCS), and the physical-medium attachment. The MAC provides state machines for link training, initialization, signal rate and lane-width negotiations, and lane-to-lane de-skew. Hardware at this sub-layer “stripes” data bytes across the available lanes for transmission, as shown in Figure 3, and reassembles them at the receiving end.
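Byte striping amounts to dealing consecutive bytes round-robin across the lanes. A minimal sketch, with made-up function names:

```python
def stripe(data, lanes):
    """Deal consecutive bytes round-robin across `lanes` serial lanes."""
    return [data[i::lanes] for i in range(lanes)]

def unstripe(streams):
    """Reassemble the original byte order at the receiving end."""
    out = bytearray()
    for i in range(max(len(s) for s in streams)):
        for s in streams:
            if i < len(s):
                out.append(s[i])
    return bytes(out)

streams = stripe(b"ABCDEFGH", 4)
print(streams)  # [b'AE', b'BF', b'CG', b'DH']: lane 0 gets bytes 0 and 4, etc.
```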


The PCS sub-layer implements the 8b/10b encoding and decoding and provides clock recovery. The PCS also provides elasticity buffers to allow rate matching between lanes. The physical medium attachment layer simply implements the high-speed differential signaling that forms the backbone of PCI-E. The layer tolerates hot-swapped connections and can detect the presence of a receiver, forming the basis for PCI-E to implement live insertion with plug-and-play response to the addition and removal of endpoints.

Easing transitions, upgrades
This sub-layer approach to the Physical layer has the advantage of isolating the Link layer from the details of the physical interface, so PCI-E could theoretically adopt other signaling and coding schemes without affecting any of the higher layers. It is part of PCI-E's overall ability to isolate system software from its underlying hardware, and that ability is what makes transitioning a design from PCI to PCI Express relatively easy. All applications and operating-system software will transfer to a new design unaltered. Even device drivers and legacy hardware can be reused if the new system design incorporates a PCI-to-PCI-E bridge. Legacy hardware, however, is still bound by its original PCI bus bandwidth limitations, even if the bridge provides much higher capability.

The bandwidth advantages of PCI-E don't become fully available until new PCI-E system elements come into play. PCI-E also offers some software enhancements that new system designs can employ. One of the most important is the ability to assign attributes to transactions that will result in special handling by the system. These attributes include priority coding, which helps ensure that mission-critical data will get preferential treatment in the system. In addition, PCI-E supports an optional isochronous mode, which guarantees timely delivery of data for real-time applications.

The building blocks for PCI-E are now widely available as cores, components, and boards. Cores for host/bridge root complexes, memory controllers, I/O controllers, and endpoint controllers, for instance, can be obtained for a variety of ASIC and FPGA technologies. Among the FPGAs, devices both with and without integrated physical-medium attachment cores are available, offering a choice between an integrated multilane approach and a low-cost component-based single lane, depending on system needs.

Standalone components are also on the market. Many, such as the numerous PCI-E graphics controllers, target PC designs rather than embedded systems. Still, there are switches, bridges (PCI and PCI-X), Fibre Channel and other high-speed bus controllers, and I/O controller devices that have applicability beyond just the PC. A current listing of available PCI-E building blocks can be found at the PCI-SIG website, www.pcisig.com/developers/compliance_program/integrators_list.

As with components, PCI-E cards on the PCI-SIG website heavily favor PC applications but that's changing. Companies such as National Instruments have developed image-acquisition cards based on PCI-E. In addition, embedded computing board vendors such as Octagon and WinSystems are developing a series of small-form-factor boards that incorporate the PCI-E bus. These boards, called EPIC Express and shown in Figure 4, allow developers to connect boards by stacking them instead of using a backplane and plug-in cards. Boards following the EPIC Express standard are expected to emerge later this year.

In the meantime, more than enough components and development boards are available for system designers to begin creating their PCI-E upgrades. Host/Root controllers will allow creation of motherboards with PCI-E capability. By adding switches and bridge devices, developers can create a system that uses legacy PCI components and boards while waiting for comparable PCI-E products to become available.

The advantage of starting now is that the system will be poised to gain performance boosts as new technology becomes available. Once the entire system is PCI-E-based, the design will be ready to take advantage of new physical-medium interfaces, such as 10-Gbps devices, as they become available. Boarding the PCI Express bus thus becomes an exercise in extending the performance of current systems while future-proofing them at the same time.

Rich Quinnell is a contributor to many publications including EDN, Test and Measurement World, and EE Product News, and has been covering electronic technology for more than 15 years, drawing on his experience as an embedded system designer for nearly as long. You may reach him at .
