Sub-microsecond interconnects: PCIe, RapidIO and other alternatives
As Moore’s Law has continued to drive the performance and integration of processors ever higher, the need for higher-speed interconnects has continued to grow as well. Today’s interconnects commonly sport speeds ranging from 1 to 40 Gigabits per second and have roadmaps leading to hundreds of gigabits per second.
In the race toward ever faster interconnects, several things are often left out of the discussion: the types of transactions supported, the latency and overhead of communications, and the topologies that can be easily supported.
We tend to think of all interconnects being created equal and having a figure of merit based solely on peak bandwidth. Reality is quite different. Much as there are different forms of processors that are optimized for general purpose, signal processing, graphics and communications applications, interconnects are also designed and optimized to solve different connectivity problems.
Typically an interconnect will solve the problems it was designed for very well and can be pressed into service to solve other problems, but it will be less efficient in these applications. It is instructive to review three important interconnects in this context. These interconnects are PCI Express in the Gen 2 and Gen 3 form, Ethernet in the increasingly popular 10 Gigabit form and the second generation and third generation RapidIO technology introduced in 2008.
Each of these technologies has moved to a multi-lane SerDes physical layer using 8B/10B line coding or, for the higher-speed offerings, more efficient encodings such as 64B/66B. While PCI Express and RapidIO offer interfaces wider than 4 lanes, these wider interfaces are not typically used across backplanes or over fiber or cable connections. The Gen 3 RapidIO standard extends the 64B/66B scheme of the 10G Ethernet KR standard with an extra polarity inversion bit (64B/67B) that guarantees continuing DC balance of the transmitted bitstream.
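To make the cost of each line code concrete, the short sketch below computes the fraction of transmitted bits that carry data: 80% for 8B/10B, about 97% for 64B/66B and about 95.5% for 64B/67B. The function name and the 4-lane, 5 Gbaud example link are illustrative assumptions, not figures taken from any of the standards.

```python
from fractions import Fraction

# Payload bits divided by transmitted bits for each line code named in the text.
LINE_CODES = {
    "8B/10B": Fraction(8, 10),
    "64B/66B": Fraction(64, 66),
    "64B/67B": Fraction(64, 67),
}


def effective_gbps(lane_rate_gbaud: float, lanes: int, code: str) -> float:
    """Return the data rate left after line-coding overhead, in Gbit/s."""
    return lane_rate_gbaud * lanes * float(LINE_CODES[code])


if __name__ == "__main__":
    for name, ratio in LINE_CODES.items():
        print(f"{name}: {float(ratio):.1%} of transmitted bits carry data")
    # Assumed example link: 4 lanes at 5 Gbaud per lane with 8B/10B coding.
    print(effective_gbps(5.0, 4, "8B/10B"), "Gbit/s before protocol overhead")
```

This accounts only for line coding. Packet headers, acknowledgements and flow control reduce the usable bandwidth further, and by different amounts for each protocol.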
The following table presents the typical bandwidth and lane configurations for PCI Express, RapidIO and 10 Gig Ethernet as used in processor connectivity applications.
This article will focus, not on the raw bandwidths of the interconnect technologies, but rather on the inherent protocol capabilities, supported topologies and latency design targets for each of these interconnects. By doing this we gain a better understanding of where it makes sense to use each technology.
PCI Express Transactions and Topology
PCI Express was designed in 2003 to connect peripheral devices, typically slave devices such as Ethernet NICs and graphics chips, to a main host processor. It was not designed as a processor-to-processor interconnect but rather as a serialized version of the PCI bus.
The acronym PCI stands for peripheral component interconnect. PCI Express retains the same programming model and approach to connectivity as PCI. Topologically, PCI Express can support a hierarchy of buses with a single root complex. PCI Express switches have explicit upward (towards the root complex) and downward (towards attached devices) directions. Switches are primarily designed to expand peripheral device connectivity in systems.
Natively, PCI Express does not support peer-to-peer processor connectivity, and using it for this purpose can be exceedingly complex. To build a multi-processor interconnect out of PCI Express, you must step beyond the base specification and create new mechanisms to map address spaces and device identifiers among multiple host or root processors. To date, none of the mechanisms proposed for this, namely Advanced Switching (AS), Non-Transparent Bridging (NTB) and Multi-Root I/O Virtualization (MR-IOV), has been commercially successful, nor do they support arbitrary topologies.
Unlike Ethernet or RapidIO, PCI Express is not a routable protocol. It defines a single large address space into which devices are mapped. The most common way to communicate across PCI Express is to perform load or store operations to addresses within the range associated with a specific device.
PCI Express bridge or switch devices must detect the targeted device by comparing the 32-bit or 64-bit address contained in the packet against a set of base and limit values and forward the packet to the device or downstream switch that is associated with the address contained in the packet.
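The base-and-limit comparison can be sketched in a few lines. This is a simplified model, not register-accurate: the Window class, the port numbers and the address ranges are hypothetical, and returning None to mean "no downstream match, hand the packet upstream" is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Window:
    """One base-and-limit pair associated with a downstream port."""
    port: int
    base: int
    limit: int  # inclusive upper bound of the address range


def route_by_address(windows: List[Window], address: int) -> Optional[int]:
    """Return the downstream port whose window contains the address."""
    for window in windows:
        if window.base <= address <= window.limit:
            return window.port
    return None  # no downstream window matched


if __name__ == "__main__":
    # Hypothetical address map for a two-port switch.
    windows = [
        Window(port=1, base=0x8000_0000, limit=0x8FFF_FFFF),
        Window(port=2, base=0x9000_0000, limit=0x9FFF_FFFF),
    ]
    print(route_by_address(windows, 0x9000_1000))
```

Note that the packet carries no destination identifier at all. The switch infers the destination purely from where the address falls in the map, which is why every device in the tree must agree on a single address map.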
A separate ID routing scheme is also supported, in which devices are identified by bus number, device number and function number. This ID routing scheme is typically used for configuration and for message-based communication; it is not useful for transferring data. The bus, device and function numbers used for ID routing, like the address space allocations, are assigned during system bring-up and discovery.
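As an illustration of the identifier itself, the bus, device and function numbers are conventionally packed into a single 16-bit value: 8 bits of bus, 5 bits of device and 3 bits of function. The helper names below are hypothetical; only the field widths come from the PCI convention.

```python
from typing import Tuple


def pack_bdf(bus: int, device: int, function: int) -> int:
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into 16 bits."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("bus, device or function number out of range")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(bdf: int) -> Tuple[int, int, int]:
    """Split a 16-bit identifier back into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7


def format_bdf(bdf: int) -> str:
    """Render the identifier in the common bb:dd.f hexadecimal notation."""
    bus, device, function = unpack_bdf(bdf)
    return f"{bus:02x}:{device:02x}.{function:x}"


if __name__ == "__main__":
    print(format_bdf(pack_bdf(3, 0, 1)))  # prints 03:00.1
```

With only 8 bits of bus number, a single hierarchy is limited to 256 buses, all of which are numbered by whichever host performs discovery.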
PCI Express packet routing uses three different algorithms, depending on the packet type. All of the algorithms assume that the system has a tree topology, with a root complex at the top and a global address map managed by the root complex:
- Address based: Base-and-limit registers associate address ranges with ports on a PCIe switch. There are three to six sets of base-and-limit registers for each switch port.
- ID based: Each PCIe switch port has a range of bus numbers associated with it. Packets are routed according to their bus number, device number, and function number.
- Implicit: PCIe Message packets make use of “implicit” routing, where the routing is determined by the message type.
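The three algorithms can be put side by side in a small model of a switch. This is a sketch under several assumptions: the Port and Packet classes are hypothetical, each port is given a single address window rather than the three to six register sets mentioned above, and the three message names stand in for the larger set of implicit routing types in the specification.

```python
from dataclasses import dataclass
from typing import List, Optional

UPSTREAM = "upstream"  # stands for the port facing the root complex


@dataclass(frozen=True)
class Port:
    name: str
    mem_base: int         # address window, inclusive bounds
    mem_limit: int
    secondary_bus: int    # first bus number reachable through this port
    subordinate_bus: int  # last bus number reachable through this port


@dataclass(frozen=True)
class Packet:
    routing: str  # "address", "id" or "implicit"
    address: Optional[int] = None
    bus: Optional[int] = None
    message: Optional[str] = None


def route(ports: List[Port], pkt: Packet) -> List[str]:
    """Return the egress port names for a packet arriving at the switch."""
    if pkt.routing == "address":
        hits = [p.name for p in ports if p.mem_base <= pkt.address <= p.mem_limit]
        return hits[:1] or [UPSTREAM]
    if pkt.routing == "id":
        hits = [p.name for p in ports
                if p.secondary_bus <= pkt.bus <= p.subordinate_bus]
        return hits[:1] or [UPSTREAM]
    if pkt.routing == "implicit":
        if pkt.message == "to_root":
            return [UPSTREAM]
        if pkt.message == "broadcast_from_root":
            return [p.name for p in ports]
        if pkt.message == "local":
            return []  # terminates at the receiving port
        raise ValueError(f"unknown message type: {pkt.message}")
    raise ValueError(f"unknown routing type: {pkt.routing}")


if __name__ == "__main__":
    ports = [
        Port("p1", 0x8000_0000, 0x8FFF_FFFF, secondary_bus=2, subordinate_bus=4),
        Port("p2", 0x9000_0000, 0x9FFF_FFFF, secondary_bus=5, subordinate_bus=9),
    ]
    print(route(ports, Packet("address", address=0x9000_0010)))
    print(route(ports, Packet("id", bus=3)))
    print(route(ports, Packet("implicit", message="broadcast_from_root")))
```

Every branch falls back on the notion of "upstream", which only makes sense in a tree with one root. That is the assumption that makes arbitrary topologies difficult.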
PCIe has evolved to support “non-transparent bridging”, which allows separate root complexes to send transactions to each other. Typically, non-transparent bridging requires packet addresses and bus/device/function numbers to be translated in order to resolve conflicts between the global address maps of the different root complexes. There are no standards for implementing this translation capability.
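Because no standard defines the translation, any example is necessarily hypothetical. The sketch below shows one plausible form of the address half of the problem: a window in the sending host's address map is mapped onto a region of the receiving host's map by a fixed offset. The NtbWindow class, its fields and the addresses are all assumptions made for illustration, and the translation of bus/device/function numbers is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NtbWindow:
    """Hypothetical translation window through a non-transparent bridge."""
    local_base: int   # where the window appears in the sending host's map
    size: int         # window length in bytes
    remote_base: int  # where the same bytes live in the receiving host's map

    def translate(self, local_addr: int) -> int:
        """Map an address in the local window to the remote address space."""
        offset = local_addr - self.local_base
        if not 0 <= offset < self.size:
            raise ValueError("address outside the non-transparent window")
        return self.remote_base + offset


if __name__ == "__main__":
    # Assumed example: a 1 MiB window onto the other host's memory.
    window = NtbWindow(local_base=0x9000_0000, size=0x10_0000,
                       remote_base=0x2_0000_0000)
    print(hex(window.translate(0x9000_1000)))  # prints 0x200001000
```

Each pair of communicating hosts needs windows like this configured in both directions, so the amount of setup grows quickly as more root complexes are added.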