In February 2007 the PCI-SIG approved the PCI Express External Cabling Specification 1.0 that defines how PCI Express can be implemented over a standard cable. This new capability allows the full bandwidth of the PCIe bus to be utilized within multiple chassis systems and small local networks.
Over the past year and a half, hundreds of applications have been implemented using PCIe over cable for high-speed I/O, bus expansion and local networking. In these applications, using PCIe provides the benefits of high performance, system simplicity and reduced costs.
Although PCIe is now well known as the PC backplane interface standard, it is much less known as a high-speed cabling interface. Previous parallel bus structures such as PCI couldn't easily be routed over a cable due to signal integrity problems. The serial technology and embedded clocking used within PCIe allow it to be used at full speed over backplanes or cables.
The cable specification defines four cable connectors, for x1, x4, x8 and x16 links, providing a wide range of price and performance (Figure 1). At the low end, the x1 cable provides a 2.5 Gbit/s interface over a low-cost cable. At the high end, a x16 cable provides 80 Gbit/s over a more expensive cable. Adapter modules such as those shown in Figure 2 support most of the various cable sizes and are available in a number of form factors supporting desktops, laptops, CompactPCI, CompactPCIe, VMEbus, AMC and XMC. Both the PCIe cables and adapter cards are now available from multiple sources.
Inside the Cable
The cabled version of PCIe contains the same signals as the backplane version of the bus structure. These include the high-speed differential pairs that transfer data as well as a number of additional signals known as sideband signals.
The differential pairs are organized into groupings called lanes. One PCIe lane consists of four wires: one transmit pair and one receive pair of signals. Each pair of wires is molded together, held closely side by side, and individually shielded. This helps ensure that the wires stay length-matched through the cable and that any incoming or radiated EMI is minimized. PCIe provides a wide range of performance by increasing the number of lanes within a link. A x1 link, pronounced "by one," includes only one lane. A x4 link includes four lanes, a x8 includes eight, and a x16 includes sixteen lane sets of wires.
The sideband signals provide additional functionality, but are not directly involved in the PCIe data transfers. In fact, it is possible to operate PCIe without any of the sideband signals present, although this would violate the specification. The sideband signals include:
• Reference Clock (CREFCLKp, CREFCLKn): 100 MHz reference clock, used to implement spread-spectrum clocking over the bus.
• Cable Present (CPRSNT#): Signals that the cable is connected between two systems.
• Platform Reset (CPERST#): Allows the upstream host to reset the downstream subsystem.
• Cable Power On (CPWRON#): Allows the upstream host to turn on the downstream subsystem's power.
• Sideband Return (SB_RTN): The electrical return for the sideband signals.
• 3.3V Power (+3.3V POWER, PWR_RTN): Provides power to the PCIe cable connector to power active components within the connector. Note that these power connections are not bused across the cable.
Thus there are 10 wires contained within a x1 cable, 22 wires within a x4 cable, 38 wires within a x8 cable, and 70 wires within a x16 cable.
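The wire counts above follow a simple pattern: four wires per lane plus six shared sideband wires (the power pins are not bused across the cable, so they add nothing to the total). A quick sketch, with Python used purely for illustration:

```python
# Sketch derived from the article's wire counts: 4 wires per lane
# (TX+/TX-, RX+/RX-) plus 6 sideband wires shared per link
# (CREFCLKp, CREFCLKn, CPRSNT#, CPERST#, CPWRON#, SB_RTN). The
# +3.3V power connections are not bused across the cable.

def cable_wires(lanes: int) -> int:
    """Total wires in a PCIe cable for the given lane count."""
    WIRES_PER_LANE = 4
    SIDEBAND_WIRES = 6
    return lanes * WIRES_PER_LANE + SIDEBAND_WIRES

# Matches the counts quoted in the article.
for lanes, expected in [(1, 10), (4, 22), (8, 38), (16, 70)]:
    assert cable_wires(lanes) == expected
```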
One can observe that there are no signals within PCIe for data clocking (clocking is embedded within each signal pair), transceiver direction (each signal is unidirectional), or arbitration (each link is point-to-point, so there is no arbitration). Each of these features helps enhance the performance of PCIe.
Gen 1 and Gen 2 Performance
One of the techniques used to create high performance with PCIe is the use of embedded clocking within each differential signal pair. With the clock embedded inside the signal stream, there is no distortion or time delay between the clock signal and the data. The clock rate used within each PCIe signal is either 2.5 GHz for Gen 1 or 5.0 GHz for Gen 2. PCIe interface components automatically detect the speed capability of the other side of the link, and only utilize Gen 2 timing if both sides are capable of that performance level. The performance of various PCIe lane widths is shown in Table 1.
These performance levels are identical whether the bus is routed over a backplane or over the PCIe cable. At this time, PCs with PCIe slots supporting Gen 2 timing are available, but not mainstream. I/O boards supporting Gen 2 timing are generally limited to high-end x16 graphics boards. This will inevitably change over the next few years, as more and more PCs and I/O boards support the faster Gen 2 timing.
To stay within the various regulatory agencies' restrictions on radiated emissions, the PCIe specification employs a clever technique called spread-spectrum clocking. Because PCIe carries a fundamental clock rate of 2.5 GHz or 5.0 GHz, a spike in emissions occurs at those frequencies. Depending on the chassis and cable shielding, this spike may exceed FCC and/or CE mark limits. These regulations vary based on the intended usage of the system, with home usage the strictest.
The agency restrictions are currently defined such that a radiating device cannot emit energy spikes above a certain energy level at any one frequency. Spread-spectrum clocking takes advantage of the "one frequency" portion of this definition, and spreads the radiated emissions out over a narrow range of frequencies. By dynamically modulating the clock rate from 0% to -0.5%, the spike of energy becomes a duller "plateau" of radiated emissions, which currently isn't regulated. The reference clock signal is driven by the host side of each PCIe link to indicate to the target device how much the clock signals have been modulated.
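The effect of down-spreading is easy to quantify. Modulating the clock from 0% to -0.5% of nominal turns a single emission spike into a plateau about half a percent wide, as this small sketch illustrates:

```python
# Sketch of the down-spread clock range described above: the clock
# is modulated between 0% and -0.5% of nominal, so the emitted
# energy is spread across a band just below the nominal frequency.

def ssc_range_ghz(nominal_ghz: float, spread: float = 0.005):
    """Return (min, max) clock frequency under down-spreading."""
    return nominal_ghz * (1.0 - spread), nominal_ghz

lo, hi = ssc_range_ghz(2.5)   # Gen 1 fundamental
assert lo == 2.5 * 0.995      # 2.4875 GHz at maximum down-spread
assert hi == 2.5              # nominal frequency at 0% modulation
```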
Advantages versus Other Cable Standards
Whether being used for high-speed I/O, bus expansion or as a high-speed network, there are several advantages associated with PCIe over cable: low cost, high bandwidth and software transparency.
These advantages derive from the fact that the PC's backplane bus is already PCIe. Thus, the cable adapter boards don't need to convert the protocol or change the speed of the signals. The adapters simply route the signals from the motherboard out to the PCIe cable connector and provide some signal conditioning to guarantee that signal integrity is met at the other end of the cable (Figure 3). Because these adapters are simple, they are inexpensive. Because they don't convert the PCIe protocol into anything else, they are high performance and don't require any software drivers for the I/O expansion models.
As an I/O expansion protocol, PCIe can be used in place of other protocols including StarFabric, External SAS, Fibre Channel and USB. Compared to each of these, PCIe is the highest performance since there are no protocol conversions or timing delays. No cable connection can transfer data faster than the PCIe slot is capable of, and PCIe over cable is by definition at 100% performance. Any adapter that has to convert PCIe into a different protocol, send the data over the cable, and then convert it back to PCIe at the other end of the cable so that it can communicate with I/O devices, will necessarily be slower, in both throughput and latency, than a pure PCIe over cable solution.
For networking, the obvious comparison is PCIe versus 1 Gbit Ethernet and 10 Gbit Ethernet. 1 Gbit Ethernet is clearly the most widely accepted and lowest-cost solution. 10 Gbit Ethernet has so far seen relatively little acceptance, and continues to command high prices, particularly for switches.
PCIe over cable is much higher performance and much higher priced than 1 Gbit Ethernet. PCIe compares favorably, however, to 10 Gbit Ethernet. PCIe spans a much broader performance range than 10 Gbit Ethernet, from 2.5 Gbit/s for a x1 Gen 1 link up to 80 Gbit/s for a x16 Gen 2 link. The maximum cable distance for PCIe is only 23 feet, however, restricting its suitability for longer-distance applications. In general, a PCIe network will perform twice as fast as 10 Gbit Ethernet and cost half as much.
The PCI-SIG cable specification doesnâ€™t specify any fiber-optic solutions for PCIe. However, several vendors have introduced products that utilize PCIe over fiber. The primary advantage of these products is that the fiber cable can span considerably longer distancesâ€”typically 500 meters.
Fiber-optic solutions face the challenge of what to do with the sideband signals, and in particular with the spread-spectrum reference clock. The solutions available so far have taken a pragmatic, albeit somewhat limited, approach: they simply omit the sideband signals, giving up their usefulness. Without the reference clock, systems connected via fiber cables can only be used with PCs in which spread-spectrum clocking can be disabled. Many server-class PCs allow spread-spectrum clocking to be disabled as a BIOS setting, but many desktop PCs do not.
PCIe as a Network
PCIe was originally defined to support CPU-to-I/O communications, with the basic PC serving as the controller host. Multicomputing can be accomplished using PCIe through a combination of non-transparent bridging and CPU-to-CPU communications software. This technology expands the applicability of PCIe to a wide variety of high-end applications, including radar and sonar analysis, medical imaging, test and measurement and communications equipment. A small networking architecture is shown in Figure 4.
Special hardware and software drivers are required in order to utilize PCIe over cable as a network connection between multiple PCs. In the networking model, CPU components that normally initialize and control their local PCIe buses are interconnected. If each CPU tried to initialize and control the PCIe bus, the system wouldnâ€™t work. The solution is to use non-transparent bridge technology on the cable adapter boards to isolate each side of the bus. With non-transparent bridges, each CPU can control its side of the bus, but not interfere with the portions of the bus controlled by other CPUs. Another feature required for networking over PCIe is the ability to isolate the spread-spectrum clocking coming from both sides of the non-transparent bridging. This can also be accomplished within the cable adapter board.
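The core of non-transparent bridging is an address-translation window. The following is a hypothetical illustration, not a real driver API: the class name, field names and addresses are invented for this sketch, but the mapping logic mirrors how an NTB forwards accesses that land in its window to a programmed base address on the far side.

```python
# Hypothetical sketch (invented names, not a real driver API) of the
# address translation at the heart of a non-transparent bridge. Each
# CPU enumerates only its own side of the bus; accesses that fall
# within the NTB window are forwarded to the far side at a
# translated address, while all other accesses stay local.

class NonTransparentWindow:
    def __init__(self, local_base: int, size: int, remote_base: int):
        self.local_base = local_base     # window start on this side
        self.size = size                 # window size in bytes
        self.remote_base = remote_base   # programmed translation target

    def translate(self, local_addr: int):
        """Map a local bus address into the far side's address space,
        or return None if the access is outside the NTB window."""
        offset = local_addr - self.local_base
        if 0 <= offset < self.size:
            return self.remote_base + offset
        return None  # ordinary local access; the bridge ignores it

# A 1 MiB window at 0x8000_0000 locally, mapped to 0x2000_0000 remotely.
win = NonTransparentWindow(0x8000_0000, 0x10_0000, 0x2000_0000)
assert win.translate(0x8000_1000) == 0x2000_1000  # inside the window
assert win.translate(0x7000_0000) is None         # outside: stays local
```

Each CPU programs its own window, so neither host ever sees or enumerates the other's bus hierarchy directly, which is exactly the isolation the article describes.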
The PCI-SIG recently announced the basic timing and performance for Gen 3 PCIe. The individual lane performance for Gen 3 will double again, increasing transfer rates to 10 Gbits/s per lane. Gen 3 products are expected to become available by 2011. Backward compatibility will be achieved by each bus interface component starting the bus training cycle using the Gen 1 timing. If both sides of the interface are compatible with Gen 2 or Gen 3 timing, they will both shift to the higher performance.
The ability to run PCIe over cable at full performance with complete software transparency opens up a range of new application possibilities for CPU-to-I/O system re-partitioning. Low-cost host bus adapters extend the PCIe bus structure to expansion chassis or dedicated PCIe I/O hardware. PCIe over cable provides a simple and low-cost method for extending applications that need more I/O boards than will fit in a single chassis to a multi-chassis solution. PCIe over cable can also be used as a high-performance peripheral connection, a super-fast USB of sorts. Designing compatible endpoints is straightforward because the PCIe interface is available as a gate array library.
When CPU-to-CPU communications are added to PCIe, the cable interface can be used as a high-performance cabled network. A x8 cabled network with Gen 2 timing will transfer data at 40 Gbits/s, or 40 times faster than today's 1 Gbit/s Ethernet interfaces.
One Stop Systems