Achieving independent spread spectrum clocking with clockless PCIe

PCI Express (PCIe) has established itself as the IO interconnect of choice for communication within the server and PC environment.  Today, an emerging trend among designers is extending PCIe beyond the PC/server while maintaining the advantages of simplicity, bandwidth, scalability, low power and cost.  One of the major system-level challenges in extending PCIe outside the box has been clock distribution between separated domains.  

While many PCIe devices can operate asynchronously, these applications use constant-frequency clocking.  The challenge gets more complicated when spread spectrum clocking (SSC) is needed. In systems required to operate with SSC, the only available option has been clock isolation. 

This method adds complexity in component count, clock fidelity, and media selection. In addition, the cable itself must operate in a constant frequency clock (CFC) domain. This CFC domain can still represent a significant source of electromagnetic interference (EMI).

Let’s look at the importance of introducing independent SSC operation to the ecosystem of PCIe and some of the cross technology advantages in performance, simplicity and cost that it can deliver.

Spread Spectrum Clocking

Spread spectrum is the process by which the system clock is dithered in a controlled manner so as to reduce peak energy content. SSC techniques are used so as to minimize EMI and/or pass Federal Communications Commission (FCC) requirements. While the overall energy is unchanged, the peak (tonal) power is reduced. The amount of peak energy dispersion is dependent on the modulation bandwidth, spreading depth and spreading profile.

In the case of PCIe, the typical modulation profile is a 30 kHz to 33 kHz, 0.5% down-spread clock. The modulation profile can take several shapes, but it is typically triangular. When extending high-speed data outside of an enclosure, copper cabling can significantly increase the amount of peak radiated energy. System designers must either modulate the data exiting the box or resort to more costly cables with a high shielding index. In the case of PCIe, until now the option of modulating the data traveling outside the box was not available.
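As a rough illustration of the profile described above, the following Python sketch models a triangular, 0.5% down-spread clock. The 100 MHz nominal frequency and the 31.5 kHz modulation rate are assumptions chosen for the example (the rate simply sits inside the 30 kHz to 33 kHz range quoted above), and the function names are hypothetical.

```python
def triangular_downspread_hz(t, f_nominal=100e6, depth=0.005, f_mod=31.5e3):
    """Instantaneous clock frequency in Hz at time t (seconds).

    Assumed values: 100 MHz nominal clock, 0.5% down-spread,
    31.5 kHz triangular modulation.
    """
    position = (t * f_mod) % 1.0                # 0..1 within one modulation period
    triangle = 1.0 - abs(2.0 * position - 1.0)  # 0 at period start, 1 at mid-period
    return f_nominal * (1.0 - depth * triangle)


def profile_stats(samples=1000, f_nominal=100e6, depth=0.005, f_mod=31.5e3):
    """Return (min, max, mean) frequency over one modulation period."""
    period = 1.0 / f_mod
    freqs = [
        triangular_downspread_hz(i * period / samples, f_nominal, depth, f_mod)
        for i in range(samples)
    ]
    return min(freqs), max(freqs), sum(freqs) / len(freqs)


if __name__ == "__main__":
    lo, hi, mean = profile_stats()
    print(f"min {lo / 1e6:.3f} MHz, max {hi / 1e6:.3f} MHz, mean {mean / 1e6:.3f} MHz")
```

Under these assumptions the clock never exceeds its nominal frequency, dips to 0.5% below it once per modulation period, and averages 0.25% below nominal, which is what "down-spread" means in practice.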

How PCIe Deals with SSC

PCIe is fundamentally a short-reach, point-to-point protocol that is typically synchronous. Under these conditions, spreading the system reference clock has only a minor impact on the overall link; each device undergoes nearly the same frequency deviation in approximate lockstep.
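The lockstep argument can be made concrete with a small sketch. It compares the instantaneous frequency offset of two link partners, first when both follow the same modulated reference and then when each runs its own triangular down-spread clock with an unrelated modulation phase. The 31.5 kHz rate, the idealized triangular shape and the function names are assumptions for illustration, and the static tolerance of each oscillator is ignored.

```python
def ssc_offset_ppm(t, depth_ppm=5000.0, f_mod=31.5e3, phase=0.0):
    """Frequency offset in ppm (always <= 0) of a triangular down-spread clock.

    phase is the starting position within the modulation period (0..1).
    """
    position = (t * f_mod + phase) % 1.0
    return -depth_ppm * (1.0 - abs(2.0 * position - 1.0))


def worst_case_mismatch_ppm(phase_a, phase_b, f_mod=31.5e3, steps=2000):
    """Largest instantaneous frequency difference between two clocks over one period."""
    period = 1.0 / f_mod
    return max(
        abs(
            ssc_offset_ppm(i * period / steps, f_mod=f_mod, phase=phase_a)
            - ssc_offset_ppm(i * period / steps, f_mod=f_mod, phase=phase_b)
        )
        for i in range(steps)
    )


if __name__ == "__main__":
    # Shared reference: both ends see the same modulation.
    print("common clock:", worst_case_mismatch_ppm(0.0, 0.0), "ppm")
    # Independent references that happen to be half a period apart.
    print("independent :", worst_case_mismatch_ppm(0.0, 0.5), "ppm")
```

With a shared reference the mismatch is zero, while two unsynchronized 0.5% down-spread clocks can momentarily differ by the full spreading depth. In a real system the two modulation rates also differ slightly, so the relative phase drifts through every value over time and the link has to tolerate the worst case.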

To extend a modulated clock architecture beyond the confines of the box, cable provisions must be made for sending a clock signal as well as data. In addition to the added cabling cost, this increases complexity, not just in buffers for maintaining clock fidelity but also in clock timing correlation between the transmitting and receiving devices. (The PCIe specification provides recommended relative trace delay timing to keep correlation between transmitter and receiver.)

Additionally, if systems with separate master clock domains need to communicate (two independent servers, each with its own CPU clock, for example), passing a clock between two master devices will not work.

As mentioned above, while PCIe devices from vendors such as PLX have been used in asynchronous applications for several years, separated domains require the use of constant frequency clocking or the use of a PCIe switch feature called SSC isolation. With this feature, the system runs with SSC, but the cable does not.

A version of this article was published previously on Embedded.com’s sister online publication Communications DesignLine.

See more articles and column like this one on Embedded.com.Sign up for the Embedded.com newsletters . Copyright © 2013 UBM–All rights reserved.

Spread spectrum clocking: independent vs isolated
While users might be familiar with the SSC isolation capability, independent SSC is significantly different and a potential boon for systems designers. In SSC isolation, the clocking structure is doubled on each side of the link. One domain is preserved for the system.

A second domain is required to transition between system boundaries. This transition boundary typically consists of an additional CFC clock chip on each side of the external channel. The constant frequency allows each side of the link to operate asynchronously. While this functionality has proven highly valuable, several disadvantages arise:

  • In the case of copper connections, radiated EMI can be significant, precluding the use of lower-cost cable options to meet FCC requirements.
  • Multiple clock domains must be managed – one for SSC, another for CFC, thus increasing design complexity and component count.
  • In uncontrolled system environments, out-of-specification SSC modulation is often found; typically a system turns out to be center-spread rather than down-spread, which creates a significant compatibility problem and complicates system management when defining the proper SSC/CFC clock values.
  • In systems that can operate as SSC only, increasing cable provisions for an additional clock signal means added cost in cable, buffering and clock conditioning.

Having PCIe with independent SSC operation alleviates these concerns. As suppliers of PCIe switches, endpoints and hosts move forward, it is critical that clock simplification takes place. With PCIe features such as SSC Free PCIexpress, no clock management, additional clock chips or buffers, or protocol translations are needed; what remains is simple scalability and connectivity, lower connection costs and higher density.

A conceptual PCIe reference implementation highlights one of many potential flexible consumer designs, all geared towards extending storage, running remote graphics or providing an expansion platform for the vast number of PCIe devices on the market. It reflects how independent SSC can build upon today’s connectivity solutions by repurposing two alternate connectivity solutions, targeted at distance and cost, for PCIe purposes.

With simplicity and high density as objectives, the implementation achieves connectivity via an optical port (in this case, dual x2 Avago McLink modules with optical USB connectors providing 32 Gbps) and a copper port (a single Molex x4 Mini-SAS HD SFF-8644 connector and cable providing 32 Gbps).
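The bandwidth figures quoted for the two ports follow directly from the PCIe Gen3 lane rate. The short sketch below reproduces the arithmetic, using the Gen3 signaling rate of 8 GT/s per lane and its 128b/130b line encoding; the function names are hypothetical.

```python
GEN3_GT_PER_LANE = 8.0       # PCIe Gen3 signaling rate per lane, in GT/s
GEN3_ENCODING = 128 / 130    # 128b/130b line-code efficiency


def raw_gbps(lanes, links=1):
    """Raw line rate per direction for `links` ports of `lanes` lanes each."""
    return lanes * links * GEN3_GT_PER_LANE


def payload_gbps(lanes, links=1):
    """Line rate after removing 128b/130b encoding overhead."""
    return raw_gbps(lanes, links) * GEN3_ENCODING


if __name__ == "__main__":
    print("dual x2 optical:", raw_gbps(2, links=2), "Gbps raw")
    print("single x4 copper:", raw_gbps(4), "Gbps raw")
    print("x4 after encoding: %.2f Gbps" % payload_gbps(4))
```

Both the dual x2 optical port and the single x4 copper port therefore carry the same 32 Gbps raw rate per direction, and a single x2 module carries 16 Gbps. Packet and protocol overhead, which this sketch does not model, reduces the usable throughput further.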

Neither of these solutions employs additional structures for reference clock transmission. However, each represents the type of port density, cost and/or reach expansion that PCIe has begun to target. As demonstrated by this basic reuse, clock simplification is critical to reaching the objectives of an expanded PCIe ecosystem and improved economies of scale.

Independent SSC Demonstration

Figure 5 shows the two five-slot expansion boards. One card is enclosed in a standard ATX CPU case, while the other is left open for clearer observation. There is no CPU in either expander card, only a Gen3 switch with upstream and downstream configurable slots. (Data is provided to each expander via the server platform reference card, shown in Figure 4.)

To the left of the switch are two daughter-card host assemblies. Either of these host assemblies can be configured to operate as an upstream port, with the remaining port usable for cascading expansion cards.

An interchangeable range of daughter cards (Figure 6) makes for a flexible demonstration vehicle for interconnecting between alternate connection options. The SSC-modulated clock is generated by an external Texas Instruments CDCE925 evaluation board and injected via SMA into the expander card configured for Molex Mini-SAS connectivity.

The onboard CFC clock is disabled. The second (optical) expander uses the on-board CFC clocking as its system reference. As a result, this demonstration shows three separate clocking domains (CPU SSC, copper expander SSC, and optical expander CFC) operating independently through the upstream switch, all without separate clock-management configurations. SSC is preserved on the copper cable.

The significant points of this demonstration are:

  • Low-cost Avago x2 McLink modules provide a simple 16 Gbps PCIe connection through a USB connector.
  • Molex Mini-SAS HD (SFF-8644, 6 Gbps) cabling is capable of supporting PCIe Gen3 (x4 lane width) at two meters and beyond.
  • Independent clocking opens the door for wider, lower-cost connectivity options, which can benefit from cross-protocol economies of scale.
  • Only the link between the expansion box and the server is required to carry traffic at Gen3 and will transmit/receive on separate SSC domains; while independent SSC is not an industry standard, this highlights the direction and standards to which PCIe devices must migrate.
  • The clock-less link reaches Gen3 operation via the standard PCIe linkup progression from Gen1 to Gen3.
  • All other PCIe cards inside the expansion bay or in the server can operate with the native spread clocking and at any PCIe (Gen1, Gen2, or Gen3) negotiated rate.

The Result

In the demonstration, both 0.5% down-spread and center-spread clock modulation were used at the expansion board and shown to make no difference in link integrity. While the PCIe specification calls for modulation frequencies not to exceed 33 kHz at 0.5% down-spread, the TI synthesizer had a fixed modulation of approximately 30 kHz.

Interestingly, the CPU box was tested and found to have 0.5% center-spread modulation. While the PCIe specification allows only down-spread clocking, this clocking was used rather than looking for another PC (an indication of what can happen with real-world management of disparate systems).
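To put numbers on why a center-spread host matters, the sketch below compares the frequency ranges of the three kinds of clock used in the demonstration and the largest offset each pairing can produce. It rests on a loudly stated assumption: "0.5% center-spread" is read here as ±0.25% around nominal (0.5% in total). If the host actually deviated ±0.5%, the figures would be correspondingly larger. The function names are hypothetical, and each oscillator's static tolerance is ignored.

```python
def spread_range_ppm(kind, depth_ppm=5000.0):
    """(lowest, highest) frequency offset in ppm for a clocking scheme.

    ASSUMPTION: "center" is interpreted as +/- depth/2 around nominal.
    """
    if kind == "cfc":        # constant frequency clock, no modulation
        return (0.0, 0.0)
    if kind == "down":       # nominal down to nominal minus depth
        return (-depth_ppm, 0.0)
    if kind == "center":     # symmetric around nominal
        return (-depth_ppm / 2.0, depth_ppm / 2.0)
    raise ValueError(f"unknown clocking scheme: {kind}")


def max_offset_ppm(kind_a, kind_b, depth_ppm=5000.0):
    """Largest possible instantaneous offset between two independent clocks."""
    lo_a, hi_a = spread_range_ppm(kind_a, depth_ppm)
    lo_b, hi_b = spread_range_ppm(kind_b, depth_ppm)
    return max(hi_a - lo_b, hi_b - lo_a)


if __name__ == "__main__":
    for a, b in [("down", "down"), ("center", "down"), ("center", "cfc")]:
        print(f"{a:>6} to {b:<6}: up to {max_offset_ppm(a, b):.0f} ppm")
```

Under this reading, a center-spread host paired with a down-spread expander can present a larger instantaneous offset than two down-spread clocks would, because the host spends half of each modulation cycle above nominal while the expander never does.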

The first two plots (Figure 7 and Figure 8) show the SSC on each side of the link. The motherboard PCIe clock modulation was observed via an SMA breakout card from a PCIe slot (not shown). A spare output buffer on the expansion board shows the synthesizer SSC used on the copper-configured expansion card.

In the absence of more costly time interval analyzer tools, the scope plot (Figure 9) shows a simple means to verify a modulated reference. Here, the scope is set to delayed trigger mode, with approximately 450 ns of delay. With delay enabled, the time between scope triggering and scope sampling results in random placement of the waveform.

In the absence of SSC, the delayed trigger output and the non-delayed output will have similar profiles because the deviation of the waveform is small. In this example, the spread modulation has resulted in a significant increase in the displayed clock width, which is indicative of SSC being enabled.
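The size of the smear seen in delayed trigger mode can be estimated with simple arithmetic. Because the modulation period (about 30 microseconds) is much longer than the 450 ns delay, the clock frequency is essentially constant within one delay interval, so an edge observed after the delay shifts by roughly the delay multiplied by the fractional frequency deviation. The sketch below applies that approximation; the 100 MHz reference clock is an assumption for illustration, and the function names are hypothetical.

```python
def edge_smear_ns(trigger_delay_ns, spread_fraction):
    """Approximate spread of edge positions seen after a trigger delay.

    Small-deviation approximation: the relative change in period equals
    the relative change in frequency.
    """
    return trigger_delay_ns * spread_fraction


def smear_as_fraction_of_period(trigger_delay_ns, spread_fraction, f_clock_hz=100e6):
    """Edge smear relative to one clock period (assumed 100 MHz clock)."""
    period_ns = 1e9 / f_clock_hz
    return edge_smear_ns(trigger_delay_ns, spread_fraction) / period_ns


if __name__ == "__main__":
    smear = edge_smear_ns(450.0, 0.005)
    share = smear_as_fraction_of_period(450.0, 0.005)
    print(f"edge smear: about {smear:.2f} ns ({share:.0%} of the clock period)")
```

With 0.5% of spreading and 450 ns of delay, the edges wander over roughly 2 ns, a sizable share of an assumed 10 ns clock period and easy to see on a scope. Without SSC the same calculation yields essentially zero, leaving only the clock's intrinsic jitter.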

Figure 10 and Figure 11 are internal eye monitor measurements of the CFC optical receiver (USB) eye and the down-spread SSC copper cable (Mini-SAS HD) eye, taken at the PCIe server card. Because the server has a center-spread SSC domain, multiple link domains are observed: center-spread SSC to down-spread SSC and center-spread SSC to CFC.

With the three systems operating on separate spread clock domains, no change in the link error performance or significant eye quality reduction was observed.

Key points of this demonstration are:

  • The link between the expansion box and the server is the only link required to carry traffic at Gen3 and will transmit/receive on separate SSC domains.
  • The clock-less link reaches Gen3 operation via the standard PCIe linkup progression from Gen1 to Gen3.
  • If operating over a copper link, EMI suppression would be preserved, thus reducing the need for higher-performance shielding or concerns about clock distribution and integrity.
  • For the optical interface, Gen3 backchannel tuning isn’t needed; fixed equalization sets the electrical portion of the full optical path.
  • All other PCIe cards inside the expansion bay or in the server can operate with the native spread clocking and at any PCIe (Gen1, Gen2, or Gen3) negotiated speed.
  • While an x2 optical McLink was targeted in this application for consumer consideration (effectively doubling the PCIe bandwidth over Thunderbolt), SSC Free can be scaled to higher-density PCIe lane options and to optical and alternate copper connectivity solutions.

PCIe vendors are working to make independent SSC operation a standard functional feature of the next generation of devices, with the goal that the entire PCIe ecosystem will join, and benefit from, this trend. System clock management capabilities for the PCIe market are now ready for prime time.

Reginald Conley is vice president of applications engineering at PLX Technology. He can be reached at .
