CMP EMBEDDED.COM


Debugging: Making the move from parallel to high speed serial trace
Andre Yew describes the history of trace debug and the evolution of High-Speed Serial Trace (HSST), and discusses how it replaces conventional parallel trace, especially as CPU speeds and System-on-Chip integration complexity increase.



Embedded.com

What HSST brings to the game
Just as demand for increased bandwidth has driven other technologies to high-speed serial channels, trace is on the verge of replacing fast, wide parallel channels with fewer, significantly faster serial channels.

Hard drives have switched to Serial ATA, and consumer high-definition video basically requires HDMI. Both use transmission protocols similar to those in the various high-speed serial trace proposals.

Increasing bandwidth makes high-speed parallel protocols more expensive and difficult to implement. For example, interchannel skew is difficult to control across 20 fast channels, and requires expensive cabling to guarantee performance.

Most trace collection probes today use a micro-coaxial ribbon cable from Precision Interconnect, which we buy for well over $100 for modest quantities of very short lengths.

As speeds increase, crosstalk between channels of a parallel interface also increases. Again, we solve this with heroic $100-per-foot cable, and by adding even more conductors as ground lines between the signal lines.

Switching transient current draw for many high-speed lines is enormous, and causes ground bounce due to the finite impedance of conductors. These transients cause glitches that corrupt data. We inadvertently encountered this phenomenon during the development of the SuperTrace probe, a high-speed 1 GB trace collection probe.

We discovered that during certain operations, very infrequently, we would get corrupted data. After spending a few days trying to figure out what was going on, we finally realized that our highest-speed logic had been placed into a corner of the FPGA that had the fewest ground pins.

After re-routing the design into a ground-rich corner of the FPGA, we no longer had data corruption. As CPU speeds and parallel trace port speeds increase, problems like this will only become more common and more difficult to solve.

More important for ASIC designers is the large number of pins required by parallel trace. While 20 pins may give the best performance from an ARM trace module, designers can often afford fewer than half that number, which can significantly hamstring the performance of the trace port. With an abridged trace port, you may be lucky to get uninterrupted trace of the program counter, and data trace may be impossible.

Developers are forced into an impossible dilemma: do we give up enough pins so the chip will fit and meet its budget, or do we give the software developers (who are often the bottleneck of any electronic product) good enough trace facilities, so the product isn't held back from production for months by obscure bugs?

HSST solves the bandwidth problem by using fewer channels but running them far faster. Fewer channels mean fewer pins and lower power requirements. Because the data is wrapped into serial channels, each with its own embedded clock, interchannel skew is no longer a problem, and noise susceptibility and emissions, both important for complying with EMI standards, are greatly reduced.

If more than one high-speed serial channel is used, skew still isn't a problem because multiple serial channels can be bonded to guarantee certain skew specifications.

Serial channels also use some kind of encoding scheme to balance DC and to provide enough transitions for clock recovery. The so-called 8b10b encoding used in Gigabit Ethernet, for example, where 8 bits are encoded to 10 bits in order to equalize the time the wires spend at 1 and 0, is currently the front-runner for HSST. However, because only 8 of every 10 transmitted bits carry data, 8b10b encoding incurs a 20 percent bandwidth overhead, so a 4 Gbit/sec channel has 3.2 Gbit/sec of useful bandwidth.
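The overhead arithmetic, and the reason an encoding is needed in the first place, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the "raw" stream is simply four unencoded zero bytes chosen to show the worst case that a line code has to fix (no DC balance, no transitions for clock recovery). It does not implement 8b10b itself.

```python
def payload_rate_gbps(line_rate_gbps, data_bits=8, code_bits=10):
    """Useful data rate left after line coding (8b10b ratio by default)."""
    return line_rate_gbps * data_bits / code_bits


def disparity(bits):
    """Number of 1s minus number of 0s; 0 means the stream is DC balanced."""
    ones = bits.count("1")
    return ones - (len(bits) - ones)


def longest_run(bits):
    """Length of the longest stretch of identical bits (no transitions)."""
    longest = current = 0
    previous = None
    for bit in bits:
        current = current + 1 if bit == previous else 1
        longest = max(longest, current)
        previous = bit
    return longest


# Four unencoded 0x00 bytes: what the wire would carry with no line code.
raw = "00000000" * 4

print(payload_rate_gbps(4.0))  # 3.2
print(disparity(raw))          # -32, badly DC unbalanced
print(longest_run(raw))        # 32 bit times with nothing for clock recovery to lock onto
```

A line code such as 8b10b trades the 20 percent of bandwidth shown by `payload_rate_gbps` for keeping the disparity near zero and the run lengths short.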

Serial channels under consideration include Xilinx's RocketIO, which can go as fast as 6.25 Gbit/sec. Current discussions with various customers, vendors and standards committees include proposals for using four of these channels for an aggregate bandwidth of 25 Gbit/sec, which we believe will cover almost all trace needs for at least a few years. For comparison, the highest-bandwidth parallel trace ports currently in use are less than 8 Gbit/sec.
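As a rough check on these figures, the following Python sketch works through the aggregate numbers. It rests on two assumptions not stated in the text: that the 25 Gbit/sec figure is the raw line rate, and that 8b10b encoding would be applied to every channel. The function names and the `PARALLEL_LIMIT_GBPS` constant are illustrative only.

```python
def aggregate_gbps(lanes, lane_rate_gbps):
    """Raw line rate of several serial channels used together."""
    return lanes * lane_rate_gbps


def payload_gbps(raw_gbps, data_bits=8, code_bits=10):
    """Useful bandwidth after line coding (assumed here to be 8b10b)."""
    return raw_gbps * data_bits / code_bits


# Upper bound quoted for today's fastest parallel trace ports.
PARALLEL_LIMIT_GBPS = 8.0

raw = aggregate_gbps(4, 6.25)   # four channels at 6.25 Gbit/sec each
useful = payload_gbps(raw)      # what is left if 8b10b is applied

print(raw)                           # 25.0
print(useful)                        # 20.0
print(useful / PARALLEL_LIMIT_GBPS)  # 2.5
```

Under those assumptions, even after the encoding overhead the four-channel proposal would carry at least 2.5 times the useful bandwidth of the fastest parallel ports, on far fewer pins.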
