With each turn of Moore's Law, designers at every phase of the development process face new levels of complexity. Chip designers must get the integrated circuit (IC) logic, performance, power and yield right on first silicon. System developers must then extend that first-time success to the board-level reference design, boot code, operating system (OS) port and application software, not to mention flash programming, manufacturing test methods, and field support. Meanwhile, meeting tight deadlines and cost objectives has never been more challenging.
As hardware and software development technologies have evolved, so has the nature of system test and debug. Moore's Law has affected not only processor and memory design; analogous changes have taken place in the integration of system functionality, the adoption of operating systems (or real-time operating systems, RTOSes) in embedded systems, and the amount of application software bundled into the end product.
Let's roll the clock back. In the days of large scale integration (LSI) components (at least 1000 gates) the central processing unit (CPU) was only the CPU, bound by the limits of the semiconductor process technology, power consumption and yield available at that time.
Debugging was largely confined to integrating hardware functions, adjusting inter-chip timing and verifying logic functionality. System software was on the order of hundreds of kilobytes, limited by how much code would fit within the budgeted memory space.
How times have changed. Today, both memory space and high CPU clock frequencies are “almost free.” The design, debug and test challenge has evolved from hardware component integration and debug to high-level simulation of the system hardware design prior to SOC tape-out, then to system software modeling, and finally to integration of hundreds of megabytes (MB) of system software.
At system level, most of today's embedded devices require full-featured embedded operating systems running multiple application programs managed with memory management units (MMUs), and increasingly also provide some sort of networking capabilities such as local area networks (LANs) or wireless network connectivity for control or network access to databases or voice, data and video services.
At the heart of contemporary embedded solutions are systems on chip (SOCs) with millions of gates, integrating cache, scratchpad random access memory (RAM), and peripheral functions onto one chip; many component interfaces are buried within the chip and are no longer available at the pin or board level for system test/debug. The CPU core now runs at hundreds of megahertz (MHz) and integrates many of the hardware interfaces, so historical test methodologies are no longer viable.
JTAG to the Rescue – Boundary Scan Testing
The Joint Test Action Group (JTAG) began solving board-level test problems in the 1990s by standardizing a serial scan chain method (JTAG; IEEE 1149.1) for accessing on-chip resources, with additional shift registers built into the I/O paths of every IC for boundary scan testing.
Before the emergence of boundary scan testing, debugging potential solder bump issues underneath a chip package was difficult. Prior to board assembly, every IC is tested to ensure its flawless operation. Thus, if the assembled printed circuit board (PCB) does not work properly, the malfunction must be caused by a solder bridge, a gap or a flaw in the PCB itself. But what if the flaw is underneath the chip package, where it can't be seen or repaired easily?
The boundary scan testing methodology addresses this issue. As illustrated in Figure 1, a serial scan path through I/O registers was added and exercised by a sophisticated test program unique to each board to help identify a faulty chip or other device, so that these can be reworked or replaced. In the diagram in Figure 1, each grey box represents a category of device function, e.g., flash, peripherals, I/O ports, etc.
|Figure 1. JTAG connection used for boundary scan testing|
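As a rough illustration of the boundary scan principle, the following C sketch models the board nets between two chips as positions in a serial chain: a known pattern is driven from one chip's output cells, captured at the other chip's input cells, and compared. A stuck-at fault on a net shows up as a mismatch at that position. The chain length, fault model and names here are hypothetical, not taken from any real test program.

```c
#include <assert.h>

#define CHAIN_LEN 8  /* boundary cells in this hypothetical scan chain */

/* Model of board nets between two chips: output cells of chip A drive
 * input cells of chip B. A stuck-at-0 fault on one net corrupts the
 * captured value at that position. */
static void capture_nets(const int drive[CHAIN_LEN], int capture[CHAIN_LEN],
                         int stuck_at0_net /* -1 = no fault */)
{
    for (int i = 0; i < CHAIN_LEN; i++)
        capture[i] = (i == stuck_at0_net) ? 0 : drive[i];
}

/* Compare the expected pattern with the captured one; return the index
 * of the first faulty net, or -1 if the board passes. */
static int diagnose(const int expect[CHAIN_LEN], const int capture[CHAIN_LEN])
{
    for (int i = 0; i < CHAIN_LEN; i++)
        if (expect[i] != capture[i])
            return i;
    return -1;
}
```

A real test program runs many such patterns, derived from the board netlist, to distinguish bridges, opens and faulty devices.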
The JTAG approach provides a method to test very complex systems, while keeping the pin count low. Specifically, the IEEE 1149.1 specification defines just five pins for the JTAG connection (four mandatory, plus the optional TRST), no matter how long the scan chain register path is. The standard pin functions for the JTAG Test Access Port include:
TRST Test Reset (optional; output from JTAG probe to chip to reset the JTAG test logic)
TCK Test Clock (output from JTAG probe to chip to set the JTAG scan rate)
TDI Test Data Input (serial test data input to the chip)
TDO Test Data Output (serial test data output from the chip)
TMS Test Mode Select (sampled on the rising edge of TCK to sequence the TAP state machine)
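The behavior of these pins is governed by the 16-state TAP controller defined in IEEE 1149.1: TMS, sampled on each rising edge of TCK, selects the next state. A minimal C model of that state machine as a lookup table (state names abbreviated):

```c
#include <assert.h>

/* The 16 TAP controller states of IEEE 1149.1 (names abbreviated). */
enum tap_state {
    TLR, RTI, SEL_DR, CAP_DR, SH_DR, EX1_DR, PAUSE_DR, EX2_DR, UPD_DR,
    SEL_IR, CAP_IR, SH_IR, EX1_IR, PAUSE_IR, EX2_IR, UPD_IR
};

/* next_state[current][TMS]: the state entered on the rising edge of TCK. */
static const enum tap_state next_state[16][2] = {
    [TLR]      = { RTI,      TLR    },
    [RTI]      = { RTI,      SEL_DR },
    [SEL_DR]   = { CAP_DR,   SEL_IR },
    [CAP_DR]   = { SH_DR,    EX1_DR },
    [SH_DR]    = { SH_DR,    EX1_DR },
    [EX1_DR]   = { PAUSE_DR, UPD_DR },
    [PAUSE_DR] = { PAUSE_DR, EX2_DR },
    [EX2_DR]   = { SH_DR,    UPD_DR },
    [UPD_DR]   = { RTI,      SEL_DR },
    [SEL_IR]   = { CAP_IR,   TLR    },
    [CAP_IR]   = { SH_IR,    EX1_IR },
    [SH_IR]    = { SH_IR,    EX1_IR },
    [EX1_IR]   = { PAUSE_IR, UPD_IR },
    [PAUSE_IR] = { PAUSE_IR, EX2_IR },
    [EX2_IR]   = { SH_IR,    UPD_IR },
    [UPD_IR]   = { RTI,      SEL_IR },
};

static enum tap_state tck_rise(enum tap_state s, int tms)
{
    return next_state[s][tms ? 1 : 0];
}
```

One useful property of this state machine: holding TMS high for five clocks reaches Test-Logic-Reset from any state, which is how a probe synchronizes with a TAP in an unknown state.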
Several companies focus almost exclusively on boundary scan testing, specializing in both the JTAG hardware connection devices and host-based test software tools to adapt the test program to each board design.
The 2nd Role of JTAG – CPU Core Access for Software/Hardware Debug
Given that the CPU core is now hidden from observation or control by integrated caches, local on-chip buses, an MMU that dynamically allocates memory, and other SOC peripherals and I/O blocks, the JTAG path provides a direct connection into the debug logic inside the CPU. Thus, we now have a means of observing and controlling program execution. Since caches and peripherals have moved on chip, so must the debug logic (Figure 2 below).
|Figure 2. JTAG connection used for software debug/development|
With this direct core access, host-based debugger software can now assert a “debug exception”, redirecting the processor to get the next instruction from the debug logic registers instead of the program counter, thus effectively taking control of the processor to perform software debug operations:
* Run-control: Start, Stop, Single-Step, Step Into/Over (source or instruction)
* Set hardware and software breakpoints
* Specify conditions to be met or scripts to be executed at breakpoints
* Control reset and initialization of the target system
* Download code to be debugged or code to be programmed into flash
* Execute flash programming and other semi-hosting utilities
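As one example of how a host debugger exercises these capabilities, a software breakpoint is typically set by saving the original instruction and patching a break opcode in its place through the JTAG memory-access path, then restoring it on removal. A minimal C sketch, with a hypothetical break opcode and a flat array standing in for target memory:

```c
#include <assert.h>
#include <stdint.h>

#define BREAK_OP 0x0000000Du  /* hypothetical architecture's break opcode */

struct sw_bkpt {
    uint32_t addr;      /* word index of the patched instruction */
    uint32_t saved_op;  /* original opcode, restored on removal */
};

/* Insert a software breakpoint: save the original instruction and
 * overwrite it with the break opcode (a real debugger performs these
 * memory writes over the JTAG connection). */
static void bkpt_set(uint32_t *mem, uint32_t addr, struct sw_bkpt *bp)
{
    bp->addr = addr;
    bp->saved_op = mem[addr];
    mem[addr] = BREAK_OP;
}

/* Remove the breakpoint by restoring the saved instruction. */
static void bkpt_clear(uint32_t *mem, const struct sw_bkpt *bp)
{
    mem[bp->addr] = bp->saved_op;
}
```

Hardware breakpoints work differently: the debug logic compares the program counter against a register set over JTAG, so no code patching is needed, which is why they work in flash or ROM.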
Note that in both of the above applications, boundary scan and software debug, the role of JTAG is only to provide the physical layer communications interface, analogous to the PHY layer in the Open Systems Interconnection (OSI) model.
The protocol for what debug functions are supported is embodied in the debug logic, designed into the CPU core and the debugger software capabilities running on the host computer.
JTAG Debug Advantages
The primary advantages of using a debugger with JTAG access are:
* The JTAG connection provides direct access to the otherwise hidden CPU core
* The JTAG interface consumes no system I/O ports (serial, Ethernet)
* The JTAG debug method uses little or no system memory (unlike monitor-based debuggers)
* There is no monitor to crash along with a system crash (and a monitor is useless at board bring-up anyway)
* The JTAG connection does not require target system power (except some USB-only probes)
* A JTAG debugger can “steal cycles” to read registers/memory without stopping CPU (assuming that the debug logic built into the CPU provides this capability)
* A JTAG debug session can reset and/or initialize the system (Note: System reset is not part of JTAG. Rather, it is an adjunct to using JTAG for remote debugging, enabling a remote reset of a JTAG probe and target over a network.)
* A JTAG debugger can connect to the debug logic without perturbing the system
* A JTAG debugger provides the only reasonable means to connect to targets that do not yet have working boot code or I/O drivers
JTAG Debug Limitations
The JTAG debug connection does not solve all the world's debug problems; it has some serious limitations:
1) Code download over JTAG is not the fastest way to download large programs (>20MB); where the target offers 10/100BaseT Ethernet access, downloading over the network is usually much faster.
2) In multicore systems, multiple CPU cores can be daisy-chained on the same scan chain and individually accessed, but implementing a synchronized debug operation requires additional on-chip hardware to circumvent the skid associated with JTAG operations. Otherwise, hundreds of CPU cycles may go by after an asynchronous JTAG stop command is issued before each core actually halts. Examples of such capabilities are now beginning to appear, e.g., the global inter-processor control logic in Cavium Networks' Octeon family, with up to 16 64-bit cnMIPS cores.
3) JTAG does not make “printf” obsolete: it still provides an easy complement for extracting a variety of debug status reports.
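A back-of-the-envelope estimate illustrates the first limitation: divide the payload bits by the effective link throughput. The TCK rate, protocol overhead and Ethernet utilization figures below are illustrative assumptions, not measurements.

```c
#include <assert.h>

/* Rough download-time estimate: payload bits divided by effective
 * throughput. All rate and overhead figures passed in by the caller
 * are assumptions for illustration only. */
static double seconds_to_download(double megabytes, double eff_bits_per_sec)
{
    return (megabytes * 8.0 * 1024.0 * 1024.0) / eff_bits_per_sec;
}
```

Assuming a 10MHz TCK with 50% protocol overhead, a 20MB image takes roughly half a minute over JTAG, versus a few seconds over 100BaseT Ethernet at 60% utilization, an order of magnitude difference.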
Other Debug Functions
Not all debug functions use the JTAG scan chain per se. For example, some processors include the capability to collect trace information, an extremely valuable debug tool, since it can follow execution through branches and interrupts, often saving hours or days of searching for particularly elusive bugs.
In the first generation of trace implementations, a second block of debug logic off-loaded trace data at the pipeline clock rate (off-chip and off-board) for collection in the same debug probe used to control JTAG operations.
Neither the original ARM Embedded Trace Macrocell (ETM), nor the PCTrace component of the MIPS EJTAG debug logic uses JTAG to upload trace data. Both use JTAG to set modes of operation, but each has its own separate parallel bus, trace protocol and clock line that off-load trace data while the CPU is running.
This sets the stage for three limitations: (1) a maximum frequency to reliably collect valid data, (2) the costs of adding up to 20 additional pads on the chip, pins on the package and doubling the debug probe cost, and (3) no practical way to implement trace collection in SOCs with multiple CPU cores.
On-Chip Trace Buffers
The next phase of technology development for trace uses an on-chip trace buffer to collect data on the flow of program execution. This approach has a two-fold benefit. First, it enables the collection of trace data at the maximum CPU clock rate. Second, it minimizes cost, since the trace data is uploaded over the same JTAG scan chain path/pins used for normal “run-control” debugging.
ARM's new Embedded Trace Buffer (ETB) and MIPS' new PDTrace using Trace Control Block (TCB) provide comprehensive trace information for subsequent host processing, albeit at increased chip die area and cost.
Intel XScale applications processors (IOP, IXC, IXP and PXA) have used a creative variation on this theme in their implementations. Instead of collecting large amounts of trace status information for every instruction, the branch data and branch count data are collected in a much smaller on-chip trace buffer. From this data, the execution flow can be reconstructed, except that no real-time stamp is available.
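The reconstruction step can be sketched in a few lines of C: given a start address and a list of compressed records, each pairing a count of sequentially executed instructions with a branch target, the full program-counter stream is regenerated by replaying the counts. This assumes fixed 4-byte instructions and is an illustration of the idea, not Intel's actual trace format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One compressed trace record: the core executed `seq` sequential
 * instructions, then branched to `target`. (Hypothetical format.) */
struct branch_rec {
    uint32_t seq;
    uint32_t target;
};

/* Expand compressed branch records back into the full PC stream,
 * assuming fixed 4-byte instructions. Returns the number of PCs
 * emitted into `pcs` (at most `max`). */
static size_t reconstruct(uint32_t start_pc,
                          const struct branch_rec *rec, size_t nrec,
                          uint32_t *pcs, size_t max)
{
    size_t n = 0;
    uint32_t pc = start_pc;
    for (size_t r = 0; r < nrec; r++) {
        for (uint32_t i = 0; i < rec[r].seq && n < max; i++) {
            pcs[n++] = pc;   /* sequential instruction */
            pc += 4;
        }
        pc = rec[r].target;  /* branch taken: jump to recorded target */
    }
    return n;
}
```

Because only branches are recorded, the on-chip buffer covers far more execution history than a full per-instruction trace of the same size, at the cost of losing timing information.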
There are trade-offs here in the amount of trace data that can be displayed, based on both the technique used and the size of the on-chip buffer. So, the usual cost versus benefit trade-offs apply. Some amount of trace information is always better than none, and in typical debug scenarios, one should consider how much is really enough.
While extreme amounts of trace information provide a means of profiling code execution without “instrumenting” the code (which may affect performance), there is also a step up in development tool cost associated with it.
|Table 1: Applicability and benefits of JTAG throughout the development and product life cycles|
Other JTAG Applications
In addition to hardware/software debug and boundary scan testing, the JTAG connection is also being used by various tool providers for direct in-circuit flash programming, sending command scripts for execution on the target board, programming field programmable logic devices (FPLDs), and providing similar specialized debug capabilities for other functional blocks such as digital signal processor (DSP). Many sophisticated ICs include built-in self test capabilities that are initiated and post analyzed via JTAG.
Since the JTAG scan path provides direct core access, the benefits of applying JTAG methodology span from chip design through all phases of system development and the product lifecycle of embedded devices. As shown in Table 1, there is considerable value in applying the JTAG methodology to each of the major life cycle phases, from initial validation of the chip design prior to tape-out, to the implementation of system field support diagnostics and software updates.
Lyle Pittroff is project marketing manager, accelerated technology at Mentor Graphics and was the vice president of marketing at Embedded Performance, Inc., before it was acquired by Mentor.