Using RTOS-aware debugging and Serial Wire Viewer to debug Cortex-M3/M4 designs
Two relatively new developments in debugging technology are RTOS- or kernel-aware (KA) debugging, and Serial Wire Viewer (SWV). KA is generic and processor independent, while SWV applies specifically to ARM Cortex microcontrollers. This article is specific to the Cortex-M3 and -M4 processors and examines the complementary nature of these technologies. Traditional run/stop debugging has been greatly enhanced in the ARM Cortex-M architecture by the introduction of Serial Wire Viewer and other capabilities of the CoreSight debug architecture.
Kernel-aware debugging views have evolved and matured in the past decade and are now considered indispensable for giving software testers general information on high-level RTOS objects. However, a limitation of KA is that the processor must be halted to gain information. With SWV, the real-time evolution of certain important RTOS constructs can be monitored quickly and easily, without purchasing additional tools. The combination of KA and SWV now provides insight into embedded applications that neither method alone was previously capable of providing.
Historical notes on JTAG debugging
In the late 1990s the JTAG (Joint Test Action Group) interface emerged as a replacement for the in-circuit emulator (ICE). The latter was running out of steam as controller clock frequencies increased, and the alternatives amounted to a grab bag of techniques ranging from blinking LEDs and printf()-style debugging through ROM monitors, ROM emulators, and various background debug mode (BDM) implementations. The JTAG interface, originally developed for boundary scanning of parts, quickly became the de facto method for hardware debugging in embedded design. It solved the limitations of ICE by incorporating three distinct building blocks:
- On-chip IP blocks that facilitate transmission of debugging information and control to the external debugging system on the host PC
- A debug probe, connected between the host PC and the target, that translates information from the on-chip IP to the communications channel (UART, USB or LAN) on the host PC
- An application on the PC that controls the device under test, exchanges information with it, and displays that information visually
Traditional JTAG debugging tools included mechanisms to run the application, run to a specific line of code, step, and hit breakpoints. In some cases, ancillary software extended the number of breakpoints useable for flash memory. Memory locations could be displayed and modified when the application halts. An extension of the on-chip IP called embedded trace macrocell (ETM) allowed instruction and data tracing. This was normally reserved for high-end parts.
Early in the last decade, ARM Ltd introduced the CoreSight specification for ARM Cortex microcontrollers, which greatly extended the capabilities of traditional JTAG debugging. However, the ARM Cortex-M3 and M4 microcontrollers that are the focus of this article do not have the full implementation of the CoreSight architecture. This was done to preserve the basic low-cost framework of the Cortex-M family of parts. To retain essential debugging power while still delivering an attractive cost saving, ARM's CoreSight incorporates the Serial Wire Viewer (SWV).
The practical benefits of ARM CoreSight
This development confers a number of practical benefits on ARM Cortex-M3 and M4 processors, including:
- Fewer pins required on die than traditional JTAG
- Reporting data in real-time as the application is running without incurring penalties in run-time behavior
- Intrusive but inexpensive printf()-style debugging via an instrumentation trace macrocell (ITM) channel
- Uses inexpensive debug probes
- Works during low power modes
- Statistical profiling of applications possible
- Reporting of interrupts and events possible
- Easy timing measurement of code sections possible due to time stamping capability
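The printf()-style debugging over the ITM channel listed above comes down to writing bytes to an ITM stimulus port. The following is a minimal sketch of that idea, modeled on the behavior of the CMSIS `ITM_SendChar()` helper rather than copied from any vendor library. The names `itm_regs_t`, `itm_putc`, `itm_puts` and `ITM_TARGET_BASE` are hypothetical, and the register block is passed in as a pointer so that the same code can be exercised on a host PC against a fake register block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register layout of the ITM block (offsets from the ITM base address). */
typedef struct {
    volatile uint32_t port[32];   /* 0x000: stimulus ports 0..31           */
    uint32_t reserved0[864];      /* 0x080..0xDFF                          */
    volatile uint32_t ter;        /* 0xE00: trace enable, one bit per port */
    uint32_t reserved1[15];
    volatile uint32_t tpr;        /* 0xE40: trace privilege                */
    uint32_t reserved2[15];
    volatile uint32_t tcr;        /* 0xE80: trace control, bit 0 = ITMENA  */
} itm_regs_t;

/* Hypothetical macro: on a Cortex-M3/M4 target the ITM sits at 0xE0000000. */
#define ITM_TARGET_BASE ((itm_regs_t *)(uintptr_t)0xE0000000u)

/* Send one character on stimulus port 0.
 * Returns 1 if the byte was written, 0 if tracing is disabled. */
static int itm_putc(itm_regs_t *itm, char ch)
{
    assert(itm != NULL);
    if ((itm->tcr & 1u) == 0u || (itm->ter & 1u) == 0u) {
        return 0;                 /* ITM or port 0 not enabled: drop the byte */
    }
    while (itm->port[0] == 0u) {
        /* Busy-wait until the port can accept data. This wait is what
         * makes ITM printf intrusive. */
    }
    /* A byte-wide access produces a packet with a one-byte payload. */
    *(volatile uint8_t *)&itm->port[0] = (uint8_t)ch;
    return 1;
}

/* Send a NUL-terminated string; returns the number of characters sent. */
static size_t itm_puts(itm_regs_t *itm, const char *s)
{
    size_t sent = 0;
    while (*s != '\0') {
        sent += (size_t)itm_putc(itm, *s++);
    }
    return sent;
}
```

On a target, a call such as `itm_puts(ITM_TARGET_BASE, "accel ready\n")` would be the typical use, assuming the debugger has already enabled the ITM and stimulus port 0. When no debugger has enabled tracing, the characters are simply dropped, so the call can stay in the code.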
However, there are a few drawbacks and limitations which must be borne in mind, namely:
- Data is transmitted through the SWO pin, a 1-bit interface, which limits information throughput (4-bit interfaces are available on certain parts, but the cost of the debug probe increases sharply)
- Real-time data monitoring is limited by number of comparators on die, typically four
- Instruction trace possible in parts with 4-pin interface, but no data tracing
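To get a feel for the throughput limit of the 1-bit SWO pin, a back-of-envelope calculation helps. The sketch below rests on stated assumptions rather than on figures from this article: SWO runs in asynchronous (UART-style) mode with 10 line bits per byte (start, 8 data, stop), each ITM stimulus packet is a 1-byte header plus 1, 2 or 4 payload bytes, and timestamps, synchronization packets and other trace sources are ignored. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Line bits per byte with UART-style framing: start + 8 data + stop. */
#define SWO_BITS_PER_BYTE 10u

/* Upper bound on ITM stimulus packets per second for a given SWO bit rate.
 * Each packet is one header byte followed by 1, 2 or 4 payload bytes. */
static uint32_t itm_packets_per_second(uint32_t swo_bit_rate_hz,
                                       uint32_t payload_bytes)
{
    assert(payload_bytes == 1u || payload_bytes == 2u || payload_bytes == 4u);
    uint32_t line_bytes_per_s = swo_bit_rate_hz / SWO_BITS_PER_BYTE;
    return line_bytes_per_s / (1u + payload_bytes);
}

/* Upper bound on useful payload bytes per second. */
static uint32_t itm_payload_bytes_per_second(uint32_t swo_bit_rate_hz,
                                             uint32_t payload_bytes)
{
    return itm_packets_per_second(swo_bit_rate_hz, payload_bytes)
           * payload_bytes;
}
```

With an illustrative SWO bit rate of 2 Mbit/s, character-at-a-time output tops out at 100,000 characters per second, while 32-bit writes carry at most 160,000 payload bytes per second. Hardware-generated trace data shares the same pin, so the budget available to any one source is smaller in practice.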
To compare and contrast a CoreSight-capable debugging environment with the traditional JTAG run/stop capability characteristic of, say, ARM7TDMI parts, examine Figures 1 and 2 below.
Figure 1 shows windows such as variables, breakpoints, memory, CPU core and special function registers.
Figure 2 shows considerable additional information including:
- Real-time (without core overhead penalties or intrusion) plot of memory locations (X-Y-Z accelerometer variable values in this case)
- Statistical profile of where the application is spending its time
- A printf() output via ITM from the accelerometer servicing tasks
- Time indexed record of all system interrupts and exceptions with their descriptions
- Interrupt and exception statistics
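A statistical profile of this kind is typically built from periodic program counter samples (on Cortex-M3/M4 the trace hardware can emit such samples), which the host tool then attributes to functions. The sketch below illustrates only that attribution step. It is not how any particular debugger implements it, and the symbol table layout, function names and addresses are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a (hypothetical) symbol table plus its sample counter. */
typedef struct {
    const char *name;
    uint32_t start;   /* first address of the function      */
    uint32_t end;     /* one past the function's last byte  */
    uint32_t hits;    /* number of PC samples in this range */
} func_range_t;

/* Attribute each sampled PC to the function that contains it.
 * Returns the number of samples that matched a known function. */
static size_t profile_samples(func_range_t *funcs, size_t nfuncs,
                              const uint32_t *pcs, size_t npcs)
{
    assert(funcs != NULL && pcs != NULL);
    size_t matched = 0;
    for (size_t i = 0; i < npcs; i++) {
        for (size_t f = 0; f < nfuncs; f++) {
            if (pcs[i] >= funcs[f].start && pcs[i] < funcs[f].end) {
                funcs[f].hits++;
                matched++;
                break;
            }
        }
    }
    return matched;
}

/* Share of all samples that landed in one function, in whole percent
 * (rounded down). */
static uint32_t percent_of(uint32_t hits, size_t total)
{
    return total ? (uint32_t)((hits * 100u) / total) : 0u;
}
```

Because the result is a sampling estimate, it shows where the application tends to spend its time rather than giving an exact cycle count, and its resolution improves as more samples are collected.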
An essential feature of this information is the way that it is displayed. At a glance, developers can quickly visualize the “health” of the application, and address questions such as:
- Is the data from sensors evolving in real time as expected when stimulated?
- Are interrupts firing on time, and in the correct sequence?
- Is the application building up excessive amounts of processor time in certain functions or regions of the code?
These are extremely valuable pieces of information once all the various components of the application, e.g. peripheral drivers, application code, RTOS and possibly TCP/IP stack, embedded flash file system and USB stack, are bound together. Problems in one component of the code can easily be caused by problems in another, seemingly unrelated area. In this case, with raw data efficiently turned into information, a variety of potential problems can be ruled out by simple and quick visual inspection.
Using an RTOS in embedded applications and its effect on debugging
Including an RTOS in an embedded application has a number of advantages and disadvantages that have been extensively discussed over the years in many articles, whitepapers, forums, and development groups. In recent years, the improving cost, performance, and on-chip resources of processors like the Cortex-M3/M4 have lowered the barrier to RTOS adoption.
Because the use of an RTOS was problematic in the days of in-circuit emulators, in many quarters it is still regarded as an additional layer of difficulty and effort added to the debugging problem, even though contemporary debugging tools such as JTAG and Serial Wire Viewer work reliably with RTOS applications. The correct way to assess the impact of RTOS usage on the debugging problem is to consider what kind of information is required when debugging an RTOS-based application. Every RTOS implementation is characterized by the following:
- Breaking the application into discrete tasks managed by the RTOS
- Use of an API for creating, managing and possibly deleting tasks
- Use of various mechanisms for inter-task communication and synchronization
- Use of RTOS provided time management capabilities