With debugging features such as complex breakpoints, a trigger sequencer
and processor state storage now available in smaller 16-bit
microcontroller architectures, the hunt for the pesky bugs in
real-world applications, under real-world operating conditions, becomes
more than just a child's game.
Debugging is an essential part of an embedded developer's daily
life. As applications become more complex, the debug process becomes
more challenging and time-consuming. The drawback of using simulator
solutions for debugging is their inaccurate reproduction of the
application's operating conditions.
Many simulators only simulate the CPU core and a few peripherals,
but provide features such as code coverage, profiling and conditional
breakpoints. While this may suffice for testing complex software-only
algorithms, embedded applications normally involve a fair amount of
hardware-software interaction.
Real-time events occur, caused by externally connected circuitry and
microcontroller (MCU) integrated peripherals such as analog-to-digital
converters (ADC) and timer modules.
Also, the MCU can be part of a more complex system, and be
interfaced with external control and data storage ICs or used in
combination with a digital signal processor (DSP) to provide support
functions. Software for communicating with these external ICs cannot be
easily debugged and verified in the "clean room" environment of a CPU
core simulator.
One very popular way to address these drawbacks is through the use
of in-circuit emulators (ICEs). These combine the benefits of a
simulator with the ability to run software on actual hardware in
real-time.
This is achieved by replacing the MCU on the target board with a
chip-package adapter that is "hooked up" to a box of complex electronic
circuitry that mimics the MCU's behavior and provides debug
functionality.
Disadvantages of off-chip ICE
While usually very feature-rich, these emulators are
expensive, cumbersome and do not provide a 100 percent accurate
environment. This is especially true when it comes to sensitive analog
peripherals such as A/D and D/A converters, comparators, oscillators
and voltage references.
Given these disadvantages, embedded emulation, which adds debug
functionality to the MCU, has become very popular. Through embedded
emulation, the host PC used for debugging can directly interface to the
on-chip emulation logic using a serial interface like JTAG.
The application code runs on the MCU as it would without a connected
debug interface. The advantage is that, because the software is
developed under real-world operating conditions from the start, no
additional hardware testing is necessary after software development.
Today, embedded debug functionality is common even in smaller 16-bit
and 8-bit controller architectures. But most implementations only
offer a basic set of functions like memory access, CPU execution
control and hardware breakpoints.
This leaves much to be desired when compared to a full-blown
in-circuit emulator (ICE). The latest 16-bit MCU models go one
step further by providing a larger feature set, bringing functionality
closer to what one is used to getting from an ICE.
In many cases, this can save developers the expense of an ICE while
tackling even the toughest debug scenarios.
Using an on-chip emulation module
To illustrate the capabilities and advantages of such built-in
debug logic, consider the enhanced emulation module (EEM) of the MSP430 MCU
as an example. As shown in Figure 1,
below, this module incorporates the following functional blocks:
basic trigger inputs, trigger combination logic, trigger sequencer,
trigger action, state storage and clock control.
Figure 1. MSP430 Enhanced Embedded Emulation Module
For debugging with such an on-chip module, at least one of the
eight available basic trigger inputs must be configured. Commonly,
on-chip embedded emulation implementations only allow setting triggers
that monitor the CPU address bus (MAB) and stop the program execution
at a given memory location.
Triggers found in some modern-day MCUs such as those in the above
example, however, allow significantly more complex setups. In addition
to the address bus, any trigger can be configured to monitor the CPU
data bus (MDB), the internal CPU registers and some of the processor
control signals.
Furthermore, it is possible to apply bit masks to address and data
bus triggers to isolate certain values of interest if desired. Using an
additional constant, comparison options such as "equal to," "not equal
to," "less than," and "greater than" can also be applied. The
combination of these features enables a sophisticated breakpoint setup
versus the standard "break-at-address" debug functionality.
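The mask-then-compare behavior of a single basic trigger can be sketched as a small C model. This is only an illustration of the logic described above, not EEM register code: the type and function names (`basic_trigger_t`, `trigger_fires`) are hypothetical, and the real module is configured by the debugger rather than by application code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one basic trigger; not the real EEM register layout. */
typedef enum { CMP_EQ, CMP_NE, CMP_LT, CMP_GT } cmp_mode_t;

typedef struct {
    uint16_t   mask;   /* bits of the bus value that take part in the compare */
    uint16_t   value;  /* constant the masked bus value is compared against */
    cmp_mode_t mode;   /* comparison applied after masking */
} basic_trigger_t;

/* Returns true when the observed bus value makes the trigger fire. */
bool trigger_fires(const basic_trigger_t *t, uint16_t bus)
{
    uint16_t masked = (uint16_t)(bus & t->mask);

    switch (t->mode) {
    case CMP_EQ: return masked == t->value;
    case CMP_NE: return masked != t->value;
    case CMP_LT: return masked <  t->value;
    case CMP_GT: return masked >  t->value;
    }
    return false;
}
```

With a mask of 0x00FF and an "equal to" compare value of 0x0034, for example, this model fires on a bus value of 0x1234 but not on 0x1235.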
The trigger combination logic forms complex triggers out of the
basic trigger inputs that are available. Trigger events output by this
block are user-definable, logical AND-combinations of the basic trigger
inputs. For example, by combining an address with a data bus trigger, a
memory location can be monitored for certain values that are written or
read there.
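As a rough sketch, the AND-combination can be modeled with two bit fields, one bit per basic trigger. The function name and the bit assignment are assumptions made for this illustration; the article does not describe how the hardware encodes the selection.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * fired:  bit n is 1 when basic trigger n matched in the current bus cycle.
 * select: bit n is 1 when basic trigger n is part of this combination.
 * The combined event fires only when every selected trigger has fired.
 */
bool combined_event(uint8_t fired, uint8_t select)
{
    return select != 0 && (fired & select) == select;
}
```

Selecting triggers 0 and 1 (`select = 0x03`) yields an event when both match in the same cycle, regardless of what the unselected triggers do.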
This complex trigger event can then be used to directly stop a
program execution or generate a state storage event. Prior to that
function, trigger events can be processed by the trigger sequencer,
which is a built-in state machine that has four states.
Programmable transition conditions made out of incoming triggers are
used to generate transitions between the states. When the sequencer
reaches the final event state, the MCU can be configured to stop
program execution and/or generate a state storage event.
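A four-state sequencer of this kind can be approximated in C as a table-driven state machine. The sketch below assumes one programmable transition per state, which is a simplification made for illustration; the structure and names are hypothetical and do not mirror the EEM's actual configuration registers.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ_STATES      4u
#define SEQ_FINAL_STATE 3u

typedef struct {
    uint8_t trigger_mask[SEQ_STATES]; /* trigger events that leave each state */
    uint8_t next_state[SEQ_STATES];   /* state entered when one of them fires */
    uint8_t state;                    /* current state, starts at 0 */
} sequencer_t;

/*
 * Feeds the trigger events of one cycle into the sequencer (bit n = trigger n).
 * Returns true once the final state has been reached, which is where the
 * CPU stop and/or state storage action would be generated.
 */
bool sequencer_step(sequencer_t *s, uint8_t fired)
{
    if (s->state != SEQ_FINAL_STATE &&
        (fired & s->trigger_mask[s->state]) != 0) {
        /* Keep the state index in range even if the table is misconfigured. */
        s->state = (uint8_t)(s->next_state[s->state] % SEQ_STATES);
    }
    return s->state == SEQ_FINAL_STATE;
}
```

Configured with masks `{0x01, 0x02, 0x04, 0x00}` and next states `{1, 2, 3, 3}`, the model reaches its final state only after triggers 0, 1 and 2 have fired in that order.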
As mentioned earlier, any of the complex triggers can also be used
by the state storage unit. This is a circular buffer that holds up to
eight entries, each of which is a snapshot of the 16-bit address bus,
16-bit data bus and some important CPU control signals from the time
that the trigger occurred.
It can be seen as a simple trace buffer, which is capable of
capturing status information without affecting the real-time behavior
of software running on the MCU. System snapshots can be taken on a
basic trigger event, on a combination of triggers, on the
sequencer output or simply on every CPU clock cycle.
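The eight-entry circular buffer can be pictured with a short C model. The field layout, in particular the width and encoding of the control signals, is an assumption for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATE_STORAGE_DEPTH 8u

typedef struct {
    uint16_t address;  /* address bus (MAB) snapshot */
    uint16_t data;     /* data bus (MDB) snapshot */
    uint8_t  control;  /* CPU control signals; encoding is hypothetical */
} snapshot_t;

typedef struct {
    snapshot_t entry[STATE_STORAGE_DEPTH];
    uint8_t    next;   /* slot the next snapshot is written to */
    uint8_t    count;  /* number of valid entries, saturates at the depth */
} state_storage_t;

/* Stores one snapshot; once the buffer is full the oldest entry is overwritten. */
void state_storage_capture(state_storage_t *b, snapshot_t s)
{
    b->entry[b->next] = s;
    b->next = (uint8_t)((b->next + 1u) % STATE_STORAGE_DEPTH);
    if (b->count < STATE_STORAGE_DEPTH) {
        b->count++;
    }
}

/* Reads an entry by age (0 = newest). Returns false if it was never captured. */
bool state_storage_get(const state_storage_t *b, uint8_t age, snapshot_t *out)
{
    if (age >= b->count) {
        return false;
    }
    uint8_t idx = (uint8_t)((b->next + STATE_STORAGE_DEPTH - 1u - age)
                            % STATE_STORAGE_DEPTH);
    *out = b->entry[idx];
    return true;
}
```

After ten captures the model holds only the last eight, so the two oldest snapshots are no longer available.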
A useful add-on that comes with the EEM on some modern-day MCUs is
the clock control unit. Various peripherals such as an A/D converter,
LCD driver, timer, and serial communication module can be driven by one
of the three available internal clock tree signals.
When program execution stops, the clock control unit lets the developer
configure, module by module, which peripherals continue to be
clocked and which have their clock stopped while the processor is
halted during debugging. If the emulation module simply stops all
clocks (e.g., when a breakpoint is reached), unwanted side effects
could occur such as lost communication characters or erroneous A/D
conversion results.
Another possible implementation approach is to continue clocking
peripheral modules on emulation stop. A possible problem with this
solution, however, is that certain modules such as a timer could
permanently set interrupt flags even with the CPU halted, which could
make single-stepping through the source code quite challenging. Using
the clock control block, the developer can limit the clock distribution
to only the modules that are vital to the application.
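The per-module decision can be expressed as a simple bit-mask operation. The module list and bit positions below are hypothetical and chosen only to illustrate the idea; the real selection is made in the debugger's clock control dialog.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical module bits, one per peripheral. */
#define MOD_ADC   0x01u
#define MOD_LCD   0x02u
#define MOD_TIMER 0x04u
#define MOD_UART  0x08u

/*
 * enabled:      modules that are clocked while the CPU runs.
 * stop_on_halt: modules whose clock is gated off on emulation stop.
 * Returns the set of modules that are still clocked.
 */
uint8_t clocked_modules(uint8_t enabled, uint8_t stop_on_halt, bool cpu_halted)
{
    return cpu_halted ? (uint8_t)(enabled & ~stop_on_halt) : enabled;
}
```

Stopping only the timer on a halt keeps the A/D converter, LCD driver and serial module running, which avoids both the flood of timer interrupt flags and the lost communication characters.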
EEM triggers forming complex breakpoints
Popular on-chip emulation modules offer the ability to set basic
hardware program breakpoints. However, debug scenarios can be
simplified in many cases if there is a way to add a condition to the
breakpoint. Let's look at an example: an embedded application written
in C implements a complex state machine. The current state is stored in
a global variable, and gets updated in various places throughout the
source code. The problem is that the application exhibits unexpected
behavior; state '3' is entered under the wrong conditions.
Now, the challenge is to find the section of source code that caused
this unexpected transition. A complex EEM module trigger can now be
used to stop program execution whenever the value '3' is written into
the state machine variable 'StateVar'. This
complex breakpoint is a logical AND-combination of two basic EEM
triggers, generating a CPU stop event. Figure
2 below shows a simplified block diagram of how this complex
trigger can be implemented.
Figure 2. Combining Basic Triggers
One basic trigger is configured to monitor the device's internal
address bus (MAB) for the state machine variable 'StateVar'
address and the CPU write access control signal. The other trigger is
used to monitor the internal data bus (MDB) for the value of 3.
Configuring the EEM logic with this setup allows the program
execution to stop exactly at every instruction that writes the value
'3' into 'StateVar'.
This way, the code section that caused this write access can be
identified easily. The same mechanism applies not only to RAM
accesses, but also to Flash and peripheral module accesses.
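To make the 'StateVar' breakpoint concrete, the following C model replays a short list of bus cycles and reports the first one that would stop the CPU. The address 0x0220 is a made-up placeholder, and the bus cycle structure is a simplification for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STATEVAR_ADDR 0x0220u  /* hypothetical RAM address of StateVar */

typedef struct {
    uint16_t mab;    /* value on the address bus */
    uint16_t mdb;    /* value on the data bus */
    bool     write;  /* true for a CPU write access */
} bus_cycle_t;

/* AND-combination of the address/write trigger and the data trigger. */
bool statevar_break(const bus_cycle_t *c)
{
    bool address_trigger = c->write && (c->mab == STATEVAR_ADDR);
    bool data_trigger    = (c->mdb == 3u);
    return address_trigger && data_trigger;
}

/* Index of the first cycle that would stop the CPU, or -1 if none does. */
int first_break(const bus_cycle_t *trace, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (statevar_break(&trace[i])) {
            return (int)i;
        }
    }
    return -1;
}
```

Writes of other values to 'StateVar', reads of the value 3, and writes of 3 to other addresses all pass without stopping; only the write of 3 to the watched address matches.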
This complex breakpoint can be further configured by using the bit
mask feature, which is implemented into every basic EEM trigger. This
option can be described best by taking a look at a real-world example.
A customer using an ultra-low power MCU for a portable sports watch
application reported a problem: general-purpose pin 3 of port 1
was being set unexpectedly. In the application, port 1 controlled
various external circuits and was therefore accessed many
times during code execution.
The task here is to find out which CPU instruction and corresponding
line of source code caused this unexpected port pin modification. A
complex EEM breakpoint similar to the one just described can be used to
monitor write accesses to the port 1 output register.
Additionally, by using the bit-mask feature, bit 3 could be
isolated. This isolation is achieved by programming 0x0008 as the mask
AND-value, and 0x0008 as the trigger compare value. The MDB EEM trigger
hardware now performs a bit-wise AND-operation prior to every
comparison.
This way, the CPU is left running and the program execution stops
exactly at every line of code that sets bit 3 in the port 1 output
register. After executing the program a couple of times, a buggy
C-expression is identified as the cause of the unexpected behavior.
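The masked comparison from this example can be written out as a short C model. The mask and compare values (0x0008) come from the text above; the register address 0x0021 is an assumption that should be checked against the device header file, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define P1OUT_ADDR   0x0021u  /* assumed port 1 output register address */
#define PIN3_MASK    0x0008u  /* mask AND-value from the example */
#define PIN3_COMPARE 0x0008u  /* trigger compare value from the example */

/* True when a write to the port 1 output register has bit 3 set. */
bool pin3_set_break(uint16_t mab, uint16_t mdb, bool write)
{
    bool address_trigger = write && (mab == P1OUT_ADDR);
    bool data_trigger    = ((mdb & PIN3_MASK) == PIN3_COMPARE);
    return address_trigger && data_trigger;
}
```

Note that in this model the breakpoint fires on every write whose data has bit 3 set, including writes that leave an already-set pin unchanged, which is one reason the program may need to be run more than once to find the offending line.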
An awful bug - caught!
A common mistake in embedded applications is stack overflow. Most MCU
architectures allocate space for stack in RAM. However, RAM is a
limited and shared resource also used by other variables and program
elements.
A common MCU practice is to set the stack pointer to the top of the
RAM space during program initialization. For C programs, depending on
the development tool used, the linker allocates a section of RAM with a
default size of 0x50 bytes for stack use.
As software development proceeds and more global variables are added,
the linker will eventually report at link time that there is insufficient
memory available. The problem is that if the
stack size has not been carefully specified, the reserved space may not
be sufficient for the application.
When using dynamic memory allocation at run time and recursive
programming techniques, space can be consumed quickly. In addition, non-ideal
real-world events such as a bouncing button or other input signal
generating nested interrupts can push the stack pointer to the edge,
and ultimately over the edge. With no stack pointer run-time checking,
a developer runs the risk that the stack will grow into the range where
variables reside.
If this happens, vital application data can be corrupted with
hard-to-explain effects ranging from strange program behavior to a
total software crash. A mechanism that immediately stops program
execution when the stack pointer (SP) leaves the designated RAM area
could easily identify the problem.
Figure 3 below shows an
example of how to set-up a complex-breakpoint that monitors an MSP430
MCU SP. This and other screenshots are taken from the IAR Embedded
Workbench V3.20A IDE.
Figure 3. Stack Overflow Detection
With this particular device, 0xA00 - 0x50 = 0x9B0
was used as the lower limit. Should this breakpoint now stop program
execution, the stack contents can be examined to determine the root
cause of the overflow. If a run of identical values is found
there, this may indicate a bouncing port interrupt issue.
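The limit calculation and the resulting "less than" condition on the stack pointer can be checked with a few lines of C. The constants are the ones used in the example (RAM top at 0xA00, stack size 0x50); treating the trigger as a strict "less than" compare on SP is an assumption about how the breakpoint in Figure 3 is configured.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAM_TOP     0x0A00u  /* initial stack pointer, top of RAM */
#define STACK_SIZE  0x0050u  /* stack section reserved by the linker */
#define STACK_LIMIT (RAM_TOP - STACK_SIZE)  /* 0x09B0, lowest legal SP */

/* True when the stack pointer has left the reserved stack area. */
bool stack_overflow(uint16_t sp)
{
    return sp < STACK_LIMIT;
}
```

Because the stack grows downward from the top of RAM, any SP value below 0x9B0 means the stack has started to overwrite the variables located beneath it.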
Another form of complex breakpoint is the range
breakpoint. A range breakpoint monitors the address or the data bus
within or outside of a specified range. The debugger uses two EEM basic
triggers that are combined internally and are set to monitor the same
bus.
One trigger is configured for a "less than" comparison mode while
the other trigger is configured for a "greater than" comparison mode.
The program execution will stop only when these two conditions are
fulfilled. For instance, a range breakpoint can be used as a monitor to
ensure that no CPU instruction fetch occurs outside of the program
memory.
This can help in debug scenarios in which the program counter
becomes corrupted due to an incorrectly calculated indirect jump. This
can also be combined with read/write modifiers, to protect memory areas
from overwriting.
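A range check built from a "greater than" and a "less than" trigger might be modeled as follows. The program memory bounds are hypothetical, and the article does not say how the hardware forms the "outside of range" case, so this sketch simply negates the in-range result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical program memory bounds; both limits are exclusive. */
#define CODE_LOW  0x7FFFu
#define CODE_HIGH 0xFFFFu

/* Both comparison triggers must be fulfilled for an in-range match. */
bool fetch_in_code_range(uint16_t mab)
{
    bool greater_than_trigger = mab > CODE_LOW;
    bool less_than_trigger    = mab < CODE_HIGH;
    return greater_than_trigger && less_than_trigger;
}

/* Stops on an instruction fetch from outside the program memory. */
bool stray_fetch_break(uint16_t mab, bool instruction_fetch)
{
    return instruction_fetch && !fetch_in_code_range(mab);
}
```

An instruction fetch from a RAM address such as 0x0200 would stop execution in this model, while data accesses to the same address would not.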
Making triggers smart
Another real-world example of the power of the EEM involved a digital
still camera application, which used an MCU in conjunction with a DSP to
perform support functions such as keypad scan, power management and
real-time clock. Both processors were connected through a serial UART
link and the DSP sent multi-byte command sequences to the support MCU
to request the current battery status. In this example, however, the
MCU failed to react to the given command.
The developer faced this problem: how to introduce breakpoints into
the serial reception interrupt service function to determine the
root cause without disturbing the real-time behavior of the
application. With a conventional breakpoint in place, the DSP software timed out, the communication was
interrupted and the data exchange never reached the point of interest.
The solution lay in feeding three complex triggers into the trigger
sequencer and using a setup like the one shown in Figure 4 below, to halt program
execution exactly after reception of the last byte of the 3-byte long
command sequence.
Figure 4. Sequencer Control
From that point on, single-stepping showed the path of the CPU through the
program, and revealed why the command was never executed.
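The effect of this setup can be imitated on a received byte stream. The three command bytes below are invented for illustration, since the article does not list them, and the model does not reset on a mismatch, which the real sequencer configuration may handle differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 3-byte command; the real byte values are not given. */
static const uint8_t COMMAND[3] = { 0xA5u, 0x10u, 0x01u };

/*
 * Walks through the received bytes and advances one sequencer state per
 * matching command byte. Returns the index of the byte whose reception
 * would halt the CPU, or -1 if the sequence never completes.
 */
int halt_index(const uint8_t *rx, size_t n)
{
    size_t state = 0;

    for (size_t i = 0; i < n; i++) {
        if (rx[i] == COMMAND[state]) {
            state++;
            if (state == 3) {
                return (int)i;
            }
        }
    }
    return -1;
}
```

The CPU keeps running at full speed through the first two bytes, so the DSP sees no delay until the complete command has arrived.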
What is my program doing?
The state storage block is another powerful EEM component. When
debugging code and having the state storage module configured to
capture on every CPU instruction fetch cycle, the storage buffer
contains a history of the last eight assembler instructions executed
(8-level deep history).
When stopping program execution manually or through a breakpoint,
this list provides useful information as to what occurred prior to
stopping execution. When configuring this feature to collect data on a
basic or combined EEM trigger event, it is possible to log only certain
op-codes such as "jump" and "branching" instructions. This gives a
powerful instruction trace that allows inspection of the most recent
program flow.
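One way to picture such an op-code filter is a masked data-bus compare on fetched instruction words. The sketch below relies on the MSP430 jump instruction format, in which the top three bits of the opcode are 001; the function names are hypothetical. It does not cover calls or branches implemented as moves to the program counter, and it assumes that only instruction words, not operand words, are passed in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define JUMP_MASK    0xE000u  /* look at opcode bits 15..13 only */
#define JUMP_COMPARE 0x2000u  /* 001x xxxx xxxx xxxx = jump format */

/* True for MSP430 conditional and unconditional jump opcodes. */
bool is_jump_opcode(uint16_t opcode)
{
    return (opcode & JUMP_MASK) == JUMP_COMPARE;
}

/* Copies up to 'cap' jump opcodes from 'fetched' to 'out', in order. */
size_t filter_jumps(const uint16_t *fetched, size_t n,
                    uint16_t *out, size_t cap)
{
    size_t stored = 0;

    for (size_t i = 0; i < n && stored < cap; i++) {
        if (is_jump_opcode(fetched[i])) {
            out[stored++] = fetched[i];
        }
    }
    return stored;
}
```

Filtering this way means the eight buffer entries are spent on changes in program flow rather than on straight-line code.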
Because the state storage buffer is accessible through JTAG without
interfering with the CPU and target application operation, another
useful function is available: the implementation of a real-time watch.
This watch can be helpful in many debug scenarios, such as a motor
control application.
When breakpoints are inserted into the application to halt the
program and read out variables through normal watch windows, the
control algorithm is disrupted, which could possibly cause a breakdown
of the mechanical installation.
Combining one of the MCU's EEM triggers with the state storage
feature, it is possible to realize a real-time watch to monitor
application variables without modifying the program code itself. With a
trigger set to monitor a certain memory location containing the
variable of interest for write access, state storage events can be
generated.
The data-bus value transferred into the buffer now always contains
the last updated value of this variable. Figure 5 below shows a screen
capture of the State Storage Window as implemented in the IAR Embedded
Workbench.
Figure 5. State Storage Window
In this example, a global variable that contains the current motor
speed is located at address 0x200 and monitored for write accesses.
With the motor control algorithm running in real-time, the state
storage window is refreshed automatically and displays the most current
motor speed in the "Data bus" column, and all without affecting the
application execution.
The screenshot shows an increase in motor speed. Even though this
mechanism of using data-bus values as a real-time watch is limited to
16-bit values, it should be sufficient for most purposes.
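The real-time watch boils down to latching the data bus whenever the watched address is written. A minimal C model, using the address 0x200 from this example and otherwise hypothetical names, could look like this:

```c
#include <stdbool.h>
#include <stdint.h>

#define MOTOR_SPEED_ADDR 0x0200u  /* variable address from the example */

typedef struct {
    uint16_t last_value;  /* most recent value written to the variable */
    bool     valid;       /* false until the first write has been seen */
} rt_watch_t;

/* Called for every bus cycle; latches the data bus on a matching write. */
void rt_watch_bus_cycle(rt_watch_t *w, uint16_t mab, uint16_t mdb, bool write)
{
    if (write && mab == MOTOR_SPEED_ADDR) {
        w->last_value = mdb;
        w->valid = true;
    }
}
```

Reads of the variable and writes to neighboring addresses leave the latched value untouched, so the debugger always displays the last speed the application actually stored.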
Clocks under control
The EEM clock control block provided help during development of another
embedded application. One task of the MCU was to drive a MOSFET
switching transistor that was used in a boost-converter circuit to
generate a high voltage. The transistor was connected to a timer PWM
output with the duty cycle controlled by a software algorithm.
With the application running, the engineer needed to modify
parameters that were located in RAM. Before the application was stopped
to modify the RAM contents, the engineer disabled the "Stop Timer_B clock on emulation stop"
option in the clock control configuration dialog.
If the duty cycle of the output transistor was, for example, around
20% when program execution was stopped manually, that duty cycle was
maintained because the timer kept running and generating the
proper PWM waveform. Had the timer been stopped, the output could
have driven the switching transistor permanently,
overloading and possibly damaging the circuitry.
Conclusion
With the EEM features presented here, debugging becomes more advanced
and much easier when compared with some common basic embedded emulation
implementations. Another advantage is that there is no additional
tool cost, such as for an ICE, as all emulation functions are built into
the CPU core as a standard feature set.
With EEM, in-system debugging of complex real-world signal
processing applications that handle sensitive analog signals becomes
possible. If needed as an additional measure, galvanic isolation can
be easily added to the few signal lines that are used to communicate
with the CPU core (such as a JTAG interface) by using an isolated debug
interface.
This option would not be easily possible with the 64+ signal lines
of a comparable ICE. However, certain premium debug features like code
coverage, profiling, and a deep trace buffer will still require an ICE.
Future extensions such as profiling, improved real-time system access
and a deeper trace buffer are in development, further reducing the gap
to an ICE and supporting the embedded developer's needs.
Andreas Dannenberg is an MSP430
Applications Engineer for Texas Instruments
in Dallas, Texas. During his three years at the company, he has worked
on MSP430 software development and hardware design, as well as C2000
and C6000 DSP application development.