Monitor-Based Debugging

A ROM monitor is an inexpensive, but powerful, debugging aid. Follow these steps to make a basic monitor even more powerful.

What do you think of when you think about debugging an embedded system? I suppose this depends on your background and the budget allocated for tools. You're likely to consider a JTAG/BDM-based debug port, an emulator or logic analyzer, printf(), or one of the many sophisticated source-level debuggers available today. Each solution has its own set of pros and cons. Some are very powerful, but come at a hefty price. Some are tied to a particular compiler tool set, while others are only useful on certain CPU families. The JTAG/BDM-based debug port is probably the most common now because it strikes a good balance between cost and capability. Even so, these devices can cost a few thousand bucks and are useful only while connected to the system, which usually means a bulky pod hanging fairly close to the target.

The topic of this article is somewhat of a dying art: monitor-based debugging. What is monitor-based debugging? First, it assumes that your application is being run on a system that boots with a monitor of some kind. The boot monitor is the base platform on which the application resides, and if the application crashes the monitor takes over.

When in ROM

Monitor-based debugging relies on capabilities built into the processor, and in most cases (not always), this requires the ability to write into your instruction space. Typically, breakpoints are set in the monitor's command line interface (CLI), then control is turned over to the application. If the instruction at which the breakpoint is set executes, a branch is taken out of the application and control is returned to the monitor's CLI. Now the monitor has some ability to display memory, maybe single step, and maybe even return to the application at the location where the breakpoint was hit.

Sounds pretty good, right? Well, it can be, but a lot of complications can come up, such as:

  • How do you display memory? Typically, a monitor can display raw memory as a block of 1-, 2-, or 4-byte units, but you have to specify the address in hex. Since you are running at the CLI of the target, it's likely that all you can do is refer to the output map file generated by the linker to determine where the symbol is in data/bss address space. You have to correlate the symbol to some hex address. If the build changes, so does the memory map, so the next time you want to look at the same piece of data, you have to once again make the address-to-symbol correlation. In addition, the monitor does not know how to display in the format that you want, like shorts, longs, and char strings. And forget about structure display.
  • How do you set the breakpoint? Similar to the previous problem, you first look up the address of the function, then you issue some command like “b 0x123456” where 0x123456 is the address of the function at which you want to set the breakpoint.
  • How does the monitor talk to the serial port once the breakpoint occurs? The monitor takes over as a result of a breakpoint; but the application now owns the serial port. If the monitor reinitializes the serial port for its use, the interface between the application and this port is likely to be messed up. This means it will be very difficult to return control to the application.
  • How does the monitor temporarily shut down the application? This gets tricky when the application is running on an RTOS, with interrupts enabled and a variety of different peripherals configured.
  • Single-stepping is now at the assembler level, not the C-language level. This isn't very useful to the programmer.

Maybe this explains why monitor-based debugging isn't that popular anymore. Before we give up on it though, I'd like to re-investigate the topic and see if there might be some breath left in the old beast.

Debug philosophy

Let's start by setting some boundaries. We have to accept the fact that a monitor-based debugging environment has limitations. After we establish some guidelines and re-think the way some of this stuff is done, I think you'll agree that there is quite a bit of capability still left. So let's establish a debugging model for the boot monitor. What do we get, and what do we sacrifice?


For memory display, we will have the ability to display “C-like” data: character strings, chars, shorts, ints, longs, and even data structures. The data structures can be displayed individually, in tables, or as a linked list. We will have the ability to reference data symbolically. This means that a global variable called “SysTick” can be referenced as “SysTick” without any need to know where it is in memory. Instead of thinking within the confines of a typical breakpoint, we will have run-time analysis that includes the breakpoint as one of its features. There are standard breakpoints that terminate execution of the application and “auto-return” breakpoints used for runtime analysis. After a breakpoint or exception is trapped by the monitor, a symbolic stack trace can be done.


We will not have any access to source code line number information; however, if the compiler supports the ability to dump mixed source and assembly code, this gets us around that problem. We will not consider single-stepping because it is assembly level; however, if we implement everything else, assembly-level single-stepping is a handy freebie. Since we only have access to global variables, the stack trace will not display parameters passed, nor will we be able to retrieve local data from a stack. The final, and most significant limitation put on this debugger is that control can't be returned to the running application after a hard breakpoint.

If these considerations are acceptable, the end result is a debugging environment that lives with the application. It can be shipped with the product and used in the field, and is somewhat independent of the compiler toolset, RTOS used, and to a degree, the CPU and hardware.

Monitor assumptions

For the remaining discussion, we will make some assumptions about the underlying monitor facility:

  • The monitor includes a flash file system.
  • The monitor's CLI can process symbols based on a file that is in the file system.
  • The monitor can execute files as a list of commands (a script).

For example, if we have a file called symtbl in flash and it has the following lines in it:

main	0x123456
func1	0x123600
func2	0x123808
varA	0x128000
varB	0x128004
varC	0x12800c

and we execute a script with the following two lines in it:

echo The address of main() is %main
echo “varA” is located at %varA

The output will be:

The address of main() is 0x123456
“varA” is located at 0x128000

This demonstrates the monitor's ability to interact with its own file system, process symbols based on a special file called symtbl , and execute a script that can be a series of monitor commands that use the symbol lookup capability of the monitor. The only thing needed is the ability to generate a symtbl file. The specifics on how to create this symtbl file will depend on the toolset (compiler/linker) used.
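The symbol substitution shown above can be sketched in a few lines of C. This is only an illustration of the idea, not the monitor's actual code: the function names sym_lookup() and sym_expand() are hypothetical, and the symbol table is assumed to be already in memory as text with one "name value" pair per line.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find `name` in symbol-table text made of
 * "name<whitespace>value" lines. Copies the value into `out` and
 * returns 1 on a match, 0 otherwise. */
static int sym_lookup(const char *symtbl, const char *name,
                      char *out, size_t outsz)
{
    size_t nlen = strlen(name);
    const char *line = symtbl;

    assert(outsz > 0);
    while (line && *line) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        /* The name must be followed by whitespace on the same line. */
        if (len > nlen && strncmp(line, name, nlen) == 0 &&
            isspace((unsigned char)line[nlen])) {
            const char *val = line + nlen;
            const char *end = line + len;
            size_t vlen;

            while (val < end && isspace((unsigned char)*val))
                val++;
            while (end > val && isspace((unsigned char)end[-1]))
                end--;
            vlen = (size_t)(end - val);
            if (vlen >= outsz)
                vlen = outsz - 1;
            memcpy(out, val, vlen);
            out[vlen] = '\0';
            return 1;
        }
        line = eol ? eol + 1 : NULL;
    }
    return 0;
}

/* Hypothetical helper: copy `in` to `out`, replacing each %symbol token
 * with its value. Unknown symbols are copied through unchanged. */
static void sym_expand(const char *symtbl, const char *in,
                       char *out, size_t outsz)
{
    size_t o = 0;

    assert(outsz > 0);
    while (*in && o + 1 < outsz) {
        if (*in == '%' && (isalpha((unsigned char)in[1]) || in[1] == '_')) {
            char name[64], val[64];
            size_t n = 0;
            const char *p = in + 1;
            int complete;

            while ((isalnum((unsigned char)*p) || *p == '_') &&
                   n + 1 < sizeof(name))
                name[n++] = *p++;
            name[n] = '\0';
            /* Skip substitution if the name was too long to hold. */
            complete = !(isalnum((unsigned char)*p) || *p == '_');
            if (complete && sym_lookup(symtbl, name, val, sizeof(val))) {
                size_t vlen = strlen(val);

                if (vlen > outsz - 1 - o)
                    vlen = outsz - 1 - o;
                memcpy(out + o, val, vlen);
                o += vlen;
                in = p;
                continue;
            }
        }
        out[o++] = *in++;
    }
    out[o] = '\0';
}
```

A CLI built this way would run sym_expand() on each command line before tokenizing it, so every command gets symbol support without knowing anything about symbols.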


A breakpoint forces the application (at a particular address or event) to turn control over to the monitor/debugger. When the application relinquishes control, all context is made available to the debugger so that it can display variables, dump the stack, and so on. For our discussion there are two distinct types of breakpoints: hard and soft.

A hard breakpoint allows the monitor's CLI to take control. There is no way to resume or continue the application code once this breakpoint occurs, unless the application is restarted.

Soft breakpoints (also called tracepoints ) are used for run-time analysis (an “auto-return” breakpoint). The breakpoint occurs and the monitor code (through the exception handler) is executed, but the monitor returns control to the application in real time. As will be seen in the text below, this is the harder of the two to implement.

There are several different types of soft breakpoints. Each type alters some state maintained by the monitor so that statistics can be gathered and used by the developer or by the monitor itself to possibly change the action taken by the soft breakpoint handler.

Both types of breakpoint are usually inserted into the application by replacing the instruction at the specified address with some other instruction that causes an exception to occur (from this point on we will refer to this instruction as a trap).1 The exception handler used must be configured to enter the monitor after storing the entire context (or register set) of the CPU. Some debugger-specific code is executed and, depending on the type of breakpoint, the monitor's CLI comes up (hard breakpoint) or control is returned to the application (soft breakpoint).

When control is automatically returned to the application, the monitor must restore all the registers that were active at the time of the breakpoint and return to the point where the breakpoint occurred. To do this, the instruction must be reinserted at its original location. That single instruction must be executed, then the trap must be re-inserted into the memory so that if that instruction is ever executed again, another breakpoint will occur. Also, we need to be aware that we are using data accesses to modify instruction space; hence, we may have to deal with a cache coherency issue. This is pretty complicated. Let's list the steps:

1. If the application controls all exception handlers, we must re-configure the exception handler corresponding to the trap so that it points to code owned by the monitor. To support soft breakpoints, we must also configure the processor's trace (or single step) exception handler to be owned by the monitor.
2. Insert trap(s) into the instruction address space (be aware of cache).
3. Turn control over to the application that is to be debugged.
4. At the time of the exception, copy all registers to a local area accessible by the monitor.
5. Determine the type of exception and take the appropriate action. If it's a hard breakpoint, branch to the monitor's CLI; otherwise, take the appropriate action based on the type of soft breakpoint and continue with Step 6.
6. Install the original instruction back into the address space (be aware of cache).
7. Restore the register context that was stored away earlier.
8. Put the processor into “trace” mode and return from the exception to the address that now contains the original instruction.
9. Immediately after this instruction is executed, the trace exception will occur and the monitor code must now reinstall the trap instruction and once again return control to the application. This re-installation is necessary so that the breakpoint will be active for the next time the CPU executes that instruction.
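The trap/restore/trace/re-install cycle in Steps 2 through 9 is easier to follow on a toy machine than on a real CPU. The sketch below is a simulation only: the three-opcode "CPU", the struct names, and the functions bp_install() and toy_run() are all made up for illustration, and register save/restore and cache maintenance are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Opcodes of a made-up machine. OP_TRAP stands in for whatever
 * instruction raises the breakpoint exception on a real CPU. */
enum { OP_NOP, OP_INC, OP_LOOP, OP_HALT, OP_TRAP };

struct toy_cpu {
    int    code[8];   /* writable "instruction space" */
    size_t pc;
    int    acc;       /* incremented by OP_INC */
    int    loops;     /* OP_LOOP jumps to 0 while this is > 0 */
    int    trace;     /* single-step (trace) mode flag */
};

struct soft_bp {
    size_t addr;      /* where the trap was inserted */
    int    saved_op;  /* original instruction */
    int    hits;      /* statistic gathered by the "monitor" */
};

/* Step 2: save the original instruction and insert the trap. */
static void bp_install(struct toy_cpu *cpu, struct soft_bp *bp, size_t addr)
{
    assert(addr < sizeof(cpu->code) / sizeof(cpu->code[0]));
    bp->addr = addr;
    bp->saved_op = cpu->code[addr];
    bp->hits = 0;
    cpu->code[addr] = OP_TRAP;
}

/* Steps 3-9: run the application with an auto-return breakpoint. */
static void toy_run(struct toy_cpu *cpu, struct soft_bp *bp)
{
    for (;;) {
        int op = cpu->code[cpu->pc];

        if (op == OP_TRAP) {
            /* Breakpoint exception: gather statistics, put the original
             * instruction back, and resume in trace mode (Steps 5-8). */
            bp->hits++;
            cpu->code[bp->addr] = bp->saved_op;
            cpu->trace = 1;
            continue;                 /* re-execute at the same pc */
        }
        if (op == OP_HALT)
            return;
        if (op == OP_INC) {
            cpu->acc++;
            cpu->pc++;
        } else if (op == OP_LOOP) {
            if (--cpu->loops > 0)
                cpu->pc = 0;
            else
                cpu->pc++;
        } else {
            cpu->pc++;                /* OP_NOP */
        }
        if (cpu->trace) {
            /* Trace exception after one instruction: re-arm the
             * breakpoint and leave trace mode (Step 9). */
            cpu->code[bp->addr] = OP_TRAP;
            cpu->trace = 0;
        }
    }
}
```

Running a three-pass loop with the breakpoint on its second instruction takes the breakpoint three times, and the application's own result is unchanged.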

The code behind all this is certainly not trivial to implement, and because of the processor specifics, it is beyond the scope of this article. Varying degrees of complexity can be avoided, depending on what functionality is needed. The soft breakpoint could be limited so that it causes the breakpoint to occur only on the first pass through the code. After the first occurrence, the original instruction is put back into memory and full-speed execution is resumed. This eliminates the trace-mode handling and trap re-installation of Steps 8 and 9, but also eliminates the ability to take that breakpoint again in real time. To simplify things further, the whole soft breakpoint mechanism can be omitted and then the exception handler doesn't even have to worry about anything after storage of the register context in Step 4. In many systems, this is a reasonable limitation and eliminates a great deal of the complexity of the exception handler.

Code analysis

This section assumes that you are going to bite the bullet and implement the whole nine yards discussed in the previous section. The idea behind code analysis at this level is to provide the developer with some convenient mechanisms through which information can be gathered while the application is running at (almost) full speed. It's the use of the “soft” breakpoint mechanism described above.

Instead of establishing some set of soft breakpoints, let's configure the breakpoints so they can pass through some user-configurable logic (part of the monitor code) to determine what to do as a result of a breakpoint. Change the model just a bit: instead of just setting a breakpoint, we will set up some logical set of steps to occur as a result of some event. The event is the processor taking some kind of “breakpoint” exception and the logic is a piece of breakpoint-specific code that can be configured to perform an action or perform an action based on some condition. The monitor command syntax looks like this:

at {breakpoint tag} [if condition] {action}

There are three pieces to this command syntax after the command itself:

  • The breakpoint tag correctly implies that the “at” command must be coordinated with some other processor-specific mechanism that sets breakpoints (using some of the methods just discussed). This tag is processor specific because the breakpoint mechanism is processor specific; however, this “at” mechanism is processor independent.
  • The if condition is an optional test that can become part of the logic to decide whether or not to perform the action. The conditions can be the non-zero return of a specified function call, an “at” variable reaching some count, or the “at” flag containing some predetermined bit setting.
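  • The action is the operation performed when the breakpoint occurs (and the condition, if one is given, is true). It either adjusts state kept by the “at” logic, such as a variable or flag, or hands control to the monitor.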

A few examples illustrate this idea:

Example 1: The following sequence of “at” commands will establish a counting breakpoint based on the exception that occurs as a result of a DATA_1RD breakpoint. The breakpoint mechanism is CPU dependent, so for the sake of this discussion, it might be an exception that occurs using the first data breakpoint provided by the CPU. When the exception occurs, the “at handler logic” is part of the exception handler and for each “at” statement established by the user, there is one pass through the logic. The first pass increments the ATV1 variable (within the context of the “at” command) and the second pass checks to see if ATV1 is 5 and, if it is, halts the application and turns over full control to the monitor's CLI.

# Increment internal variable
at DATA_1RD ATV1++
# If ATV1 equals 5, then break
at DATA_1RD if ATV1==5 BREAK

Example 2 : This next set of “at” commands will break when breakpoint ADDR_1 is executed after breakpoint ADDR_2 has executed:

# Set bit 1 of the internal flags
at ADDR_1 FSET01
# If both bits 1 and 2 are set, break
# Clear bit 1 of the internal flag
at ADDR_2 FCLR01
# Set bit 2 of the internal flags
at ADDR_2 FSET02

Example 3 : This example demonstrates the idea of using functionality within the application to aid in the code analysis. The break will occur if the function at address 0x1234 returns 1.

at ADDR_1 if 0x1234()==1 BREAK

Example 4: As a final example, let's use the “at” command to help detect a memory leak. Assume ADDR_1 is malloc, ADDR_2 is free, and at the time ADDR_3 is hit, we expect all allocated memory to have been freed. We can verify the differential between malloc/free calls by observing the content of the ATV1 variable after the breakpoint:

# Increment ATV1 at ADDR_1 (malloc)
at ADDR_1 ATV1++
# Decrement ATV1 at ADDR_2 (free)
at ADDR_2 ATV1--
# Break at ADDR_3
at ADDR_3 BREAK

Get the idea? The point is that there is a lot more capability behind a simple breakpoint that can be taken advantage of. The back-end, CPU-specific stuff is the same stuff that would be used to implement basic breakpoints, but adding the “[if-condition] {action}” extension puts a whole new spin on monitor-based breakpoints. Note that there is a bit of a real-time hit here because you are inserting code into the runtime stream of the application. This must be considered, but the minor hit is usually acceptable.
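To make the “at” logic concrete, here is a minimal sketch of how the handler might walk its list of entries each time a breakpoint exception arrives. It is not the monitor's implementation: the struct layout, the enum names, and at_dispatch() are assumptions, and only the counter-style condition and actions used in Examples 1 and 4 are modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical breakpoint tags; real tags are CPU specific. */
enum { TAG_ADDR_1 = 1, TAG_ADDR_2, TAG_ADDR_3, TAG_DATA_1RD };

enum at_cond   { COND_NONE, COND_VAR_EQ };          /* [if condition] */
enum at_action { ACT_VAR_INC, ACT_VAR_DEC, ACT_BREAK };
enum at_result { AT_RESUME, AT_BREAK };

struct at_entry {
    int            tag;         /* breakpoint this entry is attached to */
    enum at_cond   cond;
    long           cond_value;  /* compared with ATV1 for COND_VAR_EQ */
    enum at_action action;
};

struct at_state {
    long atv1;                  /* the "at" variable */
};

/* Called from the exception handler with the tag of the breakpoint that
 * fired. Makes one pass per matching entry, in the order entered. */
static enum at_result at_dispatch(const struct at_entry *tbl, size_t n,
                                  int tag, struct at_state *st)
{
    size_t i;

    assert(tbl != NULL && st != NULL);
    for (i = 0; i < n; i++) {
        if (tbl[i].tag != tag)
            continue;
        if (tbl[i].cond == COND_VAR_EQ && st->atv1 != tbl[i].cond_value)
            continue;           /* condition false: skip the action */
        switch (tbl[i].action) {
        case ACT_VAR_INC:
            st->atv1++;
            break;
        case ACT_VAR_DEC:
            st->atv1--;
            break;
        case ACT_BREAK:
            return AT_BREAK;    /* hard break: enter the monitor's CLI */
        }
    }
    return AT_RESUME;           /* soft break: return to the application */
}
```

With the two entries from Example 1 loaded, the first four DATA_1RD exceptions return AT_RESUME and the fifth returns AT_BREAK.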

Debug hooks

One of the limiting factors of monitor-based debugging is that it usually requires that the instruction space be modifiable. For the vast majority of embedded designs, this is not the case because the instruction space is in EPROM or flash2 , and the CPU executes the code directly from that space.

Don't despair! Many of today's processors are equipped with special debug capabilities that overcome this limitation. Debug registers add to the versatility of the monitor-based debugger because the CPU can be configured to take a breakpoint based on one instruction address (or a range) without the requirement that the instruction space be written. Additionally, some support data breakpoints. This means that a breakpoint can be established based on a piece (or range) of data being accessed. The breakpoint can usually be established based on a read and/or write of the data space.

This added debug capability (which differs from one CPU to the next) means that the monitor must be able to deal with it. Instead of the generic mechanism of inserting some trap into the instruction space, you now have to be able to configure some set of registers to do something special. Data breakpoints add even more complexity to the monitor code (but they're worth it), because the address at which the exception occurred no longer identifies the breakpoint that was hit. The breakpoint handler must first look at the CPU state to see if the exception occurred as a result of a data-access breakpoint. If it did, other CPU state must be retrieved to determine the source.
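As a rough illustration of that last point, the handler can classify the exception from a debug status word instead of from the faulting address. Everything about the register here is invented for the example: the bit layout, the DBG_ macros, and bp_classify() do not describe any particular CPU, whose manual would define the real status bits.

```c
#include <assert.h>
#include <stdint.h>

/* Invented debug-status layout, for illustration only:
 * bits 0..3 = instruction breakpoint n hit, bits 4..7 = data breakpoint n hit. */
#define DBG_NUM_BP       4
#define DBG_HIT_INSN(n)  ((uint32_t)1 << (n))
#define DBG_HIT_DATA(n)  ((uint32_t)1 << (DBG_NUM_BP + (n)))

enum bp_kind { BP_NONE, BP_INSN, BP_DATA };

struct bp_source {
    enum bp_kind kind;
    int          index;   /* which debug register fired, or -1 */
};

/* Decide which breakpoint caused the exception from the saved status word. */
static struct bp_source bp_classify(uint32_t dbg_status)
{
    struct bp_source src = { BP_NONE, -1 };
    int n;

    /* Data breakpoints first: for these, the exception address points at
     * the code doing the access, not at anything the user configured. */
    for (n = 0; n < DBG_NUM_BP; n++) {
        if (dbg_status & DBG_HIT_DATA(n)) {
            src.kind = BP_DATA;
            src.index = n;
            return src;
        }
    }
    for (n = 0; n < DBG_NUM_BP; n++) {
        if (dbg_status & DBG_HIT_INSN(n)) {
            src.kind = BP_INSN;
            src.index = n;
            return src;
        }
    }
    return src;
}
```

The index returned here is what the processor-independent “at” logic would use as its breakpoint tag.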

Memory display

Almost any monitor will provide some type of memory display command. Similar to the above breakpoint facility, if it is implemented correctly, it can be a useful tool, even for the high-level software developer. If the display supports the hex and decimal display of address space along with support for 1-, 2-, and 4-byte data units, then that plus the CLI's fundamental ability to deal with symbols allows the monitor to display variables in their appropriate form. For example, let's assume we have a short variable called varB and we want to display it in decimal. This might be done with:

dm -2d %varB 1

where “dm” is the display memory command name, “-2d” is an option string indicating that the data is to be displayed in 2-byte decimal units, “%varB” is the name of the variable to be displayed, and “1” indicates that only one unit is to be displayed. The result is that you have the ability to display variables just as you would with a high-level debugger, and all you need to do is make sure that your on-board symtbl file is in sync with the application being debugged. You could go one step further and build a few simple scripts that save on the typing. For example, we can build the following scripts for displaying various different integer formats:

file int2:	dm -2d $ARG1 1
file uint4:	dm -4 $ARG1 1

Now instead of typing “dm -2d %varB 1” the int2 script could be used:

int2 %varB

A simple “-s” option could also be incorporated so that memory could be displayed as character strings instead of raw hex. The code for the dm command is shown in Listing 1.

Listing 1

/* Dm():
 * Display memory.
 *
 * Arguments...
 *	arg1: address to start display
 *	arg2: if present, specifies the number of units to be displayed.
 *
 * Options...
 *	-2	a unit is a short.
 *	-4	a unit is a long.
 *	-b	print chars out as is (binary).
 *	-d	display in decimal.
 *	-f	fifo-type access (address does not increment).
 *	-m	prompt user for more.
 *	-s	print chars out as is (binary) and terminates at a null.
 *	-v {varname} assign last value displayed to shell var varname.
 *
 * Defaults...
 *	Display in hex, and unit type is byte.
 */

char *DmHelp[] = {
	"Display Memory",
	"-[24bdfs] {addr} [cnt]",
	" -2 short access",
	" -4 long access",
	" -b binary",
	" -d decimal",
	" -f fifo mode",
	" -m use 'more'",
	" -s string",
	" -v {var} load 'var' with element at addr",
	0,
};

#define BD_NULL		0
#define BD_RAWBINARY	1
#define BD_ASCIISTRING	2

int
Dm(int argc,char *argv[])
{
	int	i, count, width, opt, more, size, fifo;
	int	hex_display, bin_display;
	char	*varname, *prfmt, *vprfmt;
	uchar	*cp, cbuf[16];
	ushort	*sp;
	ulong	*lp, add;

	width = 1;
	more = fifo = 0;
	bin_display = BD_NULL;
	hex_display = 1;
	varname = (char *)0;
	while((opt=getopt(argc,argv,"24bdfmsv:")) != -1) {
		switch(opt) {
		case '2':
			width = 2;
			break;
		case '4':
			width = 4;
			break;
		case 'b':
			bin_display = BD_RAWBINARY;
			break;
		case 'd':
			hex_display = 0;
			break;
		case 'f':
			fifo = 1;
			break;
		case 'm':
			more = 1;
			break;
		case 'v':
			varname = optarg;
			break;
		case 's':
			bin_display = BD_ASCIISTRING;
			break;
		default:
			return(0);
		}
	}

	add = strtoul(argv[optind],(char **)0,0);
	if (hex_display)
		vprfmt = "0x%x";
	else
		vprfmt = "%d";

	do {
		if (argc-(optind-1) == 3) {
			count = strtoul(argv[optind+1],(char **)0,0);
			count *= width;
		}
		else
			count = 128;

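		if (bin_display != BD_NULL) {
			cp = (uchar *)add;
			if (bin_display == BD_ASCIISTRING) {
				puts(cp);
				if (varname) {
					shell_sprintf(varname,vprfmt,cp+strlen(cp)+1);
				}
			}
			else {
				for(i=0;i<count;i++)
					putchar(*cp++);
			}
		}
		else if (width == 1) {
			cp = (uchar *)add;
			if (hex_display)
				prfmt = "%02X ";
			else
				prfmt = "%3d ";
			if (varname) {
				shell_sprintf(varname,vprfmt,*cp);
			}
			else {
				while(count > 0) {
					printf("%08lx: ",(ulong)cp);
					if (count > 16)
						size = 16;
					else
						size = count;
					for(i=0;i<16;i++) {
						if (i >= size)
							puts("   ");
						else {
							cbuf[i] = *cp;
							printf(prfmt,cbuf[i]);
						}
						if (i == 7)
							puts(" ");
						if (!fifo)
							cp++;
					}
					if ((hex_display) && (!fifo)) {
						puts(" ");
						prascii(cbuf,size);
					}
					putchar('\n');
					count -= size;
					if (!fifo) {
						add += size;
						cp = (uchar *)add;
					}
				}
			}
		}
		else if (width == 2) {
			sp = (ushort *)add;
			if (hex_display)
				prfmt = "%04X ";
			else
				prfmt = "%5d ";
			if (varname) {
				shell_sprintf(varname,vprfmt,*sp);
			}
			else {
				while(count>0) {
					printf("%08lx: ",(ulong)sp);
					if (count > 16)
						size = 16;
					else
						size = count;
					for(i=0;i<size;i+=2) {
						printf(prfmt,*sp);
						if (!fifo)
							sp++;
					}
					putchar('\n');
					count -= size;
					if (!fifo) {
						add += size;
						sp = (ushort *)add;
					}
				}
			}
		}
		else if (width == 4) {
			lp = (ulong *)add;
			if (hex_display)
				prfmt = "%08lX ";
			else
				prfmt = "%12ld ";
			if (varname) {
				shell_sprintf(varname,vprfmt,*lp);
			}
			else {
				while(count>0) {
					printf("%08lx: ",(ulong)lp);
					if (count > 16)
						size = 16;
					else
						size = count;
					for(i=0;i<size;i+=4) {
						printf(prfmt,*lp);
						if (!fifo)
							lp++;
					}
					putchar('\n');
					count -= size;
					if (!fifo) {
						add += size;
						lp = (ulong *)add;
					}
				}
			}
		}
		if (more)
			more = askuser("more?");
	} while(more);
	return(0);
}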

/* prascii():
 * Print the incoming data stream as ascii if printable, else just
 * print a dot.
 */
void
prascii(uchar *data,int cnt)
{
	int i;

	for(i=0;i<cnt;i++) {
		if ((*data > 0x1f) && (*data < 0x80))
			printf("%c",*data);
		else
			putchar('.');
		data++;
	}
}

Data structures

Now let's take monitor-based memory display to a whole new level. Wouldn't it be nice to be able to display memory as structures and linked lists?

The problem with doing this at the monitor level is that the monitor doesn't usually have access to the information that the compiler/linker provides regarding the format of a structure. Since we are now assuming we have a file system, one might think that we could put the toolset-generated data in a file and allow the monitor to parse through it. We could, but parsing this data (the symbol table generated by the compiler) can be complicated, especially when you consider the fact that the format of this file could be very different from one compiler to the next. Even if we limit ourselves to a particular object file format (say ELF), the symbol table format may not be the same from one compiler to the next.

A simpler approach is to create a command in the monitor that can look to a structure-definition file in the file system to determine how to overlay a structure on top of a block of memory on the target. This eliminates all dependency on some external file format; hence, it works regardless of CPU type or toolset chosen.

The structure definition file is an ASCII file that contains some structure definitions almost as they would be seen within a C header file. The command in the monitor can then use this file as a reference when asked to display a particular block of memory. This, combined with the symbol-intelligent CLI, allows us to do something like this:

cast abc %InBuf

This command would look for the file structfile in the flash file system and, if found, it would overlay the structure defined as abc on top of the address that would be extracted from the symbol table for the symbol %InBuf. Add a few options to this and you can specify to the cast command what member of the structure is the “next” pointer. This turns it into a linked list display tool with almost no additional coding effort.
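The linked-list display comes almost for free because it is only a loop around the single-structure display: show the structure, fetch the “next” member, and repeat from that address. The sketch below shows just that loop, under some loud assumptions: target memory is modeled as a byte array, addresses are 32-bit offsets into it, a zero link ends the list, and list_walk() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Follow the link member of each node through an image of target memory.
 * `next_off` is the byte offset of the "next" pointer inside the structure.
 * Records up to `max` node addresses in `visited` and returns how many
 * nodes were seen. The `max` limit also stops a corrupted, circular list. */
static size_t list_walk(const unsigned char *mem, size_t memsz, uint32_t head,
                        size_t next_off, uint32_t *visited, size_t max)
{
    size_t count = 0;
    uint32_t addr = head;

    assert(mem != NULL && visited != NULL);
    while (addr != 0 && count < max) {
        uint32_t next;

        /* Refuse to follow a link that points outside the image. */
        if ((size_t)addr + next_off + sizeof(next) > memsz)
            break;
        visited[count++] = addr;   /* a real tool would display the node here */
        memcpy(&next, mem + addr + next_off, sizeof(next));
        addr = next;
    }
    return count;
}
```

On a real target the byte array and the bounds check disappear and the address is dereferenced directly, which is why the feature costs so little code.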

The structure definition used by this command is assumed to be in the structfile . In general, the format of this file is similar to that of standard C structure definitions, but with some limitations. The types “char,” “short,” and “long” are supported and will be displayed as a 1-, 2-, or 4-byte decimal integer respectively. To support the ability to display in hex, the types “char.x,” “short.x,” and “long.x” are supported, and if a character is to be displayed as a character (hex 0x31 printed as “1”), the “char.c” type can be applied. For example, in the following structure definition:

struct abc {
	long i;
	long.x j;
	char.c c;
	char.x d;
	char e;
}

The member “i” would be displayed in decimal format; “j” would be displayed in hex. The member “c” would be displayed as a character, “d” would be displayed in hex, and “e” would be displayed as a 1-byte decimal integer. If a structure has an array in it, then the user must define that as an array of one of the fundamental types I described with the appropriate size. This “cast” command does not display arrays within structures simply because of the complexity of the output generated. So it is treated like padding with only the name and array size displayed.

Here is an example of a structure definition file that demonstrates all of the functionality of the “cast” command. Note that the “#” sign signifies a comment.

struct abc {
	long i;
	char.x c1;
	pad[3];  # Not displayed
	struct def d;
}
struct def {
	short s1;
	long ltbl[32];  # Not displayed
	short s2;
}

Notice the embedded structures, use of the “.x” suffix, and the pad[] format. This implementation is totally unaware of compiler-specific padding and CPU-specific alignment requirements. If the structure definition puts a long on an odd boundary and the CPU does not support that, then cast is unaware of this limitation and is likely to cause an exception itself. The user must add the appropriate padding to deal with this. As a result, the “pad[]” descriptor is used for CPU/compiler-specific padding. If the member is of type “char.c *” or “char.c [],” cast will display the ASCII string (if you don't want it to be dereferenced, use “char.x *”).
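The role of the explicit pad[] entries can be seen in a small sketch of the overlay arithmetic: each member's offset is simply the sum of the sizes before it, so padding the compiler would insert must appear in the definition. The struct mbr table and the mbr_find()/mbr_read() helpers below are hypothetical and only handle 1-, 2-, and 4-byte members; they use memcpy() so that the sketch itself never makes a misaligned access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum mbr_kind { MBR_VALUE, MBR_PAD };

struct mbr {
    const char   *name;
    size_t        size;   /* bytes occupied in target memory */
    enum mbr_kind kind;
};

/* Store the byte offset of member `name` in *off. Returns the member's
 * index in the table, or -1 if it is not a displayable member. */
static int mbr_find(const struct mbr *def, size_t n, const char *name,
                    size_t *off)
{
    size_t i, pos = 0;

    assert(def != NULL && off != NULL);
    for (i = 0; i < n; i++) {
        if (def[i].kind == MBR_VALUE && strcmp(def[i].name, name) == 0) {
            *off = pos;
            return (int)i;
        }
        pos += def[i].size;   /* pad entries advance the offset too */
    }
    return -1;
}

/* Read member `name` from the block at `base` in host byte order.
 * Returns 0 on success, -1 if the member is missing or has an odd size. */
static int mbr_read(const void *base, const struct mbr *def, size_t n,
                    const char *name, uint32_t *val)
{
    size_t off;
    int idx = mbr_find(def, n, name, &off);
    const unsigned char *p;
    uint8_t b;
    uint16_t h;
    uint32_t w;

    assert(base != NULL && val != NULL);
    if (idx < 0)
        return -1;
    p = (const unsigned char *)base + off;
    switch (def[idx].size) {
    case 1: memcpy(&b, p, 1); *val = b; return 0;
    case 2: memcpy(&h, p, 2); *val = h; return 0;
    case 4: memcpy(&w, p, 4); *val = w; return 0;
    default: return -1;
    }
}
```

Dropping the pad[3] entry from such a table would shift every later member by three bytes, which is exactly the mismatch the user has to prevent by hand.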

The code for the cast command is shown in Listing 2.

Listing 2

/* cast.c:
 * The cast command is used in the monitor to cast or overlay a structure
 * onto a block of memory to display that memory in the format specified
 * by the structure. The structure definition is found in the file
 * "structfile" in TFS. Valid types within structfile are
 * char, char.c, char.x, short, short.x, long, long.x and struct name.
 * Default format is decimal. The '.x' extension tells cast to print
 * in hex and the '.c' extension tells cast to print the actual character.
 */
#include "config.h"
#include "tfs.h"
#include "tfsprivate.h"
#include "ctype.h"
#include "genlib.h"
#include "stddefs.h"


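static ulong memAddr;
static int castDepth;

#define OPEN_BRACE	'{'
#define CLOSE_BRACE	'}'

/* Parser states used by showStruct() */
#define STRUCT_SEARCH	1
#define STRUCT_DISPLAY	2
#define STRUCT_ALLDONE	3
#define STRUCT_ERROR	4

#define STRUCT_SHOWPAD	(1<<0)
#define STRUCT_SHOWADD	(1<<1)
#define STRUCT_VERBOSE	(1<<2)

#define STRUCTFILE "structfile"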

struct mbrinfo {
	char *type;
	char *format;
	int size;
	int mode;
};

struct mbrinfo mbrinfotbl[] = {
	{ "char",	"%d",		1 },	/* decimal */
	{ "char.x",	"0x%02x",	1 },	/* hex */
	{ "char.c",	"%c",		1 },	/* character */
	{ "short",	"%d",		2 },	/* decimal */
	{ "short.x",	"0x%04x",	2 },	/* hex */
	{ "long",	"%ld",		4 },	/* decimal */
	{ "long.x",	"0x%08lx",	4 },	/* hex */
	{ 0,0,0 }
};

/* castIndent():
 * Used to insert initial whitespace based on the depth of the
 * structure nesting.
 */
void
castIndent(void)
{
	int i;

	for(i=0;i<castDepth;i++)
		printf("  ");
}


/* strAddr():
 * Called by showStruct(). It will populate the incoming buffer pointer
 * with either NULL or the ascii-hex representation of the current address
 * pointer.
 */
char *
strAddr(long flags, char *buf)
{
	if (flags & STRUCT_SHOWADD)
		sprintf(buf,"0x%08lx: ",memAddr);
	else
		buf[0] = 0;
	return(buf);
}

/* showStruct():
 * The workhorse of cast. This function parses the structfile looking for
 * the structure type; then it attempts to display the memory block that
 * begins at memAddr as if it was the structure. Note that there is no
 * pre-processing done to verify valid syntax of the structure definition.
 */
int
showStruct(int tfd,long flags,char *structtype,char *structname,char *linkname)
{
	struct mbrinfo *mptr;
	ulong	curpos, nextlink;
	int	i, state, snl, retval, tblsize;
	char	line[96], addrstr[16], format[64];
	char	*cp, *eol, *type, *eotype, *name, *bracket, *eoname, tmp;

	type = (char *)0;
	retval = nextlink = 0;
	curpos = tfsctrl(TFS_TELL,tfd,0);
	tfsseek(tfd,0,TFS_BEGIN);
	castIndent();
	if (structname)
		printf("struct %s %s:\n",structtype,structname);
	else
		printf("struct %s @0x%lx:\n",structtype,memAddr);
	castDepth++;

	state = STRUCT_SEARCH;
	snl = strlen(structtype);

	while(1) {
		if (tfsgetline(tfd,line,sizeof(line)-1) == 0) {
			printf("Structure definition '%s' not found\n",structtype);
			break;
		}
		if ((line[0] == '\r') || (line[0] == '\n'))	/* empty line? */
			continue;

		eol = strpbrk(line,";#\r\n");
		if (eol)
			*eol = 0;

		if (state == STRUCT_SEARCH) {
			if (!strncmp(line,"struct",6)) {
				cp = line+6;
				while(isspace(*cp))
					cp++;
				if (!strncmp(cp,structtype,snl)) {
					cp += snl;
					while(isspace(*cp))
						cp++;
					if (*cp == OPEN_BRACE)
						state = STRUCT_DISPLAY;
					else {
						retval = -1;
						break;
					}
				}
			}
		}
		else if (state == STRUCT_DISPLAY) {
			type = line;
			while(isspace(*type))
				type++;

			if (*type == CLOSE_BRACE) {
				state = STRUCT_ALLDONE;
				break;
			}

			eotype = type;
			while(!isspace(*eotype))
				eotype++;
			*eotype = 0;
			name = eotype+1;
			while(isspace(*name))
				name++;
			bracket = strchr(name,'[');
			if (bracket)
				tblsize = atoi(bracket+1);
			else
				tblsize = 1;

			if (*name == '*') {
				castIndent();
				printf("%s%-8s %s: ",strAddr(flags,addrstr),type,name);
				if (!strcmp(type,"char.c"))
					printf("\"%s\"\n",*(char **)memAddr);
				else
					printf("0x%lx\n",*(ulong *)memAddr);
				memAddr += 4;
				continue;
			}
			mptr = mbrinfotbl;
			while(mptr->type) {
				if (!strcmp(type,mptr->type)) {
					castIndent();
					eoname = name;
					while(!isspace(*eoname))
						eoname++;
					tmp = *eoname;
					*eoname = 0;

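					if (bracket) {
						if (!strcmp(type,"char.c")) {
							printf("%s%-8s %s: ",
								strAddr(flags,addrstr),mptr->type,name);
							cp = (char *)memAddr;
							for(i=0;i<tblsize && isprint(*cp);i++)
								putchar(*cp++);
							putchar('\n');
						}
						else
							printf("%s%-8s %s\n",
								strAddr(flags,addrstr),mptr->type,name);
						memAddr += mptr->size * tblsize;
					}
					else {
						sprintf(format,"%s%-8s %%s: %s\n",
							strAddr(flags,addrstr),mptr->type,mptr->format);
						switch(mptr->size) {
						case 1:
							printf(format,name,*(uchar *)memAddr);
							break;
						case 2:
							printf(format,name,*(ushort *)memAddr);
							break;
						case 4:
							printf(format,name,*(ulong *)memAddr);
							break;
						}
						memAddr += mptr->size;
					}
					*eoname = tmp;
					break;
				}
				mptr++;
			}
			if (!(mptr->type)) {
				int padsize;
				char *subtype, *subname, *eossn;

				if (!strcmp(type,"struct")) {
					subtype = eotype+1;
					while(isspace(*subtype))
						subtype++;
					subname = subtype;
					while(!isspace(*subname))
						subname++;
					*subname = 0;
					subname++;
					while(isspace(*subname))
						subname++;
					eossn = subname;
					while(!isspace(*eossn))
						eossn++;
					*eossn = 0;
					if (*subname == '*') {
						castIndent();
						printf("%s%s %s %s: 0x%08lx\n",strAddr(flags,addrstr),
							type,subtype,subname,*(ulong *)memAddr);
						if (linkname) {
							if (!strcmp(linkname,subname+1))
								nextlink = *(ulong *)memAddr;
						}
						memAddr += 4;
					}
					else {
						/* Embedded structure (or table of them): recurse. */
						for (i=0;i<tblsize;i++) {
							if (showStruct(tfd,flags,subtype,subname,0) < 0) {
								retval = -1;
								break;
							}
						}
						if (retval < 0) {
							state = STRUCT_ALLDONE;
							break;
						}
					}
				}
				else if (!strncmp(type,"pad[",4)) {
					padsize = atoi(type+4);
					if (flags & STRUCT_SHOWPAD) {
						castIndent();
						printf("%spad[%d]\n",strAddr(flags,addrstr),padsize);
					}
					memAddr += padsize;
				}
				else
					break;		/* invalid member type */
			}
		}
		else {
			state = STRUCT_ERROR;
			break;
		}
	}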

	switch(state) {
	case STRUCT_SEARCH:
		printf("struct %s not found\n",structtype);
		retval = -1;
		break;
	case STRUCT_DISPLAY:
		printf("invalid member type: %s\n",type);
		retval = -1;
		break;
	case STRUCT_ERROR:
		printf("unknown error\n");
		retval = -1;
		break;
	}
	tfsseek(tfd,curpos,TFS_BEGIN);
	if (linkname)
		memAddr = nextlink;
	castDepth--;
	return(retval);
}

char *CastHelp[] = {
	"Cast a structure definition across data in memory.",
	"-[apv] {struct type} {address}",
	"Options:",
	" -a show addresses",
	" -l{linkname}",
	" -n{structname}",
	" -p show padding",
	" -t{tablename}",
	0,
};

int
Cast(int argc,char *argv[])
{
	long	flags;
	int	opt, tfd, index;
	char	*structtype, *structfile, *tablename, *linkname, *name;

	flags = 0;
	name = (char *)0;
	linkname = (char *)0;
	tablename = (char *)0;
	while((opt=getopt(argc,argv,"apl:n:t:")) != -1) {
		switch(opt) {
		case 'a':
			flags |= STRUCT_SHOWADD;
			break;
		case 'l':
			linkname = optarg;
			break;
		case 'n':
			name = optarg;
			break;
		case 'p':
			flags |= STRUCT_SHOWPAD;
			break;
		case 't':
			tablename = optarg;
			break;
		default:
			return(0);
		}
	}
	if (argc != optind + 2)
		return(-1);

	structtype = argv[optind];
	memAddr = strtoul(argv[optind+1],0,0);

	/* Start by detecting the presence of a structure definition file... */
	structfile = getenv("STRUCTFILE");
	if (!structfile)
		structfile = STRUCTFILE;

	tfd = tfsopen(structfile,TFS_RDONLY,0);
	if (tfd < 0) {
		printf("Structure definition file '%s' not found\n",structfile);
		return(0);
	}

	index = 0;
	do {
		castDepth = 0;
		showStruct(tfd,flags,structtype,name,linkname);
		index++;
		if (linkname)
			printf("Link #%d = 0x%lx\n",index,memAddr);
		if (tablename || linkname) {
			if (askuser("next?")) {
				if (tablename)
					printf("%s[%d]:\n",tablename,index);
			}
			else
				tablename = linkname = (char *)0;
		}
	} while(tablename || linkname);

	tfsclose(tfd,0);
	return(0);
}

Stack trace

Aside from variable display, a stack trace is probably the most useful tool in the firmware developer's bag of tricks. It turns an exception that looks something like this:

addr align err at 0xf001400a

into something that looks like this:

EXCEPTION:
addr align err at 0xf001400a

0xf001400a: error_func()
0xf0008040: funcXX()
0xf0011288: funcYY()
0xf0011148: task_ABCD()

A stack trace allows the developer to determine what function nesting got the code to that point. This can save lots of debug/analysis time.

Stack trace capability is usually considered something that is only offered by a full-blown debug environment. This doesn't have to be the case. Having implemented this function for a few different CPUs, I can tell you that this is a pain in the butt to get working. But once it works, you just can't live without it.

In our monitor-based stack trace we will use the contents of the symbol table file we've been using all along, but now we are going to make the assumption that all the symbols are listed in the file in ascending order. We will also limit our monitor-based stack trace to provide function nesting only.
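To see why the ascending-order assumption matters, here is a minimal sketch of an address-to-symbol lookup over a sorted table held in memory. The `struct sym` type and the `sym_lookup()` name are hypothetical and are not part of the monitor; the sketch also uses a binary search, which is a substitution for illustration, whereas the monitor's own translator scans the symbol file line by line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory copy of a symbol file, sorted by ascending address. */
struct sym {
    const char   *name;
    unsigned long addr;
};

/*
 * Return the last entry whose address is <= addr, or NULL if addr lies
 * below the first symbol. On success, *offset receives the distance
 * between addr and the start of that symbol.
 */
static const struct sym *
sym_lookup(const struct sym *tbl, size_t count, unsigned long addr,
           unsigned long *offset)
{
    size_t lo = 0, hi = count;      /* search window is [lo, hi) */

    assert(tbl != NULL && offset != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tbl[mid].addr <= addr)
            lo = mid + 1;           /* candidate; look for a later one */
        else
            hi = mid;
    }
    if (lo == 0)
        return NULL;                /* addr precedes every symbol */
    *offset = addr - tbl[lo - 1].addr;
    return &tbl[lo - 1];
}
```

Because the entries are sorted, the symbol that owns an address is simply the last one that starts at or below it. If the file were unsorted, every entry would have to be examined and the closest lower address remembered.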

The ability to look at variables on the stack is a bit more complicated. Think about it for a minute. At any point in execution time, the CPU must be able to retrace its steps because whenever any function returns, the function that called it continues. All of this return information is in the stack frame somewhere; we just need to find it. On the other hand, the ability to display variables within a particular function's stack scope is not a natural thing for the CPU. Hence, this does require more information from the compiler regarding how and where the variables are stored in the frame.
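The "retrace its steps" idea can be sketched as a walk along a chain of frames. This is only an illustration under an assumed layout, modelled on what Listing 3 does for the PowerPC: word 0 of each frame holds the address of the caller's frame and word 1 holds a saved return address. The function name `walk_frames()` is hypothetical, and the real layout depends on your CPU and compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed frame layout (check your ABI before relying on it):
 *   frame[0] = back chain: address of the caller's frame, 0 at the outermost
 *   frame[1] = return address saved in that frame
 *
 * Starting from the current frame, follow the back chain and collect up
 * to max_depth return addresses into out[]. Returns the number stored.
 */
static size_t
walk_frames(const uintptr_t *frame, uintptr_t *out, size_t max_depth)
{
    size_t n = 0;

    assert(out != NULL);
    while (frame != NULL && n < max_depth) {
        frame = (const uintptr_t *)frame[0];    /* step to the caller's frame */
        if (frame == NULL || frame[0] == 0 || frame[1] == 0)
            break;                              /* end of chain */
        out[n++] = frame[1];
    }
    return n;
}
```

Each collected address can then be passed through an address-to-symbol translation to print a function name. The depth limit matters because a corrupted stack can turn the chain into a loop.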

The majority of the code for a stack trace implementation is compiler and CPU specific. Some of the parsing of the symbol file is generic and can be reused on multiple implementations. As I mentioned, I found this to be a challenge to get working, but it has been well worth it. Listing 3 contains the code for a PowerPC stack trace and the address-to-symbol translator.

Listing 3

/* symtbl.c:
 *  This file contains functions related to the symbol table option in
 *  the monitor.
 */

/* SymFileFd():
 *  Attempt to open the symbol table file. First look to the SYMFILE env var;
 *  else default to SYMFILE definition. If the file exists, open it and return
 *  the file descriptor; else return TFSERR_NOFILE.
 */
static int
SymFileFd(int verbose)
{
    TFILE *tfp;
    int tfd;
    char *symfile;

    /* Load symbol table file name. If SYMFILE is not a variable, default
     * to the string defined by SYMFILE.
     */
    symfile = getenv("SYMFILE");
    if (!symfile)
        symfile = SYMFILE;

    tfp = tfsstat(symfile);
    if (!tfp)
        return(TFSERR_NOFILE);

    tfd = tfsopen(symfile,TFS_RDONLY,0);
    if (tfd < 0) {
        if (verbose)
            printf("%s: %s\n",symfile,(char *)tfsctrl(TFS_ERRMSG,tfd,0));
        return(TFSERR_NOFILE);
    }
    return(tfd);
}

/* AddrToSym():
 *  Assumes each line of symfile is formatted as...
 *      symname SP hex_address
 *  and that the symbols are sorted from lowest to highest address.
 *  Using the file specified by the incoming TFS file descriptor,
 *  determine what symbol's address range covers the incoming address.
 *  If found, store the name of the symbol as well as the offset between
 *  the address of the symbol and the incoming address.
 *
 *  Return 1 if a match is found, else 0.
 */
int
AddrToSym(int tfd,ulong addr,char *name,ulong *offset)
{
    int lno;
    char *space;
    ulong thisaddr, lastaddr;
    char thisline[84];
    char lastline[sizeof(thisline)];

    lno = 1;
    if (offset)                     /* offset is optional; guard it here */
        *offset = 0;                /* just as it is guarded below. */
    lastaddr = 0;
    if (tfd == -1) {
        tfd = SymFileFd(0);
        if (tfd == TFSERR_NOFILE)
            return(0);
    }
    tfsseek(tfd,0,TFS_BEGIN);
    while(tfsgetline(tfd,thisline,sizeof(thisline)-1)) {
        space = strpbrk(thisline,"\t ");
        if (!space)
            continue;
        *space++ = 0;
        while(isspace(*space))
            space++;
        thisaddr = strtoul(space,0,0);  /* Compute address from entry in symfile. */

        if (thisaddr == addr) {             /* Exact match, use this entry */
            strcpy(name,thisline);          /* in symfile. */
            return(1);
        }
        else if (thisaddr > addr) {         /* Address in symfile is greater */
            if (lno == 1)                   /* than incoming address... */
                break;                      /* If this is first line of symfile */
            strcpy(name,lastline);          /* then return error. */
            if (offset) *offset = addr-lastaddr;/* Otherwise return the symfile */
            return(1);                      /* entry previous to this one. */
        }
        else {                              /* Address in symfile is less than */
            lastaddr = thisaddr;            /* incoming address, so just keep */
            strcpy(lastline,thisline);      /* a copy of this line and go to */
            lno++;                          /* the next. */
        }
    }
    sprintf(name,"0x%lx",addr);
    return(0);
}

/* Stack trace for PPC403:
   Note that this stack trace can be used to trace REAL exceptions or
   user-induced exceptions. If one is user-induced, then the user is
   typically just replacing an instruction at the point at which the break
   is to occur with a SC (syscall) instruction. The point at which this
   insertion is made must be after the function sets up its stack frame
   otherwise it is likely that the trace will be bogus.

   BTW... the SC instruction is 0x44000002.
*/
extern void showregs();
extern int getreg();

extern ulong ExceptionAddr;
extern int AddrToSym(int,ulong,char *,ulong *);

char *StraceHelp[] = {
    "Stack trace",
    "-[d:F:P:rs:]",
    " -d # max depth count (def=20)",
    " -F # specify frame-pointer (don't use content of R1)",
    " -P # specify PC (don't use content of SRR#)",
    " -r dump regs",
    " -s # srr reg num (def=0)",
    0,
};

int
Strace(int argc,char *argv[])
{
    char *symfile, fname[64];
    TFILE *tfp;
    ulong *framepointer, pc, fp, offset;
    int tfd, opt, srrnum, maxdepth;

    tfd = fp = 0;
    srrnum = -1;
    maxdepth = 20;
    pc = ExceptionAddr;
    while ((opt=getopt(argc,argv,"d:F:P:rs:")) != -1) {
        switch(opt) {
        case 'd':
            maxdepth = atoi(optarg);
            break;
        case 'F':
            fp = strtoul(optarg,0,0);
            break;
        case 'P':
            pc = strtoul(optarg,0,0);
            break;
        case 'r':
            showregs();
            break;
        case 's':
            srrnum = atoi(optarg);
            break;
        default:
            return(0);
        }
    }
    if (!fp)
        getreg("R1", &framepointer);
    else
        framepointer = (ulong *)fp;

    if (srrnum != -1) {
        if (srrnum == 0)
            getreg("SRR0", &pc);
        else if (srrnum == 2)
            getreg("SRR2", &pc);
        else {
            printf("Invalid -s value\n");
            return(0);
        }
    }

    /* Start by detecting the presence of a symbol table file... */
    symfile = getenv("SYMFILE");
    if (!symfile)
        symfile = SYMFILE;

    tfp = tfsstat(symfile);
    if (tfp) {
        tfd = tfsopen(symfile,TFS_RDONLY,0);
        if (tfd < 0)
            tfp = (TFILE *)0;
    }

    /* Show current position: */
    printf(" 0x%08lx",pc);
    if (tfp) {
        AddrToSym(tfd,pc,fname,&offset);
        printf(": %s()",fname);
        if (offset)
            printf(" + 0x%lx",offset);
    }
    putchar('\n');

    /* Now step through the stack frame... */
    while(maxdepth) {
        framepointer = (ulong *)*framepointer;

        if ((!framepointer) || (!*framepointer) || (!*(framepointer+1)))
            break;

        printf(" 0x%08lx",*(framepointer+1));
        if (tfp) {
            int match;

            match = AddrToSym(tfd,*(framepointer+1),fname,&offset);
            printf(": %s()",fname);
            if (offset)
                printf(" + 0x%lx",offset);
            if (!match) {
                putchar('\n');
                break;
            }
        }
        putchar('\n');
        maxdepth--;
    }

    if (!maxdepth)
        printf("Max depth termination\n");
    if (tfp) {
        tfsclose(tfd,0);
    }
    return(0);
}
#endif

As one final note here, consider the case where your product is in the field and is occasionally “just resetting” (as described by the customer). This is usually some bad code causing an exception that simply resets the target. These kinds of problems can be very hard to reproduce because they only seem to happen on the customer site every third blue moon.

So how do you “catch” the bug in the act? You probably can't leave an emulator at a customer site, and it is very unlikely that the customer is going to allow you to debug on their site. With this stack trace capability and some of the other capabilities in the monitor, an environment can be set up so that any exception that occurs will automatically cause the monitor to dump the output of a stack trace to a file in the file system, then restart the application. Then you can occasionally query the system to see if this file is present, and if it is, transfer it to your system for analysis.
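As a rough sketch of the "dump the trace, then restart" idea, the function below formats an exception record into a buffer that a handler could then write to a file before restarting the application. The name `format_trace()` and the record layout are hypothetical, and the file write and restart steps are left out because they depend on the monitor's own file system and exception hooks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format an exception record into buf: one header line, then one line per
 * return address. Returns the number of characters stored (excluding the
 * terminating NUL). A line that does not fit is dropped whole, so the
 * record always ends on a line boundary.
 */
static size_t
format_trace(char *buf, size_t size, const char *reason, unsigned long pc,
             const unsigned long *ret, size_t nret)
{
    size_t used, i;
    int len;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    len = snprintf(buf, size, "EXCEPTION: %s at 0x%08lx\n", reason, pc);
    if (len < 0 || (size_t)len >= size) {
        buf[0] = '\0';              /* header itself did not fit */
        return 0;
    }
    used = (size_t)len;
    for (i = 0; i < nret; i++) {
        len = snprintf(buf + used, size - used, "  0x%08lx\n", ret[i]);
        if (len < 0 || (size_t)len >= size - used) {
            buf[used] = '\0';       /* discard the partial line */
            break;
        }
        used += (size_t)len;
    }
    return used;
}
```

Formatting into a fixed buffer first keeps the exception path simple: the handler makes one write call with a known length, and a full buffer truncates the record instead of overrunning memory.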


Monitor-based debugging doesn't provide all of the capability that comes with some of the debugging tools available on the market today. It is, however, a tool that can reside in ROM alongside the application. This means that it is still in the system when deployed in the field. There is no need for a special connection or even an additional serial port. This convenience can prove itself invaluable under a variety of circumstances.

Most of what I've described in this article is available in a boot monitor package called “MicroMonitor.” The entire monitor package can be downloaded from Lucent's Research Software Distribution web site. A lot more information on this topic, as well as on the entire topic of booting an embedded system, is covered in my book Embedded Systems Firmware Demystified (CMP Books).

Ed Sutter graduated from the engineering program at DeVry Technical Institute, and received a bachelor's degree from New Jersey Institute of Technology. He is currently a distinguished member of the technical staff at Lucent. He has been writing code for embedded systems at Lucent/AT&T Bell Labs for about 15 years, for architectures ranging from the 8051 to MIPS R4000. His area of expertise includes embedded system bootup, device drivers, and RTOS BSP development, as well as Unix/Win32-based embedded system development/support tools.


1. On the 68000 this would be a trap instruction. For the x86, use INT3. Each CPU has some instruction to implement the same functionality.

2. Yes, flash can be written, but not in a way that would allow our implementation of a breakpoint to be practical.

3. The leading percent sign tells the CLI that the string is a symbol and should be replaced with the hex address that is in the symtbl file. If no such file is present, then no replacement is made.
