Everyone who has used a debugger knows what breakpoints are, but when combined with other debugger features breakpoints become much more powerful tools. In this article, I will talk about how to combine these features and give examples of how such combinations can be used to solve some real-world problems. In essence, this is a list of tips and tricks – some using familiar features in innovative ways, and some using lesser-known features of which you may not already be aware.
To start with, a large proportion of the embedded industry still uses what boils down to printf debugging. For small or simple programs, this method can work well. However, when a system grows in size and complexity, this can become a serious road block to fixing bugs quickly.
For instance, I visited one company that had a program that took over an hour to re-link. They were still trying to debug with printf, but due to the rebuilds required, they were able to track down just a couple of problems per day at best. More complex problems were taking days or even weeks to track down.
A slow build might not be the only reason that printf debugging could kill your productivity: You may do initial development on a desktop machine which has fast output, but when you try to port to your embedded system it may only have a slow serial port for output (or perhaps no output mechanism at all other than turning a light on and off). If it is slow enough, dumping even a little printf output may make the system non-functional.
Tip #1. printf Breakpoints: In spite of these issues, there are good reasons that printf is so commonly used: It's simple, it works most of the time, and most importantly, developers are used to it. So the first use of breakpoints will be to replicate the familiar approach of printf, but from within the debugger. This gets you most of the advantages of printf with much more flexibility. To emulate printf debugging:
1) Make a breakpoint where you want to print some information about your program
2) Add commands to that breakpoint which will print that information
3) As the last command in the breakpoint, resume execution of the program
The result will be a printf-style log of the output of your program, only now printed out from within the debugger. This approach has several advantages over traditional printf debugging:
* The debugging output is available from within your debug session, and does not depend on some other target output mechanism which may be slow, unreliable, or not available.
* If the program goes into a bad state, then printf output can be lost or corrupted. With a debugger, you know that the output you see is from the actual state of the target, and there is no chance that the target has corrupted or dropped it.
* It is possible to add and remove logging without needing to recompile your program.
* It is possible to print information about any part of the program from within the breakpoint. With printf debugging, you need to have programmatic access to the variable you are interested in, which may require creating a new interface just to get this data.
The problem with this emulated printf approach is that in some environments your program can end up running significantly more slowly than it would using normal printf. This is because hitting and resuming from a breakpoint can be an expensive operation.
Whether this is true of your system or not depends on the speed of your debug connection, the speed of your debugger and its ability to debug your program, and the speed of your printf output mechanism. However, if hitting and resuming from breakpoints is slower than printf there are still some things that you can do to make use of this approach:
1) Check to make sure that it actually matters. Many bugs will still reproduce; they may just need a few more seconds of run time. If the slower run time isn't actually a problem, you don't need to do anything about it.
2) Don't turn the printf-breakpoints on until your program has reached the point you want to start inspecting. There's a good chance point 1 will apply for a smaller chunk of your program.
3) Depending on what you are looking for, you can make a small change in your program to introduce a code path that is only executed when the case you are interested in happens. Set your printf breakpoint there, so that the breakpoint is only hit in the rare case rather than on every pass through the code.
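As a rough sketch of the third approach, the fragment below adds a small hook function that is only reached in the interesting case. Everything in it is hypothetical and chosen purely for illustration: the Event structure, the handle_event and debug_interesting_event names, and the "payload longer than 64 bytes" condition are assumptions, not code from a real system.

```c
/* Hypothetical event record; the field names are illustrative only. */
struct Event {
    int id;
    int payload_len;
};

/* Counts how often the hook ran. It exists only so this sketch can be
 * checked; a real hook would not need it. */
static int interesting_hits = 0;

/* Hook that exists to give the debugger a place to stop.
 * The printf breakpoint would be set on this function. */
static void debug_interesting_event(const struct Event *ev)
{
    (void)ev;            /* the debugger can still inspect ev here */
    interesting_hits++;
}

/* The cheap check runs in the program itself, so only the rare case
 * pays the cost of hitting and resuming from a breakpoint. */
static void handle_event(const struct Event *ev)
{
    if (ev->payload_len > 64)        /* assumed "interesting" condition */
        debug_interesting_event(ev);
    /* ... normal processing would continue here ... */
}
```

With a breakpoint on debug_interesting_event, ordinary events run at full speed and only the unusual ones stop in the debugger, print, and resume.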
You can also combine all three of these approaches. For instance, if you have a top level event loop you could set a breakpoint there that determines if some interesting condition is true. If it is, then that breakpoint would then enable all of the printf breakpoints that you have set but left disabled throughout the rest of the system. If the event loop is slowed down enough by this breakpoint, then encode the condition into the event loop as in the third approach that I listed above.
Tip #2. Call Stack Breakpoints: Moving beyond printf debugging, there are a number of other techniques that are possible when combining different debugger features. Related to printf breakpoints are “call stack breakpoints”. In this case, instead of just printing out a particular variable when a breakpoint is hit, we ask the debugger to dump out a full call stack. Afterwards, you can analyze the call stack log to see whether anything unexpected is going on.
I used this technique recently when I suspected that a particular function was being called in an unexpected way, but I didn't know how or why. To track the problem down, I set a breakpoint on the function in question that would dump out the call stack and resume. When I looked through the output, one call stack stood out because it was significantly deeper than the others. Taking a closer look, I realized that it was being called recursively, though I still didn't understand quite why.
Tip #3. Breakpoint Recursion Detection: At this point, I could have added code to the function to detect when it was called recursively, in order to create a place to set a breakpoint. However, it takes several minutes to rebuild and link the application that I was debugging, so instead, I used the debugger to perform the recursion detection. This technique required setting two breakpoints and using a debugger variable I created called IS_RECURSED, which I initialized to 0:
1) I put the first breakpoint on the start of the function. This checked to see if IS_RECURSED was 0. If it was 0, then it set IS_RECURSED to 1 and resumed. If it was 1, it would halt the program so that it could be debugged.
2) I put the second breakpoint on the exit of the function. This set IS_RECURSED back to 0 and resumed the program.
In this case, there was only one way for the function to return. If the function could have returned from more than one place, I would instead have found the specific call that was resulting in recursion, put the first breakpoint on that call, and put the second breakpoint on the line after that call.
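The logic of the two breakpoints can be modelled in plain C to make it easier to follow. This is only a sketch of the decision each breakpoint makes, not debugger script: IS_RECURSED comes from the description above, while entry_breakpoint, exit_breakpoint, process, and the halts counter are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Stands in for the debugger variable IS_RECURSED, initialized to 0. */
static int IS_RECURSED = 0;

/* Counts how many times the simulated debugger would have halted. */
static int halts = 0;

/* Logic of the first breakpoint (function entry).
 * Returns true when the debugger should halt, false when it should resume. */
static bool entry_breakpoint(void)
{
    if (IS_RECURSED == 0) {
        IS_RECURSED = 1;   /* outermost call: arm the detector */
        return false;      /* resume */
    }
    return true;           /* already inside the function: halt */
}

/* Logic of the second breakpoint (function exit): disarm and resume. */
static void exit_breakpoint(void)
{
    IS_RECURSED = 0;
}

/* Hypothetical function under suspicion; it recurses when depth > 0. */
static void process(int depth)
{
    if (entry_breakpoint())
        halts++;           /* a real debugger would stop here */
    if (depth > 0)
        process(depth - 1);
    exit_breakpoint();
}
```

A non-recursive call such as process(0) never halts, while process(1) halts once on the nested entry, which is exactly the point at which the call stack is worth inspecting.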
Using Breakpoints to Solve Hard-to-Reproduce Problems
The techniques that I have talked about up to this point generally work well when you are able to reproduce a problem fairly easily. Often the most time consuming problems, however, are those that are difficult to reproduce. In the worst cases, you may think you have fixed it, but since you could not get it to reproduce in the first place, you may have simply fixed (or introduced) a different problem.
Tip #4. Forcing Program Flow with Breakpoints: Sometimes you might have a suspicion about what is going wrong, but you just can't get the program to take the dubious code path. When you find yourself in this situation, you can often use the debugger to force the code path to be taken.
If your program only goes through that code path once, you can do it manually. However, if the program needs to go through that code path many times in order to get to the state that you want to debug, you can use breakpoints to automate the task of redirecting the program flow.
The first way is to use a scripted breakpoint to set a variable to the suspect value and then resume. In some situations, however, you may find it easier to set a breakpoint to change the program counter to some other line and resume. Using this technique, you can quickly try out different conditions that you are interested in without needing a full edit-compile-test cycle.
Tip #5. Controlling Multi-Thread Systems with Breakpoints: It is even more likely that problems will be difficult to reproduce when dealing with multi-thread or multi-process systems. For these problems you can control the timing of program flow with scripted breakpoints to force things to happen in a specific order.
This is very useful when tracking down communication problems between multiple threads (in the same address space or not) that do not reproduce in normal situations. Once you have done this, you can then reproduce the problem every time. Something that was once very difficult to track down now becomes trivial.
Tip #6. Automatically Set Breakpoints on All Failure Cases: Programs frequently have internal checks built into them to help detect problems. When one of these internal checks is triggered, it may cause the program to dump some diagnostic information and then exit.
If one of these conditions trips while you are using the debugger, it can be frustrating because the program has just exited, thus preventing you from debugging the problem any further. Most debuggers have some way to automatically load breakpoints into a program.
I suggest using this capability to set breakpoints on all of the assert, warning, error, panic and other such functions in your program. This way you will have access to any information about your program when something unusual happens, not just what happens to be dumped out whenever a problem arises.
Tip #7. Using Breakpoints to Keep you Honest: On a related note, I believe that it is important to always run a program under the debugger, even very early on in the development cycle. This is because you never know when a problem will appear during your regular development. Studies have shown over and over again that the sooner a problem is found and fixed, the less expensive that fix is.
When a problem is encountered while the program is not being debugged, it is all too easy to do a cursory inspection of the problem, see that the debug output doesn't show enough to diagnose the problem, and go back to development.
Even if the developer does bring the program up under a debugger and attempts to reproduce the problem, it may not reproduce on the second try. If the problem is sporadic, the developer may have just lost their only opportunity to track it down. That problem may not show up again until after the product has shipped, causing costly recalls.
If the developer who witnessed the problem suspects that it is in an area of code owned by a different developer, having it up under the debugger is very useful for another practical reason as well. I don't have any studies to prove this, but I've seen it over and over again in many different situations: it is much easier to get a developer to actually look at a problem when it is halted in a debugger and easy to look at.
It is simply too easy to look at a diagnostic dump and point fingers at someone else or shrug it off as something that can't be reproduced. If it is up under the debugger, theories can be formed about who might be responsible. That person can then be brought in to look at the problem, and they can use their knowledge of the code and information that the debugger provides to reason about what is going wrong.
Tip #8. Putting it All Together: Using Breakpoints to Find Bugs: Finally, I'm going to describe a technique that I used to track down a particularly difficult threading issue. This is a good example of combining several different features of a debugger to help find a problem.
In this case, I was working on a threaded application which was occasionally misbehaving because an event was being missed. After much debugging, I got to the point where I suspected the issue was a call that was not supposed to modify a data structure. The code looked like this:
I suspected that, very rarely, while the second line was executing, some new data would be delivered and put into the entity data structure. I was not familiar with the ReceiveUpdate function, but the developer who owned that function assured me that when the lock was taken, no other thread was allowed to modify the entity structure. I suspected that this condition was being violated, so I decided to try to use the debugger to get to the heart of the problem.
To do this, I set a software breakpoint on line 2. When hit, that breakpoint would set a hardware breakpoint on any write to entity->UpdateBits and then resume the target. Then I set another software breakpoint on line 3 that would remove the previously set hardware breakpoint and resume.
The effect of these two software breakpoints was to detect whenever the entity->UpdateBits variable was modified, but only during the ReceiveUpdate method, and only when called from this particular function. The problem reproduced infrequently, so after installing these breakpoints, I went back to my normal debugging. A few minutes later, the target stopped on a hardware breakpoint, and indeed it was because in some cases the ReceiveUpdate function was releasing the lock and allowing another thread to run.
This technique of using breakpoints to set other breakpoints is very powerful. It takes some getting used to, but it's amazing what you can do with a bit of creativity. Take hardware breakpoints: they only work on addresses, so normally they are only considered useful for watching changes to global variables. However, if you set and clear the hardware breakpoint with software breakpoints in a specific function, you are now able to set hardware breakpoints on stack variables.
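The arm-and-disarm pattern from this tip can be imitated in a small, single-threaded C model. This is a sketch under stated assumptions, not the code from the application described above: only the names ReceiveUpdate and entity->UpdateBits come from the text, while the Entity layout, the locked flag, the drops_lock parameter, the simulated second thread, and the placement of the lock calls are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the entity structure. */
struct Entity {
    unsigned UpdateBits;
    bool     locked;
};

/* Simulated hardware write breakpoint on entity->UpdateBits. */
static bool watch_armed = false;
static int  watch_hits  = 0;

/* All writes to UpdateBits go through here so the simulated
 * watchpoint can observe them, as real hardware would. */
static void write_update_bits(struct Entity *e, unsigned bits)
{
    if (watch_armed)
        watch_hits++;              /* a real debugger would halt here */
    e->UpdateBits = bits;
}

/* Simulated second thread: delivers new data only if the lock is free. */
static void other_thread_deliver(struct Entity *e)
{
    if (!e->locked)
        write_update_bits(e, e->UpdateBits | 0x1u);
}

/* Hypothetical ReceiveUpdate. When drops_lock is true it briefly
 * releases the lock, which lets the other thread modify the entity. */
static void ReceiveUpdate(struct Entity *e, bool drops_lock)
{
    if (drops_lock)
        e->locked = false;
    other_thread_deliver(e);       /* the other thread gets a chance to run */
    e->locked = true;
}

/* Assumed shape of the calling code. */
static void handle_entity(struct Entity *e, bool drops_lock)
{
    e->locked = true;              /* take the lock */
    watch_armed = true;            /* first software breakpoint: arm */
    ReceiveUpdate(e, drops_lock);  /* should not modify *e */
    watch_armed = false;           /* second software breakpoint: disarm */
    e->locked = false;             /* release the lock */
}
```

Because the watchpoint is armed only between the two software breakpoints, writes made elsewhere in the program are ignored, and a hit points directly at a modification that happened during the call.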
What you Need in a Debugger
Unfortunately, not every debugger supports doing all of the tasks outlined above. The following are some of the requirements for a debugger that are needed to take full advantage of the techniques I discussed:
* The debugger needs to be able to quickly set and manage lots of breakpoints. This applies both to the debugger's internal ability to handle large numbers of breakpoints and to the user interface that it provides for managing them.
* The debugger has to be able to run commands when breakpoints are hit, and resume from breakpoints after running those commands. These operations need to be as fast as possible in order to keep the program running.
* The debugger has to be able to set and remove breakpoints (especially hardware breakpoints) when executing commands when a breakpoint is hit.
* The debugger has to be able to run your program with minimal to no intrusiveness when breakpoints are not being hit.
* The debugger has to support setting hardware write breakpoints (at least if your target hardware supports them).
* The debugger needs to provide a simple way to move the program counter from one line within a function to another.
* The debugger needs to be able to execute command line procedure calls from within a breakpoint.
* The debugger needs to be able to debug multiple threads at the same time and be able to set and remove breakpoints on any thread from any other thread.
Some fairly simple techniques using breakpoints and other features that most debuggers provide give you enormous power and visibility into your program. It is important to realize that in many ways, the debugger has more power and control over your program than you did when you were writing the program in the first place. You are limited only by your own creativity in the ways that you combine and use the abilities that the debugger provides to you.
Nathan Field is an Engineering Manager for the MULTI Integrated Development Environment group at Green Hills Software, Inc., which includes responsibility for the MULTI debugger. Nathan is a graduate of Harvey Mudd College and is keenly interested in tools that reduce the time and pain of the debugging phase of a software project. You can reach him at firstname.lastname@example.org .