The basics of programming embedded processors: Part 7

Wayne Wolf

September 05, 2007

Wayne WolfSeptember 05, 2007

A program trace (or more succinctly, a trace) is a record of the execution path of a program that has been measured during program execution. If we can measure a trace from a running program, we avoid the need to analyze the program's CDFG to determine the execution path. Traces are typically large. Since we usually need to look at the entire execution history of a program, traces can be gigabytes in size.

However, because disks to hold the traces are inexpensive and workstations to analyze them are fast and cheap, traces are an appealing way to analyze programs. We have to consider two problems: how to generate a trace and how to use it to determine program performance.

A trace can be captured by using hardware methods. For example, a logic analyzer can be attached to the microprocessor bus to capture memory bus traffic. There are two major limitations to this approach. First, the size of hardware trace buffers is usually limited.

The hardware must be able to store instructions at execution speed; if we use high-speed RAM for the buffer, we are limited to perhaps several million instructions. This means that we probably cannot capture a trace of the entire program but only a section of its execution. Second, if the CPU has an on-chip cache, we will not be able to see internal memory references to the program or the data.

Alternatively, a microprocessor emulator can be used to trace the PC over time. Although this technique allows us to see what is going on inside the chip and not just behavior on the memory bus, it is too slow to generate long traces. Since the microprocessor must be stopped for probing and probing is very slow compared to normal execution, capturing long traces would require exorbitant amounts of time.

Some CPUs have hardware facilities for automatically generating trace information. For example, the Pentium family microprocessors generate a special bus cycle, a branch trace message, that shows the source and/or destination address of a branch [Col97]. If we record only traces, we can reconstruct the instructions executed within the basic blocks while greatly reducing the amount of memory required to hold the trace.

There are three major methods for generating a trace by software: PC sampling, instrumentation instructions, and simulation. The PC sampling technique uses the operating system or some other facility to periodically interrupt the program and save away the current PC value.

This method has the same limitations of any sampling method - if the sampling period is too slow, we may miss important behavior; in particular, the wrong sampling interval can be a multiple of some periodic behavior in the program that could be totally missed. Alternatively, instructions can be added to the program that will write out information to the trace buffer.

The instructions required depend on the information to be captured. The simplest case is to add code to the start of each basic block that remembers the program execution going through that point. Traces are often used to analyze cache or superscalar behavior, in which case the instrumentation code may need to save additional information about the state of the CPU.

Software-based trace generation methods do add execution overhead time to the program, but this usually does not affect the measurement. Simulation methods interpret the instructions to simulate the CPU, and they can generate the required trace information at the same time.

Obtaining a representative trace requires some knowledge of what the program does. Someone needs to supply inputs that properly exercise the program. Program users should have knowledge of the types of data typically presented to the program.

However, because they may not know which data inputs will cause worst-case behavior, some collaboration between the program designers and users may be necessary. The techniques to be described later in this series can be used to measure how thoroughly a set of inputs covers the program's control and data flow, giving you an idea of the representativeness of your traces.

< Previous
Page 1 of 4
Next >

Loading comments...