Using open-source GNU, Eclipse & Linux to develop multicore Cell apps: Part 3

You've compiled and run your multicore Cell application, but the development process doesn't end there. If runtime errors crop up, you need a way to step through the executable and determine which lines of code produced the errors.

The Cell SDK provides two debuggers for this purpose, ppu-gdb and spu-gdb, and both closely resemble GNU's gdb debugger. But even if the application runs without errors, it might not meet performance requirements.

In this case, you need to locate problems like pipeline stalls, branch miss percentages, and direct memory access (DMA) bus overload. No debugger can give you these statistics, but IBM's Full-System Simulator for the Cell processor, called SystemSim, provides all this information and more.

This part of the series describes the SDK's debuggers and SystemSim in detail. It explains the tools' commands and gives examples of how they're used. Keep in mind, however, that the SDK also provides an integrated development environment (for Linux) that enables point-and-click debugging and simulation. If command lines make you nervous, you may want to wait until Part 5 in this series.

Debugging Cell Applications
The concept of debugging is simple: run an application until a specific line of code is reached or a specific condition occurs, then halt. Read the state of the processor and step through succeeding lines. Keep examining the processor's state until you can determine what's producing the error.

Just as gcc is the preeminent open source tool for building applications, gdb is the preeminent tool for debugging them. The Cell SDK provides two debuggers, both based on gdb. ppu-gdb debugs applications built for the PPU and spu-gdb debugs applications built for the SPU.

Both use the same commands as gdb. If you're already familiar with these commands, you might want to skip ahead to the SystemSim section later in this part, which describes the Full-System Simulator.

Debugging SPU Applications with spu-gdb
This section concentrates on spu-gdb for three reasons. First, its commands are essentially the same as those used by ppu-gdb. Second, the SPU architecture is simpler than the PPU's, so it's easier to analyze its registers and memory. Third, SPUs perform the brunt of the Cell's number crunching, so it's very important to get your SPU applications working properly.

A PPU or SPU application can be debugged only if it was compiled with -g and without optimization. The -g option inserts debug information into the executable, and leaving optimization off makes sure that each source line can be executed separately. If you try to debug an optimized application, its breakpoints and watchpoints won't work correctly.

A typical spu-gdb session consists of seven main stages:

1. Enter spu-gdb followed by the name of the executable. This starts the debugger.

2. At the (gdb) prompt, use break to create a breakpoint or watch to create a watchpoint.

3. Enter run. This executes the application until the breakpoint location is reached or the watchpoint condition becomes valid.

4. When the application halts, use commands such as info to examine the state of the processor.

5. Use the step, next, or continue commands to execute succeeding lines.

6. Repeat previous steps until the bug is discovered. Enter continue to let the application run to completion.

7. Enter quit to end the debugger session.

To use spu-gdb effectively from the command line, you need to understand its basic commands. This discussion divides them into three categories: commands that set breakpoints and watchpoints, commands that read processor information, and commands that control the execution of the application.

Breakpoints and Watchpoints
Before you can analyze the conditions that cause a bug, you need to halt the application at the right place. Breakpoints are created with the break command, and halt the application when a location in code (line number, function name, instruction address) is reached.

Watchpoints are created with watch, and halt the application when a specific condition occurs. Table 4.1 below lists these commands and a number of different usages.

Table 4.1 Breakpoint/Watchpoint Commands

For example, break main halts the application as the main function starts and break 20 halts the application as it reaches line 20. The command watch x halts the application when x changes value, and rwatch x halts the application when x is read. The break command can be shortened to b and watch can be shortened to w.

Reading Processor Information
Once the application halts, the next step is to examine the processor's state. This state information includes variable values, register contents, and data stored in memory. The three basic commands are where, print, and info, and Table 4.2 below lists them and a few of their variations.

Table 4.2 Debug Information Commands

If x is a variable in scope, print x shows its value, and print &x shows its address. The info registers command lists the contents of all the processor's registers, and info registers num displays the content of register num. If you ever need to get your bearings during the course of a debug session, where full displays the current function, line number, and local variables.

spu-gdb recognizes abbreviated forms of these commands: i for info and p for print. For example, the command i r 20 is the abbreviated form of info registers 20, and displays the contents of register 20. Similarly, entering p count will tell you the current value of the variable count.

Controlling Application Execution
A debug session commonly requires gaining information about the processor at many different points in its operation. This makes it necessary to keep close control over which lines of code are processed. The commands in Table 4.3 below provide this control.

Table 4.3 Debug Start/Stop Commands

It's important to understand the difference between step and next. step executes the current line of code and proceeds to the next line. If the current line contains a function call, step enters the function and stops at its first line.

next is similar, but if the current line contains a function call, next executes the whole function and proceeds to the line after the function call. In other words, step steps into functions and next steps over them. step, next, and continue are commonly abbreviated s, n, and c, respectively.

An Example Debug Session
Now that you've seen the spu-gdb commands, it's time to look at an example of how they're used in a real debugging session. Listing 4.1 below presents the source code to be examined. The compiled application finds prime numbers between 2 and N using the Sieve of Eratosthenes.

Listing 4.1 Sieve of Eratosthenes: spu_sieve.c

The first loop iterates through num_array, and if i hasn't been marked as a composite number, the value of num_array[i²] is marked along with every following num_array value whose index is a multiple of i. The second loop iterates through the array and lists all values of i that haven't been marked. These are the primes between 2 and the size of num_array.

To demonstrate how the debugger works, an example session (available for download as a PDF) creates a breakpoint at the first function, main. It runs to the first line of executable code and steps to the next. It then executes ten more lines with next 10, finds its position with where, and sets another breakpoint at line 19. Finally, the session lists the breakpoints, continues to the second breakpoint, and runs to completion.

When setting breakpoints at line numbers, it's a good idea to identify the filename, as in the preceding example. Otherwise, the debugger may halt in a source file you hadn't expected.

The Cell SDK debuggers can assist you in finding the source of an error, but they aren't as helpful at improving an application's efficiency. By this I mean reducing branch misses, reducing DMA traffic, and making the best use of a processor's pipeline. ppu-gdb and spu-gdb can't provide this kind of low-level information, but IBM's Full-System Simulator can.

SystemSim: a full-system simulator for the Cell Broadband Engine
Up to this point, the discussion has centered on building, running, and debugging applications on an actual Cell processor.

This section switches gears and focuses on simulating applications with IBM's Full-System Simulator for the Cell Broadband Engine. This is a powerful tool, and it's not just for those who can't access a Cell device; it's for any developer who wants a thorough understanding of how the Cell processes code.

The simulator, hereafter referred to as SystemSim, gives users nearly absolute oversight of the PPU and all eight SPUs—an experience that PlayStation 3 developers will never otherwise have. Like a debugger, the simulator displays the contents of registers and memory. But it also keeps track of additional data like bus usage, address translation, stack pointer location, and DMA communication.

With SystemSim, you can start the simulated Cell with no operating system whatsoever. This is called standalone mode. If you choose this, there are a few restrictions:

1) No access to virtual memory operations.
2) All applications must be statically linked.
3) Some Linux system calls are unsupported.

The presentation that follows assumes that Linux is installed on the simulated Cell. In this case, the restrictions don't apply.

SystemSim can be configured to provide cycle-accurate timing, which means you can test applications on the simulator with the same confidence as you can with an actual Cell. Further, the simulator keeps track of cycle counts on the PPU, SPUs, and DMA buses, so you can compare the timing on one processing unit to that of another.

SystemSim's chief drawbacks are that it has too many features and displays too much information. The goal of this section is to provide a basic tour of the simulator's capabilities, so don't be concerned if you don't understand all of it.

SystemSim Configuration
IBM originally created SystemSim to provide low-level simulation of its PowerPC line of processors. It isn't hard-coded for any particular device, but uses a configuration file to specify the processor's characteristics. In SystemSim terms, this configuration file defines a machine type.

When the Cell SDK simulator starts on a PC running Linux, it reads the configuration file systemsim.tcl in /opt/ibm/systemsim-cell/lib. This file defines the resources, operation, and timing of the Cell processor.

By accessing this file, users can configure the parameters of the simulated Cell. For instance, the size of the simulated system memory is set to 256 MB, just as in the PS3, but this value can be increased as needed.

SystemSim communicates with the outside world using Tcl (Tool Command Language), and its capabilities can be extended with additional Tcl scripts. When it initializes, SystemSim searches for scripts in the /opt/ibm/systemsim-cell/lib directory and in the common and ppc subdirectories. By looking through these directories, you can see how Tcl scripts access SystemSim.

Starting and Running SystemSim
The script that starts the simulator is called systemsim and it's located in /opt/ibm/systemsim-cell/bin. For the discussion that follows, it's a good idea to add this directory to your PATH variable and create an environment variable called SYSTEMSIM_TOP that points to /opt/ibm/systemsim-cell. This can be done by inserting the following lines into .bash_profile in your home directory:

export PATH=/opt/ibm/systemsim-cell/bin:$PATH
export SYSTEMSIM_TOP=/opt/ibm/systemsim-cell

Then start the simulator with

systemsim -g -q

The -q option tells SystemSim to run in quiet mode, and the -g option tells the simulator to create a GUI. By default, this command invokes the startup script systemsim.tcl in $SYSTEMSIM_TOP/lib/cell. You can access a different startup script by using the -f filename option.

As systemsim.tcl executes, three things happen:

1. The terminal in which the script was run becomes the simulator's command window. This sends Tcl commands to the simulator.

2. A new window is created that communicates with the simulated Cell device. This is the console window.

3. A graphical panel appears with buttons and folders. Figure 4.1 below shows what the panel looks like.

Figure 4.1 The SystemSim graphical panel

The graphical panel is divided into two sections. On the left, a directory tree presents the resources of the mysim machine (the Cell processor). On the right, a grid of buttons shows the different commands that can be issued to the simulator. Table 4.4 below lists each of these buttons and their functions, from top left to bottom right.

Table 4.4 Control Buttons for the Full-System Simulator

The first button that concerns us is the Mode button. This controls the precision with which SystemSim simulates the Cell. Click this button and a dialog will present three options: Fast, Simple, and Cycle (Figure 4.2 below).

Figure 4.2 SystemSim simulation modes

If the Cycle mode is chosen, SystemSim performs cycle-accurate simulation of the Cell's resources. This provides high-precision measurement but demands a great deal of processing power. It takes a great deal of time to boot Linux on the simulated Cell in Cycle mode, so for now, choose the Fast mode.

Now you're ready to start the simulator. Click the Go button in the panel and the simulator will boot Linux on the simulated Cell. When it finishes, the command window and console window will display their command prompts and await your orders. These two windows serve different functions and receive commands in different languages.

The SystemSim Command Window
When SystemSim starts, the terminal that ran the systemsim script becomes the command window. As Linux finishes booting on the simulated Cell, the command window stops producing output and displays the command prompt: systemsim %. If this prompt isn't visible at first, select the window and press Enter until it appears. Figure 4.3 below shows what the window and its prompt look like.

Figure 4.3 The SystemSim command window

This window sends commands directly to SystemSim and controls the simulator's operation. The command language is based on Tcl, so if you enter

puts "Hello command window!"

SystemSim will understand the command and display the string. If you place this command in a file ending in .tcl, such as /tmp/hello.tcl, you can execute the file with the following command:

tclsh /tmp/hello.tcl

These Tcl scripts make it possible to interact with the simulator and other Tcl scripts. It's a good idea to look through the scripts in $SYSTEMSIM_TOP/common to get an idea of how they work.

The SystemSim Console Window
The SystemSim console window is different from the command window in many respects. It's smaller, its font is smaller, and it displays keystrokes more slowly than the command window. Figure 4.4 below shows what the console window looks like.

Figure 4.4 The SystemSim console window

The console window operates more slowly because its input goes through the simulator and into the simulated device. This is an important distinction: The command window sends input to the SystemSim application running on the host system. The console window sends input to a shell on the simulated Cell processor.

Tcl commands won't work in the console window, but you can run regular Linux shell commands such as cp, ls, and mkdir. The home directory is /root by default, and I recommend that you explore the simulated file system: The executables are in /bin and the devices are in /dev.

The console window lets you transfer files between the host system and the simulated Cell. More important, it allows you to run executables on the simulated Cell. This capability is discussed next in Part 4.

Next in Part 4: Compiling and running simulation applications
To read Part 2, go to Building Applications for the Cell Processor.
To read Part 1, go to “Introducing the Cell Processor.”

Matthew Scarpino lives in the San Francisco Bay area and develops software to interface embedded devices. He has a master's degree in electrical engineering and has spent more than a decade in software development. His experience includes computing clusters, digital signal processors, microcontrollers, and field programmable gate arrays and, of course, the Cell Processor.

This series of articles is reproduced from the book “Programming the Cell Processor,” Copyright 2009, by permission of Pearson Education, Inc. Written permission from Pearson Education, Inc. is required for all other uses.
