Debugging a Shared Memory Problem in a multi-core design with virtual hardware - Embedded.com


With multicore systems becoming the norm, software developers have found debugging such systems on a physical hardware development board to be very challenging, particularly when it comes to integrating applications that share data across multiple cores.

Over the past few years, the virtual hardware platform concept has emerged as a key new capability for software developers to improve their ability to debug software applications.

Virtual platforms are simulations of the device hardware and the environment it operates in. They give software developers a new way to improve their productivity. The benefits of virtual platforms for software development come from three major areas.

First, they remove the dependency on physical silicon availability. Second, they provide a far superior solution for debug and analysis. Third, they provide a simplified and more easily sharable environment to the users.

Traditionally, software developers have used three different types of environments for the execution of the software under development: native compilation to the host development system or an OS simulator, reference development boards, or instruction set simulators. Each of them has been used successfully in the context of a simple hardware platform.

However, as hardware platform capabilities increase with multi-core support, these approaches are exhibiting some significant limitations, including limited observability and controllability of the hardware, poor representation of the final device hardware, and limited scalability. This article will demonstrate how a virtual platform can be used to debug a shared memory problem on a multi-core platform.

Platform description
The system under consideration is depicted in Figure 1 below. It includes two processor cores and several peripheral elements. One core (ARM926) is used to boot and execute the Linux operating system and a variety of applications. The second core (ARM968) is used to execute an H.264 decoding algorithm.

Figure 1: A shared memory design with two processor cores and several peripheral elements, one core used to boot and execute the Linux operating system and a variety of applications and the second to execute an H.264 decoding algorithm.

The peripherals in the platform include an interrupt controller, touch screen controller, display controller, ATAPI controller, UART, programmable I/O, timer, clock and memory.

An AHB multi-layer bus is also used to allow mapping different address regions to the two bus masters. A model of the platform has been created using SystemC, a standard hardware modeling language, and TLM (Transaction Level Modeling), a standards-based modeling methodology.

In addition to the hardware model, the platform comes with host application programs that enable the interactive I/O user interface in a realistic device environment.

Each of these applications can directly communicate with the platform model and display the desired information. They include a graphical user interface, connectivity to the host memory file system, and a terminal window showing, for example, the Linux boot sequence.

Software development environment
The software development environment provided to the software developer contains:

* The simulation of the hardware platform, on which the software can be downloaded and executed.
* A virtual platform debugger. Unlike most software debuggers, which only examine the state of the processor, a virtual platform debugger can set breakpoints and watchpoints on every memory element and signal of the entire platform.
* Integration with source code-level debugging software development tools such as gdb and Lauterbach.

Figure 2 below provides an overview of this environment for multi-core debugging, which is non-intrusive, deterministic and fully controllable.

The virtual platform simulates fast enough that the operating system boot takes only a few seconds and the movie stream plays at near or faster than real-time speed. This performance demonstrates that SystemC, a C++-based language, scales to the performance requirements of software developers.

Figure 2: A virtual platform multicore debugging environment can be used to resolve possible shared memory problems in the design illustrated in Figure 1.

Once the initial version of our software is compiled and downloaded to the virtual platform, we quickly observe through our user interface that the video stream plays for only a short time and appears to skip significant sections of the movie being decoded.

Using the Lauterbach software debugger and gdb, each connected to an ARM processor, we can quickly confirm that each core is alive, which leaves the H.264 algorithm or the use of the shared memory between the two processors as the potential problem. Since this decoder previously worked properly on a single processor core architecture, we suspect that the problem is in the use of the shared memory.

Shared memory architecture
A circular buffer (Figure 3 below) is used in this architecture. The ARM926 reads the video file and passes the data to the ARM968, where the video stream is decoded and sent to the display device. The circular buffer presents another challenge for the software developer.

At any point in time, the ARM926 can write to one portion of the buffer and the ARM968 can read from another portion. If the ARM926 were to write to the wrong portion, it would overwrite data that the ARM968 had not yet read, which would have the effect of skipping part of the movie being decoded. A valid and an invalid access to the circular buffer are depicted below.

Figure 3: A valid and an invalid access to the circular buffer
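The distinction between a valid and an invalid access can be sketched in a few lines of Python. This is only an illustrative model, not the platform's actual software: the class name, the `unread` counter and the four-slot size are assumptions made for the example, with the writer standing in for the ARM926 and the reader for the ARM968.

```python
class CircularBuffer:
    """Toy model of a shared circular buffer with one writer and one reader."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.write_idx = 0   # next slot the writer (ARM926 role) fills
        self.read_idx = 0    # next slot the reader (ARM968 role) drains
        self.unread = 0      # slots written but not yet read

    def write(self, item):
        """Return True for a valid write, False if it would clobber unread data."""
        if self.unread == self.size:
            return False     # invalid access: the target slot has not been read yet
        self.slots[self.write_idx] = item
        self.write_idx = (self.write_idx + 1) % self.size
        self.unread += 1
        return True

    def read(self):
        """Return the oldest unread item, or None if nothing is pending."""
        if self.unread == 0:
            return None
        item = self.slots[self.read_idx]
        self.read_idx = (self.read_idx + 1) % self.size
        self.unread -= 1
        return item


buf = CircularBuffer(size=4)
results = [buf.write(f"frame{i}") for i in range(5)]  # fifth write finds the buffer full
first = buf.read()                                     # reading frees one slot
retry = buf.write("frame4")                            # the same write is now valid
```

In this model the fifth write is refused because every slot still holds unread data. Software that skips such a check and writes anyway would lose a frame the reader has not yet seen.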

The challenge facing the software developer is that the watchpoints have to be changed after every read or write, since the valid memory area keeps changing. In addition, watchpoints need to be placed on the circular buffer itself to gain the right visibility into its behavior.

Debugging Capabilities and Tracing of the Problem
The advantage of using a simulated environment such as that shown in Figure 4 below is that the user can have full control of the execution of the hardware. This holds, of course, only as long as the tooling actually exposes that control. In this case, we are using the capabilities of a virtual platform debugger, which provides visibility and controllability into any memory element and signal.

In addition, the tool shown in Figure 4 provides a scripting capability (based on Tcl) enabling its user to take specific actions when a breakpoint or watchpoint is hit. For the debugging of our problem, a script that automatically updates the watchpoints after each read and write will be used.

The script creates a sandbox in which the ARM926 is authorized to write and the ARM968 to read. It alerts the developer and stops the platform execution when a memory violation occurs (i.e., a write or read is performed outside the sandbox).
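The logic of such a sandbox can be sketched as follows. The article does not show the tool's Tcl scripting API, so this sketch is written in Python and models only the bookkeeping: `SandboxMonitor`, its callbacks and the slot-granular addressing are all hypothetical names and simplifications. Where the real script stops the platform, the model just records the violation.

```python
class SandboxMonitor:
    """Hypothetical stand-in for the watchpoint script: tracks which slot the
    writer and the reader may touch next and flags any access outside it."""

    def __init__(self, size):
        self.size = size
        self.write_idx = 0     # only slot the writer may touch next
        self.read_idx = 0      # only slot the reader may touch next
        self.unread = 0        # slots written but not yet read
        self.violations = []   # (kind, slot) records; a real script would halt here

    def on_write(self, slot):
        if slot != self.write_idx or self.unread == self.size:
            self.violations.append(("write", slot))
            return False
        # Valid access: advance the writable window (the "watchpoint update").
        self.write_idx = (self.write_idx + 1) % self.size
        self.unread += 1
        return True

    def on_read(self, slot):
        if slot != self.read_idx or self.unread == 0:
            self.violations.append(("read", slot))
            return False
        # Valid access: advance the readable window.
        self.read_idx = (self.read_idx + 1) % self.size
        self.unread -= 1
        return True


monitor = SandboxMonitor(size=4)
trace = [("w", 0), ("w", 1), ("w", 2), ("w", 3), ("w", 0), ("r", 0), ("w", 0)]
outcomes = [
    monitor.on_write(slot) if kind == "w" else monitor.on_read(slot)
    for kind, slot in trace
]
```

In the sample trace, the fifth access writes to slot 0 before the reader has consumed it, so it is flagged. After the reader drains slot 0, the same write falls inside the sandbox and is accepted.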

In addition, a visual and intuitive user interface can be created with the script. Figure 4 below shows the structure of the script and the graphical visualization provided by the virtual platform debugger.

Figure 4: Using a simulated environment provides the developer full control of the execution of the hardware.

As we now execute the software using the script, we hit a watchpoint and can start using the software debugging capabilities provided by the Lauterbach debugger (Figure 5, below) to trace the source of the problem:

* First, we locate the function that was called when the memory access occurred.
* Looking at the stack frame, we see that the read and write pointers are pointing to the same memory location, and that a write was performed.
* The function stack enables the identification of the calling function, whose source code can now be viewed.

Figure 5: The Lauterbach debugger can be used to trace the source of the memory conflict problems.

We quickly identify that the buffer count has been hard-coded rather than being calculated. The problem is fixed, and the re-compiled software is downloaded to the virtual platform, where it executes properly.
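The article does not show the faulty source, so the following Python sketch is only a hypothetical reconstruction of this class of bug. The function names, the constant 4, the eight-slot buffer and the one-frame-per-step reader are all assumptions chosen for illustration. It contrasts a writer that trusts a hard-coded free-slot count with one that derives the count from the read and write indices.

```python
def free_slots_hardcoded(read_idx, write_idx, size):
    """Buggy variant: assumes four free slots regardless of reader progress."""
    return 4  # hypothetical constant chosen for illustration


def free_slots_computed(read_idx, write_idx, size):
    """Derive free space from the two indices, keeping one slot empty so
    that read_idx == write_idx always means 'buffer empty'."""
    return (read_idx - write_idx - 1) % size


def stream(free_slots, size=8, total_frames=20):
    """Writer fills as many slots as free_slots() reports; reader drains one
    frame per step. Returns the frame numbers in the order they were read."""
    slots = [None] * size
    read_idx = write_idx = next_frame = 0
    decoded = []

    def read_one():
        nonlocal read_idx
        decoded.append(slots[read_idx])
        read_idx = (read_idx + 1) % size

    while next_frame < total_frames:
        burst = min(free_slots(read_idx, write_idx, size), total_frames - next_frame)
        for _ in range(burst):
            slots[write_idx] = next_frame
            write_idx = (write_idx + 1) % size
            next_frame += 1
        if read_idx != write_idx:
            read_one()
    while read_idx != write_idx:   # writer is done; drain what is left
        read_one()
    return decoded


good = stream(free_slots_computed)
bad = stream(free_slots_hardcoded)
```

With the computed count, the reader sees every frame in order. With the hard-coded count, the writer laps the reader, unread frames are overwritten, and the reader ends up with a shorter sequence that has gaps, which matches the symptom of a video that skips sections and stops early.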

Simplifying the Edit-Debug-Compile Cycle
The debugging example provided above demonstrates the key capability of using a virtual platform to accelerate the edit-debug-compile cycle. The virtual platform allowed us to:

1. Identify the problem earlier and without silicon availability
2. Trace its source by allowing a “watch” to be established inside a memory block (not to be confused with a CPU watchpoint) and by dynamically creating a “sandbox” to catch buffer overflow errors
3. Solve the problem by quickly pinpointing the source code error for correction, leveraging the integration of existing software development tools such as Lauterbach or gdb
4. Validate the solution through the execution of the updated software.

Virtual platform tools provide a powerful solution with fast simulation performance and integration of existing software development tools such as debuggers.

In our specific environment we also have access to a powerful virtual platform debugger, which unlocks the unique capabilities of virtual platforms, including controllability and observability of the software and the platform, as well as a scriptable solution based on a Tcl interface.

The integrated solution provides a non-intrusive debugging environment, enabling debugging that could not have been efficiently done with physical hardware.

Marc Serughetti is Vice President, Marketing & Business Development at CoWare.
