Use virtual prototyping to boot Linux on the ARM Cortex-A15


SoC development teams worldwide have begun a steady move to a virtual prototype methodology for better accuracy and to accelerate the design process of all kinds of applications. For those of you who aren’t familiar with using a virtual prototype, let’s start with a definition, then take a look at how an engineer recently used virtual prototyping to boot Linux on the ARM Cortex-A15.

Virtual prototypes are fast, functional software models of a system that can execute production code. With benefits ranging from software development to enabling architectural exploration and early functional verification using abstract models, their rising popularity is easy to understand.

Almost every virtual prototype deployment, though, suffers from the same problem: the virtual prototype either runs fast while sacrificing cycle accuracy, or it is cycle accurate but too slow for software development.

Some virtual prototypes attempt to solve this problem by giving up a bit of speed and a bit of accuracy to produce a “best of both worlds” system that claims the best attributes of both with none of the downsides. In practice, however, this pleases no one because it’s too slow for the software team and not accurate enough for architects and firmware engineers. Fortunately, there’s a way to create a single virtual prototype that is both fast and accurate.

I recently worked with an engineer to help him boot Linux on a virtual prototype containing an ARM Cortex-A15. In this case, he was developing a mobile application processor but the same steps apply to almost all complex SoC designs.

To get a true measure of the SoC’s performance, the engineer needed to run benchmarks on top of an operating system, in this case Linux. The benchmarks included Dhrystone, CoreMark and tiobench, a multi-threaded I/O benchmark used to measure file system performance. Running benchmarks served two primary purposes. Most obviously, the results helped determine the relative performance of the device under test (DUT). The benchmarks also did an effective job of generating large amounts of representative system traffic to stress the system and identify optimization opportunities.

Each benchmark required a significant number of simulation cycles to complete in addition to the huge number of cycles required to simply boot the OS. Because of this large number of required execution cycles, this type of use case is not typically considered with traditional cycle accurate prototypes. Instead, engineers have opted for cycle-approximate models that can lead to inaccurate and un-optimized SoC designs. Or, more often, they have skipped this optimization step entirely during the design phase and waited to run these benchmarks in prototypes when it was too late to make changes based on the results.
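The cost described above is simple arithmetic: simulated cycles divided by simulator throughput. The sketch below shows that calculation only. Every number in it is a hypothetical placeholder chosen to make the arithmetic visible, not a measurement from this project or from any particular tool.

```python
def wall_clock_seconds(target_cycles: int, sim_cycles_per_second: float) -> float:
    """Return the wall-clock time needed to simulate `target_cycles`."""
    if sim_cycles_per_second <= 0:
        raise ValueError("simulator throughput must be positive")
    return target_cycles / sim_cycles_per_second


# Hypothetical placeholder values, NOT measurements from the article.
BOOT_CYCLES = 2_000_000_000       # assumed cycles needed to boot the OS
BENCHMARK_CYCLES = 500_000_000    # assumed cycles for one benchmark run
ACCURATE_RATE = 10_000.0          # assumed cycle-accurate throughput (cycles/s)
APPROXIMATE_RATE = 1_000_000.0    # assumed cycle-approximate throughput (cycles/s)

total_cycles = BOOT_CYCLES + BENCHMARK_CYCLES
accurate_time = wall_clock_seconds(total_cycles, ACCURATE_RATE)
approximate_time = wall_clock_seconds(total_cycles, APPROXIMATE_RATE)

print(f"cycle-accurate:    {accurate_time / 3600:.1f} hours")
print(f"cycle-approximate: {approximate_time / 3600:.1f} hours")
```

Substituting your own cycle counts and measured simulator throughput shows quickly whether a boot-plus-benchmark run is practical in a given model.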

Design teams don’t need to accept inaccuracy or wait until design freeze if they use a virtual prototype. Software from Carbon Design Systems, for example, allows engineers to do advanced performance optimization by leveraging ARM Fast Models for speed and Carbon’s Swap & Play technology for 100% accuracy.

The integration with ARM Fast Models enables an engineer to increase simulation performance in selected components during periods of time when accuracy isn’t critical. Swap & Play then enables ARM Fast Model components to be swapped out in favor of their 100% accurate equivalent components when accuracy is required, such as benchmarking. Essentially, this means performance when it’s wanted and accuracy when it’s needed.
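The idea of swapping a fast component for its accurate equivalent can be illustrated with a toy model. The following is a minimal sketch, assuming both models share one architectural state that can be copied between them. The class names, the two-instruction "ISA" and the latency table are all invented for illustration and have nothing to do with the actual Carbon or ARM APIs.

```python
import copy
from dataclasses import dataclass


@dataclass
class ArchState:
    """Toy architectural state shared by both models."""
    pc: int = 0    # index of the next instruction
    acc: int = 0   # a single accumulator register


# Hypothetical program and latencies, invented for this sketch.
PROGRAM = [("add", 5), ("mul", 3), ("add", 2), ("mul", 4)]
LATENCY = {"add": 1, "mul": 3}


def execute(state: ArchState, instr: tuple) -> None:
    """Apply one instruction's functional effect to the state."""
    op, operand = instr
    if op == "add":
        state.acc += operand
    elif op == "mul":
        state.acc *= operand
    else:
        raise ValueError(f"unknown opcode: {op}")
    state.pc += 1


class FastModel:
    """Functional model: correct results, no timing information."""

    def __init__(self) -> None:
        self.state = ArchState()

    def run(self, program: list, until_pc: int) -> None:
        while self.state.pc < until_pc:
            execute(self.state, program[self.state.pc])

    def checkpoint(self) -> ArchState:
        return copy.deepcopy(self.state)


class AccurateModel:
    """Same functional behavior, plus a cycle count per instruction."""

    def __init__(self, checkpoint: ArchState) -> None:
        self.state = copy.deepcopy(checkpoint)
        self.cycles = 0

    def run(self, program: list, until_pc: int) -> None:
        while self.state.pc < until_pc:
            instr = program[self.state.pc]
            execute(self.state, instr)
            self.cycles += LATENCY[instr[0]]


# Run the uninteresting part fast, then swap to the accurate model.
fast = FastModel()
fast.run(PROGRAM, until_pc=2)
snapshot = fast.checkpoint()

accurate = AccurateModel(snapshot)
accurate.run(PROGRAM, until_pc=len(PROGRAM))
print(accurate.state.acc, accurate.cycles)
```

The point of the sketch is that cycles are only counted for the instructions executed after the swap, while the final register value is the same as if the whole program had run in one model.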

In the system illustration below, the engineer used the Cortex-A15 Linux Carbon Performance Analysis Kit (CPAK) to accelerate analysis, optimization and verification of the SoC’s performance (Figure 1). The CPAK contains reference hardware and software designs along with analysis and debug software for the Cortex-A15 processor, giving him a way to begin analyzing performance and power constraints immediately.

Figure 1. The Cortex-A15 Linux CPAK was used to accelerate analysis.

After booting the Linux kernel provided with the CPAK, the engineer created a Swap & Play checkpoint corresponding to the start of the Dhrystone benchmark. Instead of simply swapping over to cycle-accurate execution at that point, however, he continued running in the Fast Model-based system. He used SoC Designer Plus’ built-in checkpoint manager to create a variety of additional checkpoints, each representing different benchmarks or interesting points of execution.

To obtain accurate results, he then loaded each of the checkpoints into the cycle-accurate implementation of the CPAK and completed the benchmark execution. This enabled him to pinpoint certain areas of the benchmark for deeper analysis without needing to execute the entire benchmark in cycle-accurate mode. The screen shot in Figure 2 below gives a small sample of the system profiling statistics that can be gathered while running the benchmark.
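The workflow in the last two paragraphs (one fast run that drops several named checkpoints, followed by a separate accurate replay from each one) can be sketched as follows. This is an illustration under stated assumptions, not the SoC Designer Plus checkpoint manager: the `CheckpointManager` class, the workload table and the fixed `cycles_per_step` cost are all hypothetical.

```python
import copy
from dataclasses import dataclass


@dataclass
class SystemState:
    step: int = 0      # number of workload steps executed so far
    counter: int = 0   # stand-in for all architectural state


# Hypothetical workload: (phase label, functional effect per step).
WORKLOAD = [("boot", 1)] * 5 + [("dhrystone", 2)] * 3 + [("coremark", 3)] * 3


def advance(state: SystemState) -> None:
    """Execute one workload step functionally."""
    _, increment = WORKLOAD[state.step]
    state.counter += increment
    state.step += 1


class CheckpointManager:
    """Stores named, independent snapshots of the system state."""

    def __init__(self) -> None:
        self._saved = {}

    def save(self, name: str, state: SystemState) -> None:
        self._saved[name] = copy.deepcopy(state)

    def load(self, name: str) -> SystemState:
        return copy.deepcopy(self._saved[name])

    def names(self) -> list:
        return list(self._saved)


def fast_run(manager: CheckpointManager) -> SystemState:
    """Run the whole workload once, saving a checkpoint at each benchmark start."""
    state = SystemState()
    previous = None
    while state.step < len(WORKLOAD):
        label = WORKLOAD[state.step][0]
        if label != previous and label != "boot":
            manager.save(label, state)
        previous = label
        advance(state)
    return state


def accurate_replay(manager: CheckpointManager, name: str, cycles_per_step: int = 4):
    """Resume from one checkpoint and run only that benchmark, counting cycles.

    `cycles_per_step` is an invented constant standing in for real timing.
    """
    state = manager.load(name)
    cycles = 0
    while state.step < len(WORKLOAD) and WORKLOAD[state.step][0] == name:
        advance(state)
        cycles += cycles_per_step
    return state, cycles


manager = CheckpointManager()
final_state = fast_run(manager)
results = {name: accurate_replay(manager, name)[1] for name in manager.names()}
print(results)
```

Because each replay starts from its own snapshot, the benchmarks can be analyzed in any order, and none of them pays the boot cost in the accurate model.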

Figure 2: The virtual prototype is tracking several hardware events and statistics while running a benchmark on top of an operating system.

Take another look at Figure 2. Yes, those are actual hardware events and statistics from running a benchmark on top of an operating system with a virtual prototype. What’s displayed here is only a small sampling of the statistics that can be viewed. For example, synchronized windows can be used to display a number of hardware and software performance metrics.

SoC development teams have discovered that virtual prototypes eliminate the need for them to configure a hardware prototype of the system for this level of analysis. Furthermore, Swap & Play and accurate software can help ensure correct architectural tradeoffs or an optimized system.

A hardware prototype can be a reliable gauge, but may impact the project schedule if the development team needs to re-validate and verify an architectural change. This could mean time-to-market delays and loss of revenue.

Of course, the engineer could have opted to over-engineer the chip, but he quickly ruled this out because over-engineering can lead to increased chip size and extra power consumption, neither of which is an option in any processor market segment.

The SoC development team recently implemented the virtual prototype methodology to boot Linux on the ARM Cortex-A15 and found that it solved several intractable performance and software problems that previously would have required expensive hardware prototype solutions. That alone should build a solid case for bringing a virtual prototype methodology into any design environment.

Andy Meier is manager of application engineering at Carbon Design Systems in Acton, Mass. Before being promoted into his current position, he served as a Carbon Design Systems corporate applications engineer. Previously, he worked as a senior verification engineer at SiCortex and a verification engineer for Mindspeed Technologies. Meier holds a Bachelor of Science degree in Electrical and Computer Engineering from Worcester Polytechnic Institute in Worcester, Mass.

(This article has also been published on the EDA DesignLine.)
