Achieving first day multicore SoC software success

Bill Neifert

February 1, 2011

Performance and accuracy

Let’s look at performance and accuracy, two interrelated issues.

There is an unavoidable tradeoff between performance and accuracy; design teams give up one to get the other. Models fast enough for application software development need to be many orders of magnitude faster than RTL, and there is no way to get that sort of speedup automatically.

Just as a designer cannot get from a Spice model to an RTL model by simply removing unnecessary detail, he or she can’t get from an RTL model to a virtual platform behavioral model fast enough to execute application software by simply removing unnecessary detail.

Trying to create a single model with both speed and accuracy tends to deliver the worst of both worlds. Either the model gains performance but lacks the accuracy needed to verify the interaction of low-level software with the chip, or it has that accuracy and is too slow for software developers.

A better approach is to accept this and create both a high-speed model for software development and a 100% accurate model for hardware and firmware debug.

The 100% accurate model can be created automatically from the RTL code. An integrated ecosystem takes RTL models, which are accurate by definition, and delivers speedups by optimizing away low-level timing details to produce a cycle-accurate model. This guarantees the fidelity of the model to the actual chip.

Since these models are created directly from the RTL code, they avoid the problems that inevitably arise when the behavior of the accurate model differs from the behavior of the RTL code.  While they are accurate enough for hardware development, they are not fast enough for either software development or for booting up the system to get to a point at which it makes sense to examine the hardware in detail.  For that, high-speed models are still required.

High-speed models need to be created by hand. Often, the performance gain comes from changing the modeling approach. Curiously, one result of this is that hardware designers often make poor modelers, since they try to model the hardware the way it actually works.

As an example, consider a counter that counts down and interrupts when it gets to zero. The actual hardware will have a decrementer containing a register that gets clocked on each clock cycle. However, if the counter is modeled this way, the virtual platform will consume most of its compute resources clocking this register and others like it.

The correct way to model the counter is to work out when, in the future, the device will interrupt, schedule that event with the underlying time management of the virtual platform, and then ignore the counter until then. In the meantime, if the software accesses the register to read its value, the model will need to calculate the value that should be there based on how many clock cycles have passed.

This is a simple model, but the trick of creating all such models is to keep the model as inactive as possible, only waking up the code when something absolutely essential happens.
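The countdown counter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Scheduler` class stands in for a virtual platform's time management, and `LazyCountdownTimer`, `load` and `read` are hypothetical names, not part of any real virtual platform API. The timer is one-shot and does not cancel a pending event on reload.

```python
import heapq


class Scheduler:
    """Toy stand-in for a virtual platform's time manager (hypothetical)."""

    def __init__(self):
        self.now = 0        # current time, in clock cycles
        self._events = []   # min-heap of (time, sequence, callback)
        self._seq = 0       # tie-breaker so callbacks are never compared

    def schedule(self, delay, callback):
        heapq.heappush(self._events, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run_until(self, time):
        # Jump from event to event; no per-cycle work is done in between.
        while self._events and self._events[0][0] <= time:
            self.now, _, callback = heapq.heappop(self._events)
            callback()
        self.now = time


class LazyCountdownTimer:
    """One-shot countdown timer that stays idle until its interrupt is due."""

    def __init__(self, scheduler, on_interrupt):
        self.scheduler = scheduler
        self.on_interrupt = on_interrupt
        self.start_time = 0
        self.start_value = 0

    def load(self, value):
        # Remember when counting began and schedule the single future event.
        self.start_time = self.scheduler.now
        self.start_value = value
        self.scheduler.schedule(value, self.on_interrupt)

    def read(self):
        # Reconstruct the register value on demand from the elapsed cycles.
        elapsed = self.scheduler.now - self.start_time
        return max(self.start_value - elapsed, 0)


fired = []
sched = Scheduler()
timer = LazyCountdownTimer(sched, lambda: fired.append(sched.now))
timer.load(1000)
sched.run_until(400)
print(timer.read())   # 600: computed from elapsed time, not decremented
sched.run_until(1000)
print(fired)          # [1000]: the interrupt fired at cycle 1000
```

Simulating 1,000 cycles here costs one heap push and one heap pop rather than 1,000 register updates, which is the point of keeping the model inactive between events.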

High-speed peripheral models, as already explained, are built by hand. But as more and more of an SoC or electronic system consists of IP blocks, more and more of the high-speed models already exist. Companies such as ARM and MIPS create high-speed models for their processors and standard peripherals, and using the integrated ecosystem they also create cycle-accurate models, pre-qualified to work correctly in a specific virtual platform. A web-based model portal makes these models easily accessible for quick creation of virtual platforms.

High-speed models give software developers what they need, but there is still one missing capability required to make the portfolio of models useful to hardware developers:  the ability to switch between high-speed models and cycle-accurate models.  Accuracy when you need it, performance when you don’t.

A technology known as Swap’n’play performs this switch.  This gives designers a way to use high-speed models to boot the operating system and run whatever software is necessary to get the system into a state where the hardware designer needs to delve into the details.  At this point the virtual platform is checkpointed, the state of the high-speed model is extracted and it is injected into the cycle-accurate model.  The virtual platform continues in cycle-accurate mode and all the signals of interest can be examined.
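The checkpoint, extract and inject sequence can be illustrated with a toy pair of counter models. This sketch makes no claim about how Swap’n’play is actually implemented; `FastCounter`, `CycleAccurateCounter`, `save_state`, `restore_state` and `swap` are hypothetical names chosen for the example.

```python
class FastCounter:
    """High-speed model: skips over many cycles in a single step."""

    def __init__(self, value):
        self.value = value

    def advance(self, cycles):
        self.value = max(self.value - cycles, 0)

    def save_state(self):
        # Checkpoint: expose the architectural state as plain data.
        return {"value": self.value}


class CycleAccurateCounter:
    """Detailed model: one tick per clock, with a trace of visible signals."""

    def __init__(self):
        self.value = 0
        self.irq = False
        self.trace = []

    def restore_state(self, state):
        # Inject the state captured from the high-speed model.
        self.value = state["value"]

    def tick(self):
        if self.value > 0:
            self.value -= 1
        self.irq = self.value == 0
        self.trace.append((self.value, self.irq))


def swap(fast_model, accurate_model):
    """Move the simulation from the fast model to the detailed one."""
    accurate_model.restore_state(fast_model.save_state())
    return accurate_model


fast = FastCounter(1_000_000)
fast.advance(999_997)            # run quickly to the region of interest
detailed = swap(fast, CycleAccurateCounter())
for _ in range(3):               # then step clock by clock
    detailed.tick()
print(detailed.trace)            # [(2, False), (1, False), (0, True)]
```

The swap only works if both models agree on what the shared state means, which is why the two are written here against the same `value` field.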

 
