Benchmarking an ARM-based SoC using Dhrystone: A VFT perspective

Neha Srivastava, and Aashish Mittal, Freescale Semiconductor

November 4, 2012




Additional Code

1. Code to toggle the design port on the tester at the start and end of measurements, and also to signal other points in the run


To generate the desired start pulse output on a design port, code must be added at the very beginning of the inner for() loop.

For the specific ARM-based SoC device being discussed, recall it is a cache-based microprocessor. To ignore the cold start effects associated with "priming" the caches, the inner loop of the benchmark is executed a few times before the loop performance is actually measured. This allows the "steady state" performance of the benchmark to be measured. For this discussion, let the inner loop be executed a total of 10 times, the first 5 to reach the steady state, and the final 5 iterations for the actual performance measurement.
for (Run_Index = 1; Run_Index <= Number_Of_Runs; ++Run_Index)
{
    /* Start pulse: emitted once, at the first measured iteration */
    if (Run_Index == 6) {
        if (CORE_TYPE == CM4)
        {
            PIN_VALUE(PORTx,0);
            PIN_VALUE(PORTx,1);
            PIN_VALUE(PORTx,0);
        }
        else { // CORE_TYPE == CA5
            PIN_VALUE(PORTy,0);
            PIN_VALUE(PORTy,1);
            PIN_VALUE(PORTy,0);
        }
    }

    /* ... original Dhrystone inner loop body ... */
}

At the end of the Dhrystone iterations, the following code generates the end pulse:

if (CORE_TYPE == CM4)
{
    PIN_VALUE(PORTx,0);
    PIN_VALUE(PORTx,1);
    PIN_VALUE(PORTx,0);
}
else { // CORE_TYPE == CA5
    PIN_VALUE(PORTy,0);
    PIN_VALUE(PORTy,1);
    PIN_VALUE(PORTy,0);
}

The function is typically inlined into the main inner loop compiler output.

Also note that the code executed when Run_Index == 6 slightly increases the execution time of the inner loop, but the effect is expected to be small enough to be ignored.
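The warm-up and pulse scheme described above can be sketched as a host-side simulation. This is only an illustration under stated assumptions: PIN_VALUE here is a stand-in that records pin writes in an array rather than driving a real port, and the names WARMUP_RUNS, MEASURED_RUNS, marker_pulse and run_benchmark are hypothetical, not taken from the original code.

```c
#include <assert.h>

#define WARMUP_RUNS   5   /* iterations used only to prime the caches */
#define MEASURED_RUNS 5   /* iterations that fall between the two pulses */
#define PIN_LOG_SIZE  16

/* Host-side stand-in for PIN_VALUE(): on the device this would drive a
 * design port; here each written value is recorded for inspection. */
static int pin_log[PIN_LOG_SIZE];
static int pin_log_len = 0;

static void PIN_VALUE(int port, int value)
{
    (void)port;                          /* port is ignored in this simulation */
    assert(pin_log_len < PIN_LOG_SIZE);  /* guard the log buffer */
    pin_log[pin_log_len++] = value;
}

/* One low-high-low pulse, as used for both the start and end markers. */
static void marker_pulse(int port)
{
    PIN_VALUE(port, 0);
    PIN_VALUE(port, 1);
    PIN_VALUE(port, 0);
}

/* Runs the loop skeleton and returns how many iterations were measured. */
static int run_benchmark(int port)
{
    const int Number_Of_Runs = WARMUP_RUNS + MEASURED_RUNS;
    int measured = 0;
    int Run_Index;

    for (Run_Index = 1; Run_Index <= Number_Of_Runs; ++Run_Index) {
        if (Run_Index == WARMUP_RUNS + 1) {
            marker_pulse(port);          /* start pulse at iteration 6 */
        }
        /* ... Dhrystone inner loop body would execute here ... */
        if (Run_Index > WARMUP_RUNS) {
            ++measured;
        }
    }
    marker_pulse(port);                  /* end pulse after the last iteration */
    return measured;
}
```

With these values the tester would see two pulses on the port, and the interval between them covers iterations 6 through 10 only.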

2. Code to toggle the design port to signal start and end of overall tester pattern execution and the signaling of pass/fail banner

In addition to the signaling of the start/end of the inner loop execution, it is also useful to include GPIO pin signaling to identify the start and end of the entire tester pattern as well as the pass/fail status after the self-check code has been executed. In this example, we encoded a static 2-bit value on GPIO pins to provide the following output status:
if GPIO = 0x3, then tester pattern execution has started
else if GPIO = 0x2, then variable miscompare vs. expected data was detected
else if GPIO = 0x0, then execution completed successfully
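
The status encoding above could be driven by code along the following lines. This is a minimal sketch, not the original implementation: the GPIO is simulated by a plain variable, and the names pattern_status, set_status, self_check and run_pattern are hypothetical. Only the three 2-bit values (0x3, 0x2, 0x0) come from the text.

```c
#include <assert.h>

/* The three 2-bit status codes listed in the text. */
enum pattern_status {
    STATUS_PASSED     = 0x0,  /* execution completed successfully */
    STATUS_MISCOMPARE = 0x2,  /* variable miscompare vs. expected data */
    STATUS_STARTED    = 0x3   /* tester pattern execution has started */
};

/* Simulated 2-bit GPIO; on the device this would be a port register. */
static unsigned gpio_status_pins = 0;

static void set_status(enum pattern_status s)
{
    assert(((unsigned)s & ~0x3u) == 0);  /* must fit in two bits */
    gpio_status_pins = (unsigned)s;
}

/* Compares the benchmark's final variables against expected values. */
static enum pattern_status self_check(const int *actual,
                                      const int *expected, int count)
{
    int i;
    for (i = 0; i < count; ++i) {
        if (actual[i] != expected[i]) {
            return STATUS_MISCOMPARE;
        }
    }
    return STATUS_PASSED;
}

/* Signals start, then the pass/fail result; returns the final pin value. */
static unsigned run_pattern(const int *actual, const int *expected, int count)
{
    set_status(STATUS_STARTED);
    /* ... benchmark would run here and fill in the actual values ... */
    set_status(self_check(actual, expected, count));
    return gpio_status_pins;
}
```

Because the pins hold a static value, the tester can sample them at any time after the pattern ends rather than having to catch a pulse.
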
3. Code to be included in the Cortex-A5 executable to wake up the Cortex-M4 as the secondary core (if applicable)

4. Code to provide the instruction address at which the secondary core starts executing (if applicable)

Execution always starts on the primary core, and this code enables the clocks for the Cortex-M4.

Information about whether the run is dual- or single-core, and which core is primary or secondary, is provided to the design by driving specific values into it through fixed ports. This is done through the testbench/VCD.

5. For dual core executions, code is added to provide a CPU-to-CPU "semaphore" variable providing an indicator from the secondary core to the primary core that its execution has completed. The primary core then completes its execution and provides the appropriate GPIO indicators.

To do this, the secondary core writes a particular value to a known system RAM location when its execution is finished. The primary core reads this location continuously; once it sees the value, it knows that the secondary core's code execution is done and that it can terminate its own execution.
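
The handshake can be sketched as below. This is a host-side illustration under several assumptions: the shared system RAM location is modeled as a volatile global, the value SECONDARY_DONE_MAGIC and all function names are hypothetical, and the poll limit is an addition for the sketch (the text describes continuous polling). On real hardware the location would also need to be visible to both cores, for example by being uncached or by using appropriate barriers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical completion value; any agreed nonzero pattern would do. */
#define SECONDARY_DONE_MAGIC 0xD0D0CAFEu

/* Stand-in for the known system RAM location shared by both cores.
 * volatile forces a fresh read on every poll. */
static volatile uint32_t secondary_done_flag = 0;

/* Called by the secondary core as the last step of its execution. */
static void secondary_signal_done(void)
{
    secondary_done_flag = SECONDARY_DONE_MAGIC;
}

/* Called by the primary core after its own benchmark work is finished.
 * Returns 1 once the secondary core has signaled completion, or 0 if
 * max_polls reads went by without seeing the value (sketch-only timeout). */
static int primary_wait_for_secondary(uint32_t max_polls)
{
    assert(max_polls > 0);
    while (max_polls-- > 0) {
        if (secondary_done_flag == SECONDARY_DONE_MAGIC) {
            return 1;
        }
    }
    return 0;
}
```

After a successful wait, the primary core would go on to drive the final GPIO indicators.
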

6. SoC-specific code needed to configure the device's clocks correctly. This may entail getting the system to run at speed, either by locking the system phase-locked loop (PLL) or by providing a direct high-frequency at-speed clock on the tester.

