Benchmarking an ARM-based SoC using Dhrystone: A VFT perspective
Additional Code
1. Code to toggle the design port on tester at start and end of measurements and also to signal other points in the run
To generate the desired start pulse output on a design port, code must be added at the very beginning of the inner for() loop.
For the specific ARM-based SoC device being discussed, recall it is a cache-based microprocessor. To ignore the cold start effects associated with "priming" the caches, the inner loop of the benchmark is executed a few times before the loop performance is actually measured. This allows the "steady state" performance of the benchmark to be measured. For this discussion, let the inner loop be executed a total of 10 times, the first 5 to reach the steady state, and the final 5 iterations for the actual performance measurement.
for (Run_Index = 1; Run_Index <= Number_Of_Runs; ++Run_Index)
{
    if (Run_Index == 6) {
        if (CORE_TYPE == CM4)
        {
            PIN_VALUE(PORTx,0);
            PIN_VALUE(PORTx,1);
            PIN_VALUE(PORTx,0);
        }
        else { // CORE_TYPE == CA5
            PIN_VALUE(PORTy,0);
            PIN_VALUE(PORTy,1);
            PIN_VALUE(PORTy,0);
        }
    }
}
At the end of the iterations of the Dhrystone code, this is generated:
if (CORE_TYPE == CM4)
{
    PIN_VALUE(PORTx,0);
    PIN_VALUE(PORTx,1);
    PIN_VALUE(PORTx,0);
}
else { // CORE_TYPE == CA5
    PIN_VALUE(PORTy,0);
    PIN_VALUE(PORTy,1);
    PIN_VALUE(PORTy,0);
}
The function is typically inlined into the main inner loop compiler output.
Also note that the inclusion of the code executed when Run_Index == 6 does slightly increase the execution time of the inner loop, but the effect is expected to be small enough to be ignored.
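The warm-up and measurement scheme described above can be sketched in a form that compiles and runs on a host PC. This is only an illustrative sketch, not the article's actual test code: PIN_VALUE is replaced here by a hypothetical logging stub, and the names PORT_X, PORT_Y, pulse_marker, dhrystone_body and run_benchmark are assumptions introduced for the example. On silicon, PIN_VALUE would write the GPIO port instead of recording to an array.

```c
#include <assert.h>

/* Hypothetical stand-ins so the sketch runs on a host PC. On silicon,
   PIN_VALUE would drive a design port instead of logging. */
enum { PORT_X = 0, PORT_Y = 1 };      /* assumed port identifiers */
enum { CM4 = 0, CA5 = 1 };            /* assumed core identifiers */

#define MAX_EVENTS 16
static int pin_log[MAX_EVENTS][2];    /* recorded {port, level} pairs */
static int pin_events = 0;

static void PIN_VALUE(int port, int level)
{
    if (pin_events < MAX_EVENTS) {
        pin_log[pin_events][0] = port;
        pin_log[pin_events][1] = level;
        ++pin_events;
    }
}

/* Drive a 0 -> 1 -> 0 pulse on the port assigned to the running core. */
static void pulse_marker(int core_type)
{
    int port = (core_type == CM4) ? PORT_X : PORT_Y;
    PIN_VALUE(port, 0);
    PIN_VALUE(port, 1);
    PIN_VALUE(port, 0);
}

/* Placeholder for one pass through the Dhrystone loop body; it only
   counts how many passes fall inside the measured window. */
static int measuring = 0;
static int measured_iterations = 0;

static void dhrystone_body(void)
{
    if (measuring) {
        ++measured_iterations;
    }
}

/* Run the warm-up passes, pulse, run the measured passes, pulse again. */
static void run_benchmark(int core_type, int number_of_runs, int warmup_runs)
{
    int run_index;

    assert(warmup_runs < number_of_runs);
    for (run_index = 1; run_index <= number_of_runs; ++run_index) {
        if (run_index == warmup_runs + 1) {
            pulse_marker(core_type);  /* start of the measured window */
            measuring = 1;
        }
        dhrystone_body();
    }
    measuring = 0;
    pulse_marker(core_type);          /* end of the measured window */
}
```

Calling run_benchmark(CM4, 10, 5) reproduces the 10-iteration case discussed above: the start pulse is issued when the loop index reaches 6, and the end pulse follows the final iteration, so the tester can time the interval between the two pulses.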
2. Code to toggle the design port to signal start and end of overall tester pattern execution and the signaling of pass/fail banner
In addition to the signaling of the start/end of the inner loop execution, it is also useful to include GPIO pin signaling to identify the start and end of the entire tester pattern as well as the pass/fail status after the self-check code has been executed. In this example, we encoded a static 2-bit value on GPIO pins to provide the following output status:
if GPIO = 0x3, then tester pattern execution has started
else if GPIO = 0x2, then variable miscompare vs. expected data was detected
else if GPIO = 0x0, then execution completed successfully
3. Code to be included in the Cortex-A5 executable to wake up the Cortex-M4 as the secondary core (if applicable)
4. Code to provide a starting instruction address for secondary core execution (if applicable)
Execution always starts on the primary core, and this code then enables the clocks for the CM4.
Information about dual/single-core operation and which core is primary/secondary is provided to the design by driving specific values into the design through fixed ports. This is done through the testbench/VCD.
5. For dual core executions, code is added to provide a CPU-to-CPU "semaphore" variable providing an indicator from the secondary core to the primary core that its execution has completed. The primary core then completes its execution and provides the appropriate GPIO indicators.
To do this, the secondary core writes a particular value to a known system RAM location when its execution is finished. The primary core reads this location continuously; once it sees the value, it knows the secondary core's code execution is done and can terminate its own execution.
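The semaphore handshake and the final GPIO status report can be sketched as follows. This is a host-runnable illustration under stated assumptions rather than the actual SoC code: cpu_semaphore stands in for the shared system RAM location, gpio_status stands in for the 2-bit GPIO output, and SEMAPHORE_DONE and all function names are hypothetical. The status codes are the ones listed in item 2. The poll count is bounded only so the sketch cannot hang on a host; the article describes the primary core as reading the location continuously.

```c
#include <assert.h>
#include <stdint.h>

/* 2-bit GPIO status codes, as listed in item 2. */
enum {
    STATUS_PASS       = 0x0,  /* execution completed successfully */
    STATUS_MISCOMPARE = 0x2,  /* self-check found a miscompare */
    STATUS_STARTED    = 0x3   /* tester pattern execution has started */
};

#define SEMAPHORE_DONE 0xA5A5A5A5u  /* hypothetical "finished" value */

/* Stand-in for the known system RAM location shared by both cores. */
static volatile uint32_t cpu_semaphore = 0;

/* Stand-in for the 2-bit GPIO status output. */
static volatile uint32_t gpio_status = 0;

static void gpio_write_status(uint32_t code)
{
    assert(code <= 0x3u);     /* only two GPIO pins are available */
    gpio_status = code;
}

/* Secondary core: publish completion as its last action. */
static void secondary_signal_done(void)
{
    cpu_semaphore = SEMAPHORE_DONE;
}

/* Primary core: poll the semaphore up to max_polls times.
   Returns 1 if the secondary core reported completion, 0 otherwise. */
static int primary_wait_for_secondary(uint32_t max_polls)
{
    while (max_polls-- > 0) {
        if (cpu_semaphore == SEMAPHORE_DONE) {
            return 1;
        }
    }
    return 0;
}

/* Primary core: report the final pass/fail status on the GPIO pins. */
static void primary_finish(int self_check_passed)
{
    gpio_write_status(self_check_passed ? STATUS_PASS : STATUS_MISCOMPARE);
}
```

In this sketch the primary core would write STATUS_STARTED first, run its own benchmark, wait for the secondary core, and then call primary_finish with the self-check result. On real dual-core hardware the shared location may also need to be placed in non-cacheable memory or accessed with suitable barriers so that both cores observe the same value.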
6. SoC-specific code needed to configure the device's clocks correctly. This may entail getting the system to run at speed, either by locking the system phase-locked loop (PLL) or by providing a direct high-frequency at-speed clock on the tester.

