Performance and speed in embedded design, particularly multi-core systems, depend on the specifics of the application.
Lawyers often draw criticism and ridicule when they argue back and
forth about the meanings of words, even one as apparently obvious as
the word "is," especially in the context of a legal document or
proceeding.
According to Chris Fournier, who is teaching a class titled "Analyzing
Embedded Multicore Processor and System Capabilities (ESC-207)" at the
Fall 2008 Embedded Systems Conference in Boston, context matters just as
much to embedded systems developers. When it comes to such things as the
definition of "fast," they too must pay attention to context when
assessing performance and benchmarking multi-core systems.
Fournier, a senior product marketing manager in AMD's embedded
computing solutions division, said that switching to multi-core
technology presents as many challenges as it does benefits.
"One of those challenges is simply analyzing the potential
performance benefit of a multicore processor, system, or
system-on-a-chip (SoC)," he said. "How much faster will a multicore
approach be? Guessing and rules of thumb don't work here.
"Putting multiple processor cores into a single chip doesn't
automatically guarantee big multiples of processing power. Two cores
aren't always twice as fast; eight cores definitely won't be 8x faster.
Furthermore, there's no guarantee that a multicore processor will
deliver any dramatic increases at all in your system's capabilities,
its computing resources, or its throughput."
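One common way to see why core count and speedup diverge is Amdahl's
law, which Fournier does not cite here. The sketch below is only an
illustration: the function name and the assumed 90 percent parallel
fraction are hypothetical, not figures from the class.

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Upper bound on speedup when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed workload: 90% of the run time can be spread across cores.
for cores in (1, 2, 4, 8):
    print(f"{cores} cores: at most {amdahl_speedup(cores, 0.90):.2f}x")
```

Under that assumption, eight cores top out below 5x, before memory
contention or operating system overhead is counted at all.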
Just as the meaning of the word "fast" depends on how and where it is
used, the performance of a multicore system depends on many factors.
The benchmarks a developer chooses, he said, must suit the context in
which the system will be used, and they must isolate and test each of
the factors that may affect performance.
"So multiple different benchmarks are essential to get even a rough
approximation of a multicore system's behavior," said Fournier. "For
this reason, the much-abused Dhrystone MIPS numbers - which aren't even
useful for single-core performance - are particularly suspect when it
comes to describing multicore systems.
"As always, the very best benchmark is your own application, but
short of porting your code to every potential platform, using
industry-standard benchmarks can offer a close approximation."
To be accurate and useful, a multicore benchmark must measure three
fundamental areas: data throughput, computational throughput, and data
decomposition. Data throughput shows how well an application can scale
with more data inputs.
This can be measured by duplicating the same computation and applying
it to several different datasets. Real-world examples include decoding
several different JPEG images (as may occur when viewing a web page),
decoding multi-channel audio, or running a VoIP application with
multiple channels in which each channel receives different input data.
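A data throughput test of this kind might be structured roughly as
follows. This is a minimal sketch, not an EEMBC benchmark: zlib
decompression stands in for the JPEG or audio decoder, and the function
names and buffer sizes are assumptions made for illustration. Python
threads are used only to show the shape of the test; a real harness
would run native code on the target.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor


def decode(blob: bytes) -> int:
    """Stand-in kernel: the same computation, applied to one input."""
    return len(zlib.decompress(blob))


def data_throughput(blobs, workers):
    """Run the same kernel over different datasets; return (sizes, items/s)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sizes = list(pool.map(decode, blobs))
    elapsed = time.perf_counter() - start
    return sizes, len(blobs) / elapsed


# Hypothetical inputs: eight different compressed buffers, one per channel.
blobs = [zlib.compress(bytes([i]) * (100_000 + i)) for i in range(8)]
for workers in (1, 2, 4):
    _, rate = data_throughput(blobs, workers)
    print(f"{workers} workers: {rate:.0f} items/s")
```

How the rate changes as workers are added is the scaling figure of
interest.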
Computational throughput, in contrast, involves tests that can
initiate more than one task at a time, implementing concurrency over
both the data and the code. This demonstrates the scalability of a
solution for general-purpose processing.
As an example, consider the execution of MPEG decode(x) followed by
MPEG encode(x), which is similar to what you might find in a set-top
box where the satellite signal is received, decoded, and encoded into a
different quality signal for storing on the hard disk.
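The decode-then-encode case can be sketched as a two-stage pipeline in
which different code runs concurrently on different data. Again, this is
only an illustration under stated assumptions: zlib stands in for the
MPEG codec, and the stage names and queue depth are hypothetical.

```python
import queue
import threading
import zlib


def decode_stage(incoming, handoff):
    """Stage 1: decode each received frame and hand it to the encoder."""
    for blob in incoming:
        handoff.put(zlib.decompress(blob))
    handoff.put(None)  # sentinel: no more frames


def encode_stage(handoff, outgoing):
    """Stage 2: re-encode at a different (lower) compression level."""
    while (raw := handoff.get()) is not None:
        outgoing.append(zlib.compress(raw, 1))


def transcode(incoming):
    """Run both stages at once, so frame n+1 decodes while frame n encodes."""
    handoff = queue.Queue(maxsize=4)
    outgoing = []
    stages = [
        threading.Thread(target=decode_stage, args=(incoming, handoff)),
        threading.Thread(target=encode_stage, args=(handoff, outgoing)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return outgoing
```

A benchmark built this way exercises two different tasks at the same
time, rather than many copies of one task.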
Data decomposition is where an algorithm is divided into multiple
threads working on a common data set, demonstrating support for
fine-grained parallelism. In this situation, the algorithm could be
working on a single combined audio/video data stream, but the code is
split so as to distribute the workload among different threads, each
of which could be handled by a different processor core.
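A minimal sketch of data decomposition, assuming a simple kernel (the
sum of squared samples) in place of a real audio/video algorithm: one
shared buffer is cut into index ranges, each range goes to its own
thread, and the partial results are combined. The helper names are
hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(length, parts):
    """Split range(length) into at most `parts` contiguous (lo, hi) ranges."""
    if length == 0:
        return []
    step = -(-length // parts)  # ceiling division
    return [(lo, min(lo + step, length)) for lo in range(0, length, step)]


def partial_energy(samples, bounds):
    """Work done by one thread: its own slice of the shared data set."""
    lo, hi = bounds
    return sum(s * s for s in samples[lo:hi])


def total_energy(samples, workers):
    """One algorithm, one data set, with the work divided among threads."""
    ranges = chunk_bounds(len(samples), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda b: partial_energy(samples, b), ranges)
        return sum(partials)
```

The result must match the single-threaded answer regardless of how many
pieces the data is cut into, which makes a useful correctness check
before any timing is done.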
In his class, Fournier focuses much of his attention on how to use
new industry-standard benchmark tests from the Embedded Microprocessor
Benchmark Consortium (EEMBC) to demonstrate and assess a multicore
system's behavior across a variety of scenarios.
"These can help you predict how your own multicore system will
function," he said. "Initial results from these benchmarks reveal some
interesting secrets that, we hope, will encourage you to pay close
attention to the combined effects of the multicore processor, the
memory subsystem, the operating system, and other system-level
characteristics."
To register to attend this and
other classes and events at the conference, go to the Fall ESC
Registration Page and sign up.