Evaluating platform software architectures for nextgen embedded multicore designs
Imagine: you are the chief software architect of a new embedded design, a decade from now. You are contemplating the microprocessor selected by the hardware guys and bean counters and wondering how in the world you’re going to make best use of its firepower to build the most ambitious product your company has ever conceived.
This SoC has dozens of general-purpose processing cores; hundreds of Gbps of memory bandwidth across multiple memory controllers; 64-bit addressing; multiple high-speed packet interfaces capable of maxing out numerous 10 Gigabit Ethernet interfaces simultaneously; a RAID accelerator; a packet deduplicator; a compression engine; three levels of cache; and a dizzying array of peripherals (USB, UART, SD card, and more).
In addition, it has a regular expression pattern-matching engine; a packet scheduling and routing infrastructure; hypervisor acceleration; a sophisticated security engine with support for symmetric, public-key, and hashing algorithms; and an amazing suite of on-chip debugging features. The chip reference documentation is many thousands of pages long.
Frightened yet? Well, I’m about to make it worse. Remember the part about being 10 years away? Just kidding. I just described features found on today’s high-end multicore network processors from Cavium (OCTEON II), Freescale (QorIQ), LSI Corp (Axxia), and NetLogic (XLP). A quick look at a block diagram for one of these, the Freescale P4080, is enough to make you swoon (Figure 1 below).

Figure 1. Freescale QorIQ P4080 multicore processor
This is just the high level view. Each of the subsystems is extremely complex. Again using the Freescale example, the P4080 has an awesome complement of debugging features (Figure 2, below), if you can solve the halting problem to use them all: on-chip and off-chip instruction and data trace, performance counters, inter-core cross triggers, user-programmable performance and engine-monitoring events that feed the trace logic, etc.

Figure 2. Freescale QorIQ On-Chip Debugging Architecture
These bad boys don’t program themselves. The only hope of maximizing the potential of these processing behemoths is some ridiculous software smarts. We’re not going to solve the entire problem in this article, but we’re going to talk about the software layer that controls the platform: the operating systems and hypervisors upon which everything else rides. At the very least, the chief software architect needs to understand the major options and some of the key tradeoffs between them.
In all of the following options, we assume that the design includes some non-real-time software (such as management and health monitoring, control plane routing protocols like OSPF, and human-machine interfaces) as well as real-time processing (such as high-speed data processing and low-latency device drivers).

