More about multicores and multiprocessors

November 01, 2005

Bernard Cole

We've collected the most recent how-to and technical articles on multithreading, multicores, multiprocessor-on-chip (MPoC), and multiprocessor system designs. We're constantly updating our lists of articles and industry links:
Technical articles
News and features
Recent papers on the Web
Industry resources
Keep checking back to see what's new and let us know if you have links to add or content to contribute. For upcoming activities in the industry relating to multicore design, go to the Multicore Association Web site.
-Bernard Cole, Site Editor
(602) 288-7257

Technical articles

NEW!! Software defined silicon: why can't hardware be more like software?
It can, even though next-generation multicore designs mix programmable logic, CPU blocks and dedicated logic. It requires a new approach to architectural design: software defined silicon.

NEW!! Analysis: XMOS multicore chip offers low cost programmable I/O
BDTI analyzes XMOS Semiconductor's XS-1 family of multicore chips offering programmable interfaces.

NEW!! Debugging a shared memory problem in a multi-core design with virtual hardware
This article demonstrates how a virtual platform can be used to debug a shared memory problem in a multi-core design.

NEW!! Embedded multicore reversing DSP-GPP convergence
As multicore chips make waves in the embedded industry, the convergence of DSP and non-DSP processors has changed to a renewed divergence.

NEW!! High level parallel programming model simplifies multicore design
Use of a high level programming model greatly simplifies software development for multicore processors, including heterogeneous multicore processors.

NEW!! Multicore analysis made easy with Nexus 5001 debug spec
If you are looking for a nextgen follow-on to the JTAG interface for your multi-core design, take a look at the Nexus 5001 debug spec, which supports the use of high-bandwidth interfaces to efficiently transport data between silicon targets and debug tools.

NEW!! Implementing multicore designs using AdvancedMC
Tim Van De Walle takes you through the reasons you should implement your next multiprocessor-based system using AdvancedMC and AdvancedTCA.

NEW!! Useful design patterns for building embedded multicore systems
Anderson MacKay provides a brief tutorial on design patterns you may find useful as starting points for thinking about how to implement multicore in your embedded system.

NEW!! There's nothing new about multicore mania
The recent introduction of multicore architectures has been causing a surprising amount of uproar. But multiprocessing has been around for decades.

NEW!! Multicore systems-on-a-chip can handle embedded designs
Tailoring greater numbers of dedicated processors to discrete tasks and types of tasks provides advantages over traditional homogeneous concurrency when ensuring the simultaneous, reliable performance of both trivial and system-critical tasks.

Multicore puts the screws to parallel programming models
Leaders in mainstream computing are intensifying efforts to find a parallel programming model to feed the multicore processors already on chip makers' drawing boards.

Fast virtual platforms can open up multicore software development
Virtual platforms aid the short-term market takeoff of multicore architectures as well as their long-term acceptance.

Multicore software development: Fact and Fiction
David Kleidermacher sorts through the myths and realities of doing software development for multicore designs and grades the various standards that are emerging to address the challenges embedded designers face.

Is multicore hype or reality?
Multicore processors are here to stay, but memory is a bottleneck.

Achieving higher performance in a multicore-based packet processing design
Michael Coward guides you through the design trade-offs in selecting the memory subsystem architecture that gets the most performance out of a multicore-based packet processing engine.

Partitioning applications across multiple cores
The multicore processors used in today's networking equipment commonly target enterprise-level access routers, raising questions about partitioning applications in such ways as to most effectively take advantage of multi-core capabilities.

Making life easier for multicore SoC software developers
Putting multiple processors on a single chip or on a single board has enabled embedded systems hardware developers to provide more features and higher processing speeds using less power. But for software developers - and vendors - this trend presents a daunting set of challenges.

Making the right architectural and tool choices in your multicore design
An effective multicore design strategy requires the use of software tools that allow programmers to focus on the design elements themselves, not the precise details of exactly how they are implemented.

*Common multicore parallel programming problems and their solutions
In a four-part series, Shameem Akhter and Jason Roberts survey common multi-core programming problems and provide some insight into their solution.
Part 1: Threads, data races, deadlocks and livelocks
Part 2: Heavily contended locks
Part 3: Non-blocking algorithms, ping-ponging and memory reclamation
Part 4: Memory, cache issues and consistency

Multicore gives more bang for the buck
Each new process geometry and microarchitecture delivers successively less in terms of performance gains. It is simply no longer possible to deliver Moore's Law by going faster.

*Streaming video with "time slice" multicore-friendly processing eliminates dropped frames
David Workman of Kulabyte describes how to improve streaming video based on H.264, MPEG-4, MPEG-2, On2 and most other codecs by eliminating dropped frames and improving bandwidth efficiency.

*Revisiting heterogeneous versus homogeneous
Jeff Bier looks at the trade-offs in using homogeneous versus heterogeneous processing elements in multicore chips for DSP applications.

*BDTI benchmarks the picoChip PC102
BDTI has just released the first independent benchmark results comparing picoChip's massively parallel PC102 chip to high-end DSP processors and FPGAs.

*Multi-Core Processors: Driving the evolution of automotive electronics architectures
According to Infineon's Patrick Leteinturier, the use of multicore architectures will increase in automotive applications because of the growing need for increased performance with lower power consumption.

*Making the transition from sequential to implicit parallel programming
Despite Microsoft's view that a parallel programming model for multiprocessing is 5 to 10 years away, Rishiyur Nikhil and Arvind believe we can't wait and delve into the options available now. In this series of eight articles, they look at the alternatives: sequential versus parallel programming, procedural versus declarative and functional, explicit versus implicit.
Part 1: How sequential languages obscure parallelism
Part 2: How to achieve parallel execution
Part 3: Explicit parallel programming with threads and locks
Part 4: Explicit parallelism: message-passing programming
Part 5: Implicit parallel programming: Declarative languages
Part 6: So, why aren't we using functional languages yet?
Part 7: pH: an implicitly parallel, declarative language
Part 8: Turning parallel Haskell (pH) into a production language

*Embedded software stuck at C
Embedded software developers are slowly moving to multi-core architectures, but they lack the needed standards and will make the transition without much help from parallel programming languages, said a panel of experts at the conference.

* Demystifying multithreading and multi-core
Is multithreading better than multi-core? Is multi-core better than multithreading?

*Massively parallel processors for DSP, Part 2
In Part 2, BDTI looks at innovative new tools for massively parallel processors.

*Using OpenMP for programming parallel threads in multicore applications
In this four-part series, Intel's Shameem Akhter and Jason Roberts present the case for the OpenMP API as a way to code for highly parallel multicore and multithreaded designs.
Part 1: The challenges of threading a loop
Part 2: Managing shared and private data
Part 3: Performance-oriented programming
Part 4: The OpenMP library functions and how to use them

*Accelerate system performance with hybrid multiprocessing and FPGAs
Multiprocessing is becoming a key differentiator for FPGA-based processor architectures.

* Multicore microprocessors and embedded multicore SoCs have different needs
Steve Leibson of Tensilica assesses the tradeoffs between multicore microprocessors and embedded multicore SoCs and makes the case for dedicated tailored processors rather than general purpose architectures as the best alternative.

* Defining standard Debug Interface requirements for OCP-compliant multicore SoCs: Part 2
In the second in a two-part series, the OCP Debug working group describes the work being done to update the Open Core Protocol spec to reflect the needs of complex uniprocessor and multicore SoCs. This week: how the OCP multicore interface will be used.

* Defining standard Debug Interface requirements for OCP-compliant multicore SoCs: Part 1
In the first in a two-part series, the OCP working group describes work being done on the spec to reflect the needs of complex uniprocessor and multicore SoCs.

* Threading and parallel programming constructs used in multicore systems development: Part 3
In the final part in a three-part series, Intel's Shameem Akhter and Jason Roberts discuss condition variables, messages and flow control constructs, and how they can be used in a parallel programming environment.

* How to build a consistent mental model for reasoning about concurrency
It is necessary to build a consistent mental model for concurrency before implementing it in your software design.

* Python NetWorkSpaces and Parallel Programs
Python and NetWorkSpaces make it easy to create and experiment with parallel programs without requiring specialized tools or hardware.

* Threading and parallel programming constructs used in multicore systems development: Part 2
In the second in a three-part series, Intel's Shameem Akhter and Jason Roberts deal with parallel programming constructs using synchronization, critical sections and deadlock in implementing multithreading on multicore designs.

* Threading and parallel programming constructs used in multicore systems development: Part 1
In this three-part series, Intel's Shameem Akhter and Jason Roberts deal with parallel programming constructs using synchronization, critical sections and deadlock in implementing multithreading on multicore designs.
*Applying the fundamentals of parallel programming to multiprocessor designs, Part 1
In the first in a series, Shameem Akhter and Jason Roberts of Intel provide details on how to break up task threads in a parallel programming environment so that multiple operations can proceed simultaneously.
*Applying the fundamentals of parallel programming to multiprocessor designs, Part 2
Shameem Akhter and Jason Roberts of Intel provide details on how to break up task threads in a parallel programming environment so that multiple operations can proceed simultaneously.
*Optimizing Software for Multicore Processors
Multicore processors present the challenge of deciding how to validate and optimize code for performance gains.
*Understand packet-processing performance when employing multicore processors
Your cache configuration can have a bigger impact on overall multicore system performance than you expect.
* Multi-threaded debugging techniques
Several general principles can be applied to debugging multi-threaded software applications.
*Get multicore performance from one core.
An SoC with a multithreaded virtual multiprocessor might be just what you're looking for.
* Going multicore presents challenges and opportunities
Performance and power efficiency are key advantages, but they're also challenging as the number of cores increases.
Excerpts from Wayne Wolf's book High-Performance Embedded Computing. This five-part series describes the differences between running software on embedded multiprocessors versus general purpose systems and the precautions that must be taken.
Part 1, The role of the operating system
Part 2, Multiprocessor Scheduling
Part 3, Event-driven multiprocessor scheduling analysis
Part 4, What's different about multiprocessor software?
Part 5, Achieving multiprocessor quality of service?
Excerpts from the book Customizable Embedded Processors. In this four-part series entitled "Using sub-RISC processors in next generation customizable multi-core designs," the authors state the case for sub-RISC processing elements as the natural multicore SoC building block.
Part 1, Concurrent architectures, concurrent applications.
Part 2, Generating a multicore architecture from the instruction set.
Part 3, Deploying applications with Cairn.
Part 4, IPV4 Forwarding Design Example
* A three-part series "Techniques for debugging an asymmetric multicore application" from Intel's Julien Carreno.
Part 1, A typical asymmetric multi-core application
Part 2, Tools for debugging
Part 3, Typical multicore debugging problems
* A seven-part series entitled "The challenges of nextgen multicore networks-on-chip systems" based on Luca Benini and Giovanni De Micheli's book Networks On Chips.
Part 1, Why on-chip networking?
Part 2, SoC objectives and NoC needs
Part 3, Basic NoC approaches
Part 4, Programming issues and approaches
Part 5, Task-level parallel programming on multicore Networks-On-Chip
Part 6, Communications-exposed programming
Part 7, Tools for nextgen multicore networks-on-chips.
* Programming the Cell Processor
Our authors present algorithms and strategies they've used to make breadth-first searching on graphs as fast as possible on the Cell multicore processor.
* Demystifying ESL for embedded systems designs
While the definitions of ESL may vary, the end result should be the same, namely letting developers of multiprocessor and multicore systems analyze their designs at a higher level of abstraction.
* Taking the first step towards MPSoC design with network-on-chip methodologies
Marcello Coppola and Carlo Pistritto describe the details of ST Micro's network-on-chip interconnect topology and provide perspective on how NoCs will solve some of the troubling on-chip traffic jams.
* Designing low-power multiprocessor chips
Chip designers face the challenge of reducing the number of gates in a design and implementing efficient architectures to reduce die size and the total power consumption of a system.
* Parallel processing for multi-core DSPs 
Modern video-processing systems running multiple applications such as image processing, compression and content analysis force systems designers to use multiple DSP chips, FPGAs and a system controller, but chip-level software tools don't address the system integration issues.
* Designing custom embedded multicore processors
There are "multi" paths a designer can take to get the needed performance.
* Functional TLM simplifies heterogeneous multiprocessor software development
Virtual prototyping technology is emerging that allows the creation of a high-performance, functional software model of an embedded multicore system that fully mirrors the hardware functionality.
* Tips for effective usage of the shared cache in multi-core architectures
Tian Tian of Intel provides some guidelines on what to do and what to avoid when implementing shared cache in your multi-core based design.
* Using PCIe in a variety of multiprocessor system configurations
Spanning the range from a uniprocessor I/O interconnect for desktop systems to a backplane fabric supporting multiple processors for communications, PCIe is the only serial interconnect needed for inside-the-box designs.
* The software industry needs to adapt--and soon--to multicore chips
The big question is how -- and how soon -- the software industry will step up and produce applications that can take advantage of multiple cores.
* Multicore faces a long road
This may go down as the year the electronics industry woke up to the full breadth and significance of the trend to multicore processors.
* Why Multiprocessor Systems Need CORBA
CORBA enables software components in a multiprocessor system to easily communicate, regardless of what language they are written in, what OS they run on, or where they are located. Even better, CORBA makes it easy to move functionality between DSPs, GPPs, and FPGAs.

*Achieving multicore performance in a single core SoC design using a multi-threaded virtual multiprocessor: Part 2
*Achieving multicore performance in a single core SoC using a multi-threaded virtual multiprocessor: Part 1
*Embedded multicore needs communications standards
*Development and Optimization Techniques for Multicore Processors
*Effective use of RTOS programming concepts for advanced multithreaded architectures
*Leveraging multi-core processors with graphical system design tools
*Multicore solutions proliferating
*Multicore: Sell it simple
*Needed: clear thinking about multithreading and multi-core
*Using a multicore RTOS for DSP applications
*Programming heterogeneous multiprocessors
*When GHz don't add up
*Design and verification strategies for complex systems
*Providing more JTAG debug visibility into multicore System on Chip MCUs
*Why not outsource the interconnect?
*Programming the Cell Broadband Engine
*A glimpse inside the Cell processor
*Taming the Hydra
*Tackling memory allocation in multicore and multithreaded applications
*21st century multiprocessor design: Part 1
*Tutorial: How to analyze your multiprocessing options: Part 2 - Best Practice
*Low-power, dual-ports for inter-processor communications in next-generation handsets
*Tutorial: How to analyze your multiprocessing options: Part 1
*Multithreaded Programming Quickstart
*Convergent processors solve development challenges
*The Eclipse Device Software Development Platform Target Management
*Software performance considerations when using cache
*SoC processor is set for the big picture
*Advanced Processor Features and Why You Should Care Part 2
*Designing with an embedded soft-core processor
*Advanced Processor Features and Why You Should Care Part 1
*Making the Most of Multi-Core Processors Part 2
*Common programming models for use on a dual-core processor
*Software Design Issues for Multi-core/Multiprocessor Systems
*Making the Most of Multi-Core Processors: Part 1
*Debugging real-time multiprocessor systems: Part 2
*Debugging real-time multiprocessor systems: Part 1
*LINX: an open source IPC for distributed multicore embedded designs
*Applying distributed system concepts to embedded multiprocessor designs Part 3
*Using softcore-based FPGAs to balance hardware-software needs in a multicore design
*Putting Multicore Processing in Context Part 2
*Virtual system prototypes ease embedded multicore design
*Simulating and debugging multicore behavior
*Applying distributed system concepts to embedded multiprocessor designs: Part 2
*Applying distributed system concepts to embedded multiprocessor designs: Part 1
*Dealing with the design challenges of multicore embedded systems
*Using dual port interconnect to resolve multiprocessor system bottlenecks
*Putting multicore processing in context: Part One

*Using software synthesis for multiprocessor OS and software development
*Developing DSP code on converged hybrid DSP/RISC cores
*How to make your asymmetric multiprocessor design OS and CPU independent
*How to adapt traditional RTOSes to symmetric multiprocessing
*Thread versus task management in a dual mode DSP/RISC RTOS environment
*Simplify your multiprocessor-based network design with multicore FPGAs
*What Amdahl's Law can tell us about multicores and multiprocessing
*Using an asymmetric multiprocessor model to build hybrid multicore designs
*Using system services for real time embedded multimedia applications
*Choosing the right multiprocessor development tools
* Use virtual prototypes to model multiprocessor system power needs
*Designing supersystems-on-chip (SSoC)
*Extreme partitioning
*Multiprocessor design for SoCs
*Getting the most from multiprocessor SoC design
*Subtract software costs by adding CPUs

News and feature stories

NEW!!! Multi-cores, software's Gordian Knot and the Alexandrian solution
NEW!!! Group drafts virtual machine standard
* Microsoft: Parallel programming model ten years off
NEW!!! Multicores, software tools, languages and the Charge of the Light Brigade
Intel readies papers on programmable multicore architectures
AMD unveils spec to ease pain of multicore software development
AMD will extend X86 for parallelism

*Cell Processor makes computing more connected
*Taming the Hydra
*Rivals may vie at Multicore Expo
*Enea's Inter Process Communications protocol now open source
*Making life easier for multicore SoC software developers
*The state-of-play in multi-processor and reconfigurable computing
*It's data-centric versus data flow CPUs
*CPU makers turn to parallelism to deal with data flows
*The end of Moore's law
*New microarchitectures, from the ground up
*Software limits multi-core ICs, panelists say
*Back to the drawing board on multicores, multiprocessing

Papers on the Web
, et al., "A Single Chip, 1.6 Billion, 16-b MAC/s Multiprocessor DSP", IEEE Journal of Solid State Circuits, Vol. 35, No. 3, 2000, pp. 412-424.

M. Dall'Osso, G. Biccari, L. Giovannini, D. Bertozzi and L. Benini, "Xpipes: A Latency Insensitive Parameterized Network-on-Chip Architecture for Multiprocessor SoCs", International Conference on Computer Design, 2003, pp. 536-539.

Diaz-Nava and A. A. Jerraya, "Multiprocessor SoC Platforms: A Component-Based Design Approach," IEEE Design and Test of Computers, Vol. 19, No. 6, November - December 2002, pp. 52 - 63.

F. Gilbert, M. Thul and N. Wehn, "Communication Centric Architectures for Turbo-decoding on Embedded Multiprocessors", Design and Test in Europe Conference 2003, pp. 356-351.

P. Kongetira, K. Aingaran and K. Olukotun, "Niagara: a 32-way Multithreaded Sparc Processor," IEEE Micro, Vol. 25, No. 2, 2005, pp. 21-29.

M. Loghi, F. Angiolini, D. Bertozzi, L. Benini and R. Zafalon, "Analyzing On-Chip Communication in a MPSoC Environment," Design and Test in Europe Conference (DATE) 2004, pp. 752-757.

Riad Ben Mouhoub and Omar Hammami, "MOCDEX: Multiprocessor on Chip Multiobjective Design Space Exploration with Direct Execution," EURASIP Journal on Embedded Systems, Volume 2006 (2006).

D. Pham, et al., "Overview of the Architecture, Circuit Design, and Physical Implementation of a First-Generation Cell Processor," IEEE Journal of Solid-State Circuits, Vol. 41, No. 1, January 2006, pp. 179 - 196.

M. Rutten, et al., "Eclipse: Heterogeneous Multiprocessor Architecture for Flexible Media Processing," Parallel and Distributed Processing Symposium, Proceedings International, IPDPS 2002, Abstracts and CD-ROM, 2002, pp. 39 - 50.

Herb Sutter, "A Fundamental Turn Towards Concurrency in Software," Dr. Dobb's Journal.

Herb Sutter and James Larus, "Software and the Concurrency Revolution," ACM Queue.

Herb Sutter, "The Trouble with Locks," Dr. Dobb's Journal.

"Confronting Parallelism: The View From Berkeley," Q&A with David Patterson and John Shalf, HPCWire.

Industry resources

Multicore Association
Further information about upcoming activities in the industry relating to multicore design.

Multicore Programming Fundamentals
The National Instruments Multicore Programming Fundamentals Series is a collection of technical content and white papers on using the company's graphical programming methodology to architect code and leverage real-time symmetric multiprocessing with multicore systems.

The Landscape of Parallel Computing Research: A View From Berkeley
Blogs and wikis from U.C. Berkeley's Electrical Engineering and Computer Sciences department, College of Engineering.

European Network of Excellence on High-Performance Embedded Architecture and Compilation.

Manycore Computing Workshop, 2007
A workshop held in conjunction with the ACM International Conference on Supercomputing with presentations on multiprocessor and multicore design as well as parallel program development.
