Multicore technology training for embedded systems designers

Multicore is an inevitable technology, so we should get used to seeing it around. It already garners significant attention, especially among engineers deciding which multicore platform to use or trying to get up to speed on the best techniques to apply when building their applications on a multicore platform.

At the Multicore Expo this year, we've diligently assembled 60 highly experienced multicore aficionados to teach sessions within 9 tracks focusing on the key areas of embedded multicore technology. Following are a few examples of the top-notch sessions in the program, demonstrating the intricate design techniques that you can apply when developing with multicore technology.

Applying Parallelization and Load-Balancing
Maximizing parallelization within an application has become a key goal and challenge for multicore programmers. Programmers can obtain increased concurrency by using multi-threading techniques.

In the session “Parallel Programming Possibilities with the Intel Atom Family (ME745)”, Hillar Consulting's Gaston Hillar will explain the parallel optimizations that can be made on Intel's Atom N270 and Atom 330, which with Hyper-Threading expose two and four logical processors respectively, taking advantage of Hyper-Threading, parallel programming techniques, and SIMD instructions.

You will learn from Hillar's presentation that no matter how much parallelism you extract from your application, other factors such as the processor and operating system play a key role in obtaining the ultimate performance.

Furthermore, in the session “Performance Considerations for Fine-Grained Parallel Applications on Multicore Architectures (ME710)”, Synopsys' Kalyan Saladi explains how the differences among the multicore architectures available today affect the synchronization overhead associated with fine-grained parallel applications.

These architectural differences are related to a wide variety of design features such as memory interconnects, memory layout, cache hierarchy, and CPU affinity schemes. Saladi explains the performance variations on multicore systems in the context of barrier synchronization techniques and data sharing among threads.
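
To make the term concrete, here is a minimal sketch of barrier synchronization with data shared among threads. It is written in Python purely for readability and is not taken from Saladi's session; the function name `phased_sum` and the per-thread arithmetic are hypothetical. Each thread writes to its own slot, then all threads meet at a barrier before one of them reads the shared array. On a real multicore target, the cost of each `wait()` is where the interconnect, cache hierarchy, and affinity choices would show up.

```python
import threading


def phased_sum(num_threads, phases):
    """Run `phases` rounds; in each round every thread writes its own slot,
    then thread 0 sums all slots once everyone has reached the barrier."""
    partial = [0] * num_threads          # one slot per thread (shared data)
    totals = []                          # written only by thread 0
    barrier = threading.Barrier(num_threads)

    def worker(tid):
        for phase in range(phases):
            partial[tid] = (tid + 1) * (phase + 1)  # hypothetical workload
            barrier.wait()               # all writes for this phase are done
            if tid == 0:
                totals.append(sum(partial))
            barrier.wait()               # the sum is taken before slots are reused

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals


print(phased_sum(4, 3))  # [10, 20, 30]
```

Note that every phase pays for two barrier crossings. The finer the grain of work between barriers, the larger the share of time spent synchronizing.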

Besides the architectural features associated with a specific multicore processor, load balancing is also a key performance consideration to ensure proper utilization of the resources.

For the most part, it has been left to the application programmer to solve the actual problem of how to partition, distribute, and schedule the workload over available cores.

The session presented by Enea's Daniel Forsgren, “Exploring Dynamic Load Balancing for Multicore AMP Systems (ME725)”, explores potential solutions for load balancing on asymmetrical multiprocessing (AMP) systems and illustrates the points with DSP baseband applications that can benefit from dynamic load balancing.
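
The difference between a fixed partition and dynamic balancing can be shown with a small simulation. This is a sketch under stated assumptions, not Forsgren's method: task costs are known numbers, cores are identical, and the function names are hypothetical. The static version assigns task i to core i modulo the core count. The dynamic version hands each task, in order, to whichever core becomes free first.

```python
import heapq


def static_makespan(costs, cores):
    """Round-robin partition decided up front: task i runs on core i % cores."""
    load = [0] * cores
    for i, cost in enumerate(costs):
        load[i % cores] += cost
    return max(load)                     # finish time of the busiest core


def dynamic_makespan(costs, cores):
    """Greedy run-time dispatch: the next task goes to the earliest-free core."""
    free_at = [0] * cores                # min-heap of times when cores become idle
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)   # earliest available core
        heapq.heappush(free_at, start + cost)
    return max(free_at)


costs = [8, 1] * 4                       # alternating heavy and light tasks
print(static_makespan(costs, 2))         # 32: one core receives every heavy task
print(dynamic_makespan(costs, 2))        # 18: the work is split evenly
```

The alternating workload is chosen to be a bad case for the static split. With uniform task costs the two schemes would finish at the same time.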

Inter-Process and Inter-Core Communication
In addition to load balancing, another important aspect of an AMP system is the integration of two or more distinct operating systems. Mentor Graphics' Stephen Olsen presents “Challenges of a Multi-OS Implementation on a Multicore System (ME793)” and explains the many inherent challenges, such as resource management, scalability, and synchronization.

As the complexity of running multiple operating systems increases, the use of a unified inter-process communication (IPC) mechanism becomes critical. But IPC is also very applicable to other types of inter-core communication, whether on homogeneous or heterogeneous multicore.

The session “Methodology and Application Migration Flow for Multicore (ME728)”, delivered by Sven Brehmer of PolyCore Software, explains how to use a standardized IPC methodology to migrate applications from single core to multicore, and between different types of multicore platforms.

Brehmer's session also covers the application analysis and strategy for distributing an application across multiple cores to take advantage of functional pipelining and concurrency.

As a precursor to assigning tasks to specific cores, you should use a platform-agnostic, block-diagram methodology to gain more flexibility and scalability in your application. This methodology further stresses the value of the communication design, which in turn becomes an increasingly important performance factor as the number of cores in a system increases.
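
A block-diagram view of functional pipelining might be sketched as follows. This is an illustration only, not PolyCore's methodology or API: the names `stage`, `run_pipeline`, and `_DONE` are hypothetical, and Python queues stand in for whatever IPC channel the platform provides. Each block is a function, each connection is a queue, and nothing in the block code says which core it runs on.

```python
import queue
import threading

_DONE = object()                         # sentinel marking the end of the stream


def stage(func, inbox, outbox):
    """One pipeline block: read from inbox, apply func, write to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)            # pass the shutdown signal downstream
            return
        outbox.put(func(item))


def run_pipeline(items, funcs):
    """Connect the functions in `funcs` into a chain, one thread per block."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [threading.Thread(target=stage,
                                args=(f, queues[i], queues[i + 1]))
               for i, f in enumerate(funcs)]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


print(run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 2]))  # [4, 6, 8]
```

Because the blocks communicate only through their queues, remapping a block to another core or another kind of channel would change `run_pipeline`, not the blocks themselves.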

Interestingly, this can make the design task comparable to the approach used to design hardware systems: netlists, synthesis, and simulation. Enno Wein of ProximusDA discusses this design approach in his session entitled “Why Multicore Software Development is Similar to Hardware Design (ME808)”, elaborating on how to use transaction-level modeling (TLM) because it cleanly separates communication from the functions; even the “platform” can be decomposed and handled as an independent “dimension.”

As we've discussed, the software design aspect of inter-core communication has an important bearing on the performance and scalability of your system. However, the hardware aspect of inter-core communication is equally crucial to system performance.

High performance requirements, quality of service needs, and physical design constraints make the integration of an increasing number of heterogeneous IP cores in a SoC a formidable challenge.

In the session “SoC Integration of Heterogeneous IP using Network-on-Chips (ME807)”, Arteris' Geert Rosseel describes how NoCs based on packetized data transfers and flexible topology can address the current challenges in multicore integration and improve overall system performance.

Circling back to the block-level approach for software design, another of its benefits is that it can help reduce the number of dependencies in a program. Race conditions are one by-product of data dependencies, although they are more common in systems based on symmetrical multiprocessing (SMP) architectures.

You might need to significantly re-architect your application to avoid these conditions, focusing on data flow and identifying critical code regions. The session “Controlling Multicore Race Conditions in Linux (ME701)”, presented by Mike Anderson of The PTR Group, elaborates on the problems and solutions for thread contention, priority preemption, and the ensuing race conditions that are prevalent when migrating uniprocessor code to multicore systems.
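
As a minimal sketch of what protecting a critical code region means, consider a shared counter. This is a generic illustration, not material from Anderson's session; `SafeCounter` and `hammer` are hypothetical names. The increment is a read-modify-write sequence, so two threads that interleave inside it can lose an update. Wrapping the sequence in a lock makes it one indivisible step.

```python
import threading


class SafeCounter:
    """Counter whose read-modify-write is guarded by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:                 # critical region: one thread at a time
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, num_threads, increments):
    """Have several threads increment the same counter concurrently."""
    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(hammer(SafeCounter(), 4, 10000))   # 40000
```

The lock serializes every increment, so correctness is bought with contention. Keeping critical regions short, or giving each thread private data that is merged at the end, reduces that cost.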

Continuing with the analysis of thread contention, Boeing's Tom Dickens presents “Thread Safety is Primary Concern on Multicore, Not Performance -or- Sure It's Fast, But Is It Correct? (ME699)”. Dickens' session details issues involved with thread safety on multicore systems, showing real-world examples of unsafe code (don't expect to see this code running in Boeing's Air Force One). This session also shows various thread-safe fixes along with a detailed summary that considers performance consequences of thread-safe techniques.

Resolving Multicore Debug Challenges
Assuming that you have inadvertently introduced thread safety issues or other defects into your multicore system, finding the resultant bugs is another area that requires close scrutiny. Ericsson's Dominique Toupin points out that many problems surface only when the complete hardware and software components are interacting under real loads.

In Toupin's session “Multicore Debug/Trace: Recent Improvements and the Way Forward (ME689)”, he describes improvements such as low-overhead user space and kernel tracing with zero copy; tracing that is safe in signal, thread, and NMI contexts; wait-free read-copy update; cycle-level timestamps; and non-blocking atomic operations.

Samsung Electronics' Tasneem Brutch delivers another interesting debugging session entitled “Testing and Debugging Concurrent Programs: A Case Study (ME752)”. Her session describes the two-phase approach of active analysis tools that are used to complement multi-threaded parallel software debugging.

In the first phase, static or dynamic code analysis identifies concurrency-related issues, including atomicity violations, data races, and deadlocks. In the second phase, the output of the first is provided as input to the underlying scheduler to control thread scheduling and to weed out false positives that may have been reported during the first phase.
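
One well-known first-phase style check for deadlocks is lock-order analysis, sketched below. This is a generic technique shown under assumptions, not the tooling from Brutch's case study: the trace format and the function names are hypothetical. The analysis records an edge A to B whenever a thread acquires lock B while holding lock A. A cycle in the resulting graph means some interleaving could deadlock.

```python
def lock_order_edges(traces):
    """Build (held, acquired) pairs from per-thread traces.

    Each trace is a list of ("acquire" | "release", lock_name) events.
    """
    edges = set()
    for trace in traces:
        held = []
        for op, lock in trace:
            if op == "acquire":
                for h in held:
                    edges.add((h, lock))  # lock taken while h is still held
                held.append(lock)
            else:
                held.remove(lock)
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the lock-order graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {node: "new" for node in graph}

    def visit(node):
        state[node] = "active"           # node is on the current DFS path
        for nxt in graph[node]:
            if state[nxt] == "active":
                return True              # back edge: a cycle exists
            if state[nxt] == "new" and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(state[n] == "new" and visit(n) for n in graph)


t1 = [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")]
t2 = [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")]
print(has_cycle(lock_order_edges([t1, t2])))  # True: A then B versus B then A
print(has_cycle(lock_order_edges([t1, t1])))  # False: a consistent order
```

A reported cycle is only a potential deadlock, since the two threads may never actually overlap. That is the kind of false positive a scheduler-controlled second phase would confirm or discard.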

Visions of Virtualization
Virtualization and hypervisors are other important topics extensively represented at the Multicore Expo. Although there are many commercially available hypervisors, few can fulfill the real-time requirements of embedded applications.

Hypervisor integrators face many limitations in terms of size, hardware support, guest operating system support, scalability, communication, and performance overhead. Furthermore, all solutions are proprietary, challenging system designers with complex portability issues if there is a need to move from one virtualization solution to another.

Alex Bachmutsky of Nokia Siemens Networks provides an analysis of the majority of embedded, soft real-time and hard real-time hypervisors and their applicability for telecommunication applications. Bachmutsky's session, entitled “Virtualization for Network-Based Multicore Telecommunication Systems (ME686)”, also describes a benchmark approach from EEMBC that system developers can use to analyze the overhead and latency associated with hypervisors.

To supplement the information provided in Bachmutsky's session, Cavium Networks' Rajan Goyal covers such topics as para-virtualization vs. full virtualization, SoC virtualization, and hardware-assist or co-processor virtualization.

Goyal's session, entitled “Advanced Virtualization Techniques for Multicore SoCs (ME786)”, also explains how virtualization of multicore systems can provide isolation and protection between different domains, an increasingly important area for security in embedded systems.

Designing with multicore encompasses a wide variety of technologies and techniques. The example sessions described in this article represent a small portion of the material that presenters will deliver during the Multicore Expo. These sessions will provide you with the background to make critical design decisions while you are implementing your multicore-based design.

(Markus Levy is chairman of the Multicore Expo.)
