During the last few years, computer hardware manufacturers have moved decisively to multi-core chips. Instead of increasing clock frequency and boosting Instruction Level Parallelism (ILP) for a single chip, processor engineers are adding more and more cores on a single chip in order to increase performance per watt.
This trend has also reached embedded systems. In fact, the increasing processing demands of embedded applications, together with the need to reduce power consumption, have opened the way to the wide adoption of high-density multi-core architectures in embedded systems as well.
Multi-core embedded architectures have shown that a good balance between high performance and low power consumption is achievable, but most applications do not fully exploit the potential of these architectures.
This situation is due in part to the limited availability of parallel software, compiler technology and development tools, and in part to the fact that multi-core programming is still perceived as a niche of the high-performance-computing area, and hence reserved for highly specialized software engineers.
The scarcity of good high-level programming tools and environments forces application developers to rewrite sequential programs as parallel software, taking into account all the low-level features and peculiarities of the underlying platforms.
In this paper, we propose a compilation tool-chain, devised within the SMECY project, that supports the effective exploitation of multi-core architectures.
The tool-chain leverages both the application requirements and the platform-specific features to provide developers with a powerful parallel programming environment able to generate efficient parallel code without significant effort from the programmer. The proposed tool-chain is well suited to applications that process continuous streams of data, such as those in digital signal processing.
The programmer graphically decomposes the application for the target platform using Thales' Spear Design Environment (SpearDE), which offers a graphical interface for describing computational kernels interacting in a dataflow application.
Parallelism is explicitly exposed to the programmer, who has to decide on a suitable decomposition of the application into modules as well as a mapping of each module onto elements of a graphical model of the target platform. In this way, the programmer is able to pass to the back-end tools all the information needed to produce target code that is efficient in terms of performance and power consumption. The resulting parallelization is generated in the form of an Intermediate Representation (IR).
This design choice enhances tool-chain portability by cleanly decoupling the front-end and back-end tools. In the proposed tool-chain, the back-end tool is HPC Project's Par4All. Starting from the IR generated by SpearDE, Par4All is able to generate target code that is efficient in terms of both performance and power consumption for a set of multi- and many-core platforms.
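To make the decomposition-and-mapping idea more concrete, the following is a minimal Python sketch of a dataflow application whose kernels are placed on platform elements and then written out as a toy textual IR. It is an illustration only: it does not reproduce SpearDE's graphical model, its actual IR format, or anything Par4All consumes. All names here (`Kernel`, `Mapping`, `emit_ir`, the kernel names, and the processing elements `cpu0`, `dsp0`, `dsp1`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Kernel:
    """A computational kernel (module) of the dataflow application (hypothetical model)."""
    name: str
    inputs: list = field(default_factory=list)  # names of upstream kernels


@dataclass
class Mapping:
    """Programmer-chosen placement of each kernel on a platform element."""
    placement: dict  # kernel name -> processing element name


def topological_order(kernels):
    """Order kernels so each one appears after every kernel it reads from."""
    by_name = {k.name: k for k in kernels}
    ordered, visiting, done = [], set(), set()

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"cycle detected at kernel {name!r}")
        visiting.add(name)
        for upstream in by_name[name].inputs:
            visit(upstream)
        visiting.discard(name)
        done.add(name)
        ordered.append(by_name[name])

    for k in kernels:
        visit(k.name)
    return ordered


def emit_ir(kernels, mapping):
    """Produce a toy textual IR: one line per kernel, annotated with its target."""
    lines = []
    for k in topological_order(kernels):
        target = mapping.placement[k.name]
        sources = ", ".join(k.inputs) if k.inputs else "<stream input>"
        lines.append(f"task {k.name} on {target} reads [{sources}]")
    return lines


# A small signal-processing pipeline, declared out of order on purpose.
kernels = [
    Kernel("fir_filter", inputs=["sampler"]),
    Kernel("sampler"),
    Kernel("fft", inputs=["fir_filter"]),
]
mapping = Mapping({"sampler": "cpu0", "fir_filter": "dsp0", "fft": "dsp1"})

ir = emit_ir(kernels, mapping)
for line in ir:
    print(line)
```

The point of the sketch is the separation of concerns: the description of kernels and their placement is independent of whatever later turns the IR lines into target code, which is the kind of front-end/back-end decoupling described above.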
Co-authors of this paper are M. Amini, S. Guelton, R. Keryell, V. Lanore, and F.X. Pasquier, HPC Project, Paris, France; M. Barreteau, R. Barrere, T. Petrisor, E. Lenormand, C. Cantini, of Thales Research and ; C. Cantini and F. De Stefani, SELEX Sistemi Integrati.
To read this external content in full, download the paper from the author archives online at the University of Pisa.