Programming Multi-core Architectures Using Data-Flow Techniques

Over the last five decades, high performance in microprocessor design has been achieved by relying on improvements in fabrication technologies and on architectural and organizational optimizations.

However, the most severe limitation of the sequential model, namely its inability to tolerate long latencies, has slowed these performance gains and led the industry to hit the Memory Wall.

As a result of this and other factors, such as the Power Wall and the diminishing returns of Instruction Level Parallelism, the entire industry had to switch to multiple cores per chip.

This ushered in the Concurrency Era, as it soon became evident that traditional programming models did not allow efficient utilization of the large number of resources now available on a single chip.

The Data-Flow model is an alternative model that handles concurrency and tolerates memory and synchronization latencies efficiently. Moreover, the side-effect-free semantics of Data-Flow expose the maximum potential parallelism in programs by enforcing the minimum amount of ordering on execution (i.e., only true data dependencies are enforced).
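To make the point about true data dependencies concrete, here is a minimal Python sketch. It is not taken from the paper: the function name `dataflow_waves` and the four-operation example program are hypothetical. It groups operations into "waves", where every operation in a wave depends only on results from earlier waves and could therefore execute concurrently.

```python
def dataflow_waves(deps):
    """Group operations into waves using only true data dependencies.

    deps maps each operation to the operations whose results it consumes.
    Program inputs are assumed to be available from the start.
    """
    remaining = {op: set(inputs) for op, inputs in deps.items()}
    done, waves = set(), []
    while remaining:
        # An operation may fire as soon as all of its producers have finished.
        ready = sorted(op for op, inputs in remaining.items() if inputs <= done)
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        waves.append(ready)
        done.update(ready)
        for op in ready:
            del remaining[op]
    return waves


# Hypothetical program: a = x + y; b = x * y; c = a - b; d = x - 1
deps = {"a": [], "b": [], "c": ["a", "b"], "d": []}
waves = dataflow_waves(deps)
print(waves)
```

In sequential program order, `d` would wait behind `c` even though it needs nothing from it. Under the data-flow rule, `a`, `b`, and `d` land in the same wave and only `c` has to wait.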

In this work we present a programming methodology based on the Data-Driven Multithreading (DDM) model of execution that combines Dynamic Data-Flow concurrency with efficient sequential execution.

In this paper we present a multithreaded programming methodology for multi-core systems that utilizes Data-Flow concurrency. The programmer augments the program with macros that define threads and their data dependencies.

The macros are expanded into calls to the run-time system, which creates and maintains the dependency graph of the threads and schedules them using Data-Flow principles.
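The following Python sketch illustrates the general idea of such a run-time under stated assumptions. It is not the DDM macro set or run-time API described in the paper: the class `DataDrivenRuntime`, its `thread` and `run` methods, and the example thread names are all hypothetical. Each thread carries a count of unfinished producers, and a thread is scheduled only when that count reaches zero. For simplicity the sketch runs ready threads one at a time on a single worker, whereas a real multi-core run-time would hand them to several cores.

```python
from collections import deque


class DataDrivenRuntime:
    """Toy data-driven scheduler (hypothetical, single worker)."""

    def __init__(self):
        self.body = {}         # thread id -> callable to execute
        self.consumers = {}    # thread id -> ids of threads that depend on it
        self.ready_count = {}  # thread id -> number of producers it waits for

    def thread(self, tid, body, depends_on=()):
        """Register a thread and its data dependencies (the role a macro might play)."""
        self.body[tid] = body
        self.consumers.setdefault(tid, [])
        self.ready_count[tid] = len(depends_on)
        for producer in depends_on:
            self.consumers.setdefault(producer, []).append(tid)

    def run(self):
        """Execute threads in data-driven order and return that order."""
        pending = dict(self.ready_count)
        ready = deque(t for t, n in pending.items() if n == 0)
        order = []
        while ready:
            tid = ready.popleft()
            self.body[tid]()
            order.append(tid)
            # Completing a producer decrements the count of each consumer.
            for consumer in self.consumers[tid]:
                pending[consumer] -= 1
                if pending[consumer] == 0:
                    ready.append(consumer)
        return order


data = {}
rt = DataDrivenRuntime()
rt.thread("load_a", lambda: data.update(a=2))
rt.thread("load_b", lambda: data.update(b=3))
rt.thread("mul", lambda: data.update(c=data["a"] * data["b"]),
          depends_on=("load_a", "load_b"))
rt.thread("report", lambda: data.update(msg=f"c={data['c']}"),
          depends_on=("mul",))
order = rt.run()
print(order, data["msg"])
```

The thread bodies themselves are ordinary sequential code. Only the decision of when a thread may start is driven by data availability.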

We demonstrate the programming methodology and discuss some of the issues and optimizations that affect performance. A detailed evaluation is presented using two applications as case studies.

The evaluation shows that the two applications scale well and compare favorably with the results of similar systems. Our results demonstrate that Data-Flow concurrency can be efficiently implemented as a Virtual Machine on multi-core systems.

The paper is available for download from the author archives.
