Have you turned on your Parallella yet? - Embedded.com

While at ESC Boston for an excellent conference, I had the pleasure of a few beers and dinner with Andreas Olofsson, the driving force behind the Epiphany devices and the Parallella board. The Parallella has done well since its successful Kickstarter campaign, but we did get around to wondering how many people had actually turned theirs on.

I received mine about six months ago and have been experimenting with the board ever since. So far I have learned the software development environment, how to control a single core, and finally how to use multiple cores in work groups. You can read about my exploits here.

Getting the Parallella up and running may seem complicated because you need to write two applications: one for the host and one to be offloaded onto the Epiphany. In reality, writing your first program is pretty simple because the SDK provides both host and target libraries and APIs. In fact, you can even develop your programs on the Parallella itself.

I am currently using my Parallella to demonstrate concepts for a paper I am writing on how multicore coprocessors like the Epiphany can be used to enhance high-reliability applications. So let me give a brief overview of my thoughts on the matter to get the discussion flowing.

Within the world of high-reliability software development, it is common to adopt one of three mitigation strategies to guard against errors when redundant hardware is not available:

  1. Repeat the same instruction three times and take a majority vote on the three results. Obviously this has a significant impact on the performance of the system.
  2. Repeat the function multiple times, compare the results returned by each call, and again take a majority vote. While the overhead of the majority vote is not as significant as for the first option, there is still a performance impact.
  3. Repeat the application multiple times, compare the outputs of each run, and again take a majority vote on the output. This can be implemented with a hypervisor if required.
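
As a rough illustration of the second option, function-level repetition with a vote might be sketched in C as follows. The names `majority3`, `tmr_call`, `tmr_fn`, and `example_scale` are invented for this sketch and are not part of any SDK; a real implementation would also need to consider how the compiler treats the repeated calls.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise 2-of-3 majority: every output bit takes the value held by at
 * least two of the inputs, so a bit flipped in one copy is outvoted.
 * If all three inputs differ, the result may match none of them. */
uint32_t majority3(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/* Hypothetical signature for the function being protected. */
typedef uint32_t (*tmr_fn)(uint32_t input);

/* Call fn three times on the same input and vote on the results.
 * If disagreed is not NULL, it reports whether any run differed.
 * Note: production code must also make sure the compiler does not
 * merge the three calls into one. */
uint32_t tmr_call(tmr_fn fn, uint32_t input, bool *disagreed)
{
    uint32_t r0 = fn(input);
    uint32_t r1 = fn(input);
    uint32_t r2 = fn(input);

    if (disagreed != NULL) {
        *disagreed = (r0 != r1) || (r1 != r2);
    }
    return majority3(r0, r1, r2);
}

/* Illustrative stand-in for the real computation. */
uint32_t example_scale(uint32_t x)
{
    return 3u * x + 1u;
}
```

The three sequential calls are where the performance cost of this option comes from: the work is done three times on one processor before the vote can even start.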

Whichever mitigation approach is chosen, performance suffers significantly. Traditionally, the best way to recover that performance is to introduce redundant systems and vote on their outputs, which obviously has a significant impact on cost, weight, power, and space.

Here's where the Epiphany offers a number of advantages. We can create a work group of four processors. Within this work group, three of the processors can execute the same program, storing the results within either the flat memory space or the shared memory (which can be ECC protected). We can then use the fourth processor to perform a majority vote and output the final result. If we want to be very paranoid, we can even use a fifth processor to monitor the health of the other four to provide further integrity. The smallest Epiphany device lets us implement this architecture three times; in the largest, we can do this twelve times.
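
To make the voter's job concrete, here is a minimal host-runnable model in C of what the fourth processor might do. It deliberately uses no Epiphany SDK calls: `group_mailbox`, `vote_outcome`, `voter_core`, and `groups_per_device` are hypothetical names, and the mailbox simply stands in for wherever the three workers store their results. The group-count helper assumes the 16-core and 64-core Epiphany parts and five processors per group (three workers, a voter, and a monitor).

```c
#include <stdint.h>

#define WORKERS 3

enum {
    VOTE_ALL_AGREE   = -1,  /* all three workers returned the same value */
    VOTE_NO_MAJORITY = -2   /* all three differ, nothing can be trusted  */
};

/* Hypothetical mailbox that the three worker processors write into. */
typedef struct {
    uint32_t result[WORKERS];
} group_mailbox;

typedef struct {
    uint32_t value;   /* voted result (0 when there is no majority)       */
    int      faulty;  /* index of the outvoted worker, or a VOTE_* code   */
} vote_outcome;

/* Work done by the voting processor: pick the value that at least two
 * workers agree on and report which worker, if any, was outvoted so a
 * monitoring processor could track its health. */
vote_outcome voter_core(const group_mailbox *mb)
{
    uint32_t a = mb->result[0];
    uint32_t b = mb->result[1];
    uint32_t c = mb->result[2];
    vote_outcome out;

    if (a == b && b == c) {
        out.value = a; out.faulty = VOTE_ALL_AGREE;
    } else if (a == b) {
        out.value = a; out.faulty = 2;
    } else if (a == c) {
        out.value = a; out.faulty = 1;
    } else if (b == c) {
        out.value = b; out.faulty = 0;
    } else {
        out.value = 0; out.faulty = VOTE_NO_MAJORITY;
    }
    return out;
}

/* How many complete groups fit on a device with the given core count. */
unsigned groups_per_device(unsigned cores, unsigned cores_per_group)
{
    return cores_per_group ? cores / cores_per_group : 0u;
}
```

Under those assumptions, `groups_per_device(16, 5)` gives 3 and `groups_per_device(64, 5)` gives 12. Unlike the purely software approaches, the three copies here run on separate processors at the same time, so only the vote itself adds latency.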

Of course, implementing such structures does not address single-point failures outside the processor domain, and there are other considerations that need to be addressed. Even so, it does offer better performance than traditional software mitigation methods. The paper I am currently writing will of course have more detail and analysis. I just need to find time to complete it.

So have you turned on your Parallella yet? If so, what are you doing with it?

Adam Taylor is Chief Engineer (Electrical Systems) at E2V. He was previously at Europe's leading space company, Astrium, where he had a dual role as Head of Electronic Design and as a Responsible Engineer leading product development. In this role he led the development of the latest generation of space-based telecommunications processors based around the Virtex-5QV and deep submicron ASIC technology. He has spent the last 13 years developing both hardware and FPGA solutions for telecommunications, cryptographic, radar, safety-critical, and thermal imaging systems, among others. Having worked with reliable design techniques for many years, he is formalizing his experience and knowledge in a book that he is currently writing. He is a Chartered Engineer and a Fellow of the Institution of Engineering and Technology.

Embedded's Say What? blog offers a venue for individuals to express their opinion on topics of interest to the embedded community. If you'd like to have your say on what keeps the wheels of industry turning, contact Max Maxfield or Stephen Evanczuk.

4 thoughts on “Have you turned on your Parallella yet?”

  1. “Does executing the same code multiple times really enhance reliability? This is assuming the errors come from “glitches” in the CPU processing which (AFAIK) is pretty unlikely. Surely more errors are due to problems with a specific body of code (eg.

  2. “I don't think you're missing anything. In my work I have to deal with the notion of reliability through redundancy and the question always comes up – what's the failure mode that redundancy is attempting to mitigate? Mechanical component, electronic har

  3. “Sorry I was just in flying back from Japan, the issue this relates to is flipped bits, due to Single Event Effects caused by radiation or alpha particles etc. This can flip the state of the register or be a transient and flip it momentarily. The SW techni

  4. “Hi you are correct, the parallella enables you to do some interesting mitigation for SEE effects without as much of a hit on performance but you still need the system level reliability. This has to include all of the analysis to arrive at the correct FIT
