Reliable and power-aware architectures: Microbenchmark generation

Editor's Note: Embedded designers must contend with a host of challenges in creating systems for harsh environments. Harsh environments present unique characteristics not only in terms of temperature extremes but also in areas including availability, security, very limited power budgets, and more. In Rugged Embedded Systems, the authors present a series of papers by experts in each of the areas that can impose unusually demanding requirements. In Chapter 2 of the book, the authors address fundamental concerns in reliability and system resiliency. This series excerpts that chapter in a series of installments, including:
– Reliable and power-aware architectures: Sustaining system resiliency
– Reliable and power-aware architectures: Measuring resiliency
– Reliable and power-aware architectures: Soft-error vulnerabilities
– Reliable and power-aware architectures: Microbenchmark generation (this article)
– Reliable and power-aware architectures: Measurement and modeling


Adapted from Rugged Embedded Systems: Computing in Harsh Environments, by Augusto Vega, Pradip Bose, and Alper Buyuktosunoglu.

CHAPTER 2. Reliable and power-aware architectures: Fundamentals and modeling (Continued)

7 MICROBENCHMARK GENERATION

A systematic set of microbenchmarks is needed to serve as specific stressors that sensitize a targeted processor chip to targeted failure modes under harsh environments. The idea is to run such specialized microbenchmarks and observe the onset of vulnerabilities to robust operation. Microbenchmarks target the processor so that deficiencies can be identified or diagnosed in:

  1. traditional architecture performance (e.g., instructions per cycle, cache misses);

  2. power or energy related metrics;

  3. temperature and lifetime reliability metrics;

  4. resilience under transient errors induced by high-energy particle strikes, voltage noise events, thermal hot spots, etc.

Microbenchmarks can certainly be developed manually with detailed knowledge of the target architecture. From a productivity viewpoint, however, there is clear value in automating their generation through a framework that allows developers to quickly convert the idea of a microbenchmark into a working benchmark that performs the desired action. The following sections take a deep dive into the microbenchmark generation process and the capabilities of an automated framework, and present a simple example to make the concepts clear.
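As a rough illustration of the productivity argument (this is not the framework described in this chapter; the emit_stressor function, the ALU/load knobs, and the output file names below are all hypothetical), a few lines of scripting can emit a whole family of simple C stressor loops with different instruction mixes, something that quickly becomes tedious to write and maintain by hand:

# Hypothetical sketch: emit a family of simple C stressor loops that vary the
# ratio of integer ALU operations to memory loads. Names and code layout are
# illustrative only; they are not taken from the framework discussed here.

def emit_stressor(n_adds: int, n_loads: int, path: str) -> None:
    """Write a C microbenchmark with an endless loop containing a fixed
    mix of integer additions and (volatile) array loads."""
    adds = "\n".join(f"        acc += {k + 1};" for k in range(n_adds))
    loads = "\n".join(f"        acc += buf[(i + {k}) & 1023];" for k in range(n_loads))
    body = f"""#include <stdint.h>

volatile int64_t buf[1024];

int main(void) {{
    int64_t acc = 0;
    for (uint64_t i = 0; ; i++) {{
{adds}
{loads}
    }}
    return (int)acc;  /* never reached */
}}
"""
    with open(path, "w") as f:
        f.write(body)

# Generate several variants with different ALU/load mixes.
for n_adds, n_loads in [(8, 0), (6, 2), (4, 4), (2, 6)]:
    emit_stressor(n_adds, n_loads, f"stressor_{n_adds}add_{n_loads}ld.c")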

7.1 OVERVIEW

In a microbenchmark generation framework, flexibility and generality are the main design constraints, since the range of situations in which microbenchmarks can be useful is vast. We want the user to fully control the code being generated at the assembly level. In addition, we want the user to be able to specify high-level properties (e.g., loads per instruction) or dynamic properties (e.g., instructions-per-cycle ratio) that the microbenchmark should have. Moreover, we want the user to be able to search the design space quickly when looking for a solution that meets the specifications.

In a microbenchmark generation process, microbenchmarks can have properties that are either static or dynamic. Microbenchmarks that fulfill a set of static properties can be generated directly, since static properties do not depend on the environment in which the microbenchmark is deployed. These properties include instruction distribution, code and data footprint, dependency distance, branch patterns, and data access patterns. In contrast, generating microbenchmarks with a given set of dynamic properties is a complex task: dynamic properties are affected both by the static microbenchmark properties and by the architecture on which the microbenchmark is run. Examples of dynamic properties include instructions per cycle, memory hit/miss ratios, power, and temperature.
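To make the distinction concrete, the sketch below (with invented names; not the chapter's framework) separates a specification into static properties, which a generator can satisfy by construction, and dynamic targets, which can only be verified by modeling, simulating, or measuring the resulting code:

# Hypothetical sketch of a microbenchmark specification. Static properties can
# be guaranteed by construction; dynamic targets must be checked against an
# analytical model, a simulator, or measurement on real hardware.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class StaticProperties:
    instruction_mix: dict = field(default_factory=lambda: {
        "int_alu": 0.60, "load": 0.25, "branch": 0.15})
    dependency_distance: int = 4          # register reuse distance, in instructions
    code_footprint_bytes: int = 4096
    data_footprint_bytes: int = 64 * 1024
    branch_pattern: str = "always_taken"

@dataclass
class DynamicTargets:
    ipc: float = 1.5                      # instructions per cycle
    l1d_miss_ratio: float = 0.02
    power_watts: Optional[float] = None   # optional target
    tolerance: float = 0.05               # acceptable relative error

spec_static = StaticProperties()
spec_dynamic = DynamicTargets(ipc=2.0, l1d_miss_ratio=0.10)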

In general, it is hard to guarantee the dynamic properties of a microbenchmark statically. In some situations, with deep knowledge of the underlying architecture and assuming a constrained execution environment, one can ensure them statically. Otherwise, checking whether the dynamic properties are satisfied requires simulation on a simulator or measurement on a real setup. In that scenario, since the user can only control the static properties of a microbenchmark, a search of the design space is needed to find a solution.

Fig. 9 shows a high-level picture of a microbenchmark generation process. In the first step, the user provides a set of properties. In the second step, if the properties are abstract (e.g., integer unit at 70% utilization), they are translated into architectural properties by the property driver; if they are not abstract, they are forwarded directly to the next step, the microbenchmark synthesizer. The synthesizer takes the properties and generates an abstract representation of the microbenchmark with the properties that can be statically defined. Other parameters required to generate the microbenchmark are assigned using the models implemented in the architecture back-end. In this step, the call flow graph and basic blocks are created, and instructions, dependencies, memory patterns, and branch patterns are assigned. The architecture back-end consists of three components: (a) the definition of the instruction set architecture via an opcode syntax table, together with a high-level parametric definition of the processor microarchitecture; (b) an analytical reference model that can calculate (precisely or within specified bounds) the performance and unit-level utilization of a candidate loop microbenchmark; and (c) a (micro)architecture translator responsible for the final (micro)architecture-specific consolidation and integration of the microbenchmark program. Finally, in the fourth step, the property evaluator checks whether the microbenchmark fulfills the required properties. For that purpose, the framework can rely on a simulator, on real execution on a machine, or on the analytical models provided by the architecture back-end. If the microbenchmark fulfills the target properties, the final code is generated (step 6). Otherwise, the property evaluator modifies the input parameters of the code generator, and the process iterates until the search finds the desired solution (step 5).
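The control flow of Fig. 9 boils down to a generate-and-test loop. The runnable toy below is only a sketch of that loop under assumed component names and a made-up linear IPC model; it stands in for the property driver, synthesizer, architecture back-end, and property evaluator described above and is not code from the framework itself:

# Hypothetical, heavily simplified skeleton of the Fig. 9 workflow.

def translate_properties(abstract_props):
    """Property driver (step 2): map an abstract request such as
    {'int_unit_utilization': 0.7} onto architectural knobs that seed the search."""
    util = abstract_props.get("int_unit_utilization", 0.5)
    return {"int_alu_fraction": util, "unroll_factor": 4}

def synthesize(static_params):
    """Synthesizer + architecture back-end (step 3): build the benchmark from
    static parameters. Here the 'benchmark' is just the parameter record."""
    return dict(static_params)

def evaluate(benchmark):
    """Property evaluator (step 4): obtain dynamic properties via an analytical
    model, simulator, or measurement. Here: a toy linear model of IPC."""
    return {"ipc": 0.8 + 1.2 * benchmark["int_alu_fraction"]}

def search(abstract_props, target_ipc, tol=0.05, max_iter=50):
    params = translate_properties(abstract_props)
    for _ in range(max_iter):
        bench = synthesize(params)
        ipc = evaluate(bench)["ipc"]
        if abs(ipc - target_ipc) / target_ipc <= tol:
            return bench                              # step 6: emit final code
        # Step 5: nudge the static knob toward the dynamic target and retry.
        params["int_alu_fraction"] = min(1.0, max(0.0,
            params["int_alu_fraction"] + 0.1 * (target_ipc - ipc)))
    raise RuntimeError("no microbenchmark met the target within the budget")

print(search({"int_unit_utilization": 0.7}, target_ipc=1.8))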


FIG. 9 High-level description of a microbenchmark generation process.

These steps outline the workflow for a general use case; different use cases require different features and steps. Moreover, for cases where the user requires full control of the code being generated, the framework can provide an application programming interface (API) that gives access to all the abstractions and control steps defined in the workflow. Overall, the microbenchmark generation framework provides an iterative generate-and-test methodology that can quickly zoom in on an acceptable microbenchmark meeting the specified requirements.

7.2 EXAMPLE OF A MICROBENCHMARK GENERATION PROCESS

Figs. 10 and 11 show a sample script that generates 20 random microbenchmarks and a sample of the generated code, respectively. The script in Fig. 10 highlights the modularity and flexibility of a microbenchmark generation framework: it uses the basic abstractions provided by the framework to specify the process needed to generate the microbenchmarks. The code wrapper object is in charge of implementing the code abstraction layer. This layer provides a language-independent API for dumping microbenchmarks, as well as an abstraction level for implementing different code layouts. Instances of this class control the declaration of variables and functions, the prologues and epilogues before and after the microbenchmark control flow graph, and how the instructions are generated. In our example, the wrapper obtained in line 9 of Fig. 10, which is passed to the synthesizer in line 15, is in charge of variable and function declaration (lines 1–10), the control flow graph prologue and epilogue, and function finalization. This wrapper generates C code featuring a main function with an endless loop using PowerPC instructions, which is why it is named CInfPpc (Fig. 10, line 9).

The microbenchmark synthesizer is designed in a multipass fashion, meaning that it processes the benchmark representation several times. Each pass, represented by an instance of the pass class, takes the result of the previous pass as its input and creates an intermediate output. In this way, the (intermediate) benchmark is modified by multiple passes until the final pass generates the final code. This can be observed in Fig. 10: in line 15 a synthesizer object is created, and in lines 20, 23, 26, 29, 32, and 35 several passes are added. Finally, in line 39 the benchmark is synthesized, and it is then saved to a file (line 42). Note that this design lets users specify the transformations (passes) as well as the order in which they are performed, which ensures the flexibility of the framework.


FIG. 10 Sample script for random microbenchmark generation.
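Since the script of Fig. 10 is not reproduced in this excerpt, the toy below gives only a rough feel for the structure just described: a wrapper that emits a C main() with an endless loop, and a synthesizer that applies an ordered list of passes. All class names, the pass list, and the generated C are invented for illustration; they are not the framework's actual API or the contents of Fig. 10.

# Hypothetical reconstruction of the wrapper/synthesizer/pass structure.

class CInfiniteLoopWrapper:
    """Plays the role described for the 'CInfPpc' wrapper (minus the
    PowerPC-specific parts): wraps the generated statements in a C main()
    containing an endless loop."""
    def prologue(self):
        return ("#include <stdint.h>\n\n"
                "int main(void) {\n"
                "    int64_t r0 = 0, r1 = 1;\n"
                "    while (1) {\n")
    def epilogue(self):
        return "    }\n    return 0;\n}\n"

class Synthesizer:
    """Runs an ordered list of passes; each pass transforms the intermediate
    representation (here, simply a list of C statements)."""
    def __init__(self, wrapper):
        self.wrapper = wrapper
        self.passes = []
    def add_pass(self, p):
        self.passes.append(p)
    def synthesize(self):
        ir = []
        for p in self.passes:
            ir = p(ir)
        body = "".join(f"        {stmt}\n" for stmt in ir)
        return self.wrapper.prologue() + body + self.wrapper.epilogue()

# Two toy passes: one assigns integer instructions, one adds a dependency chain.
def add_int_ops(ir):
    return ir + [f"r0 += {k};" for k in range(1, 5)]

def add_dependency_chain(ir):
    return ir + ["r1 = r1 * 3 + r0;", "r0 ^= r1;"]

synth = Synthesizer(CInfiniteLoopWrapper())
synth.add_pass(add_int_ops)
synth.add_pass(add_dependency_chain)
with open("microbenchmark_0.c", "w") as f:
    f.write(synth.synthesize())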

The next installment from this chapter discusses methods for power and performance measurement and modeling.

Reprinted with permission from Elsevier/Morgan Kaufmann, Copyright © 2016
