Ensuring software timing behavior in critical multicore-based embedded systems - Embedded.com

Getting somewhere safely depends on more than just good brakes, working taillights, and someone with excellent reflexes behind the wheel. Increasingly, the components that keep your car on the road and your plane in the air are not only human, or even just mechanical. They are sophisticated pieces of embedded software running on complex heterogeneous multicore processors, controlling everything from flight management systems to power steering, and executing to strict timing deadlines measured in microseconds.

Herein lies the challenge. The timing behavior of software in a multicore system is affected not only by the software running on it and its inputs, but also by the other software running on other cores.

Critical embedded systems require an immense effort and investment (millions of euros/dollars and years of engineering effort) to be developed. Safety has to be at the heart of the architecture and design, right from the earliest stages of the software development process. In particular, systems designers must understand the timing behavior of their software, to ensure it can execute within safe timeframes.

Solving the multicore timing analysis (MTA) puzzle

Although the awesome computing capacity of a multicore processor should (in theory) make embedded systems more powerful and efficient, software executing on one core can slow down software running on the other cores. The slowdown comes from interference: tasks running on different cores contend for shared resources such as buses, memory, caches, devices, FPGAs and GPUs.

How do you quantify the effects of this interference? How do you analyze, test, and provide concrete evidence that your safety-critical software, when running on a multicore platform, can always execute within its timing deadlines?

Experts at the Barcelona Supercomputing Center (BSC), Rapita Systems Ltd (RPT), Raytheon Technologies (RTRC), and Marelli Europe (MAR) have been investigating answers to these questions for many years. BSC and Rapita have been developing a solution that will soon be rolled out across the aerospace and automotive industries. Specialized tooling and automation, combined with a requirements-based, safety-focused methodology were the keys to solving the puzzle.

This work has formed the basis of the MASTECS project, a multi-disciplinary research and development project funded by the European Commission and launched in December 2019. The MASTECS project will mature the technologies and support their use for certification of avionics and automotive systems. A key part of the MASTECS project is to provide a demonstration of the approach in two industries through case studies deployed by RTRC and MAR.

State of the art tools

Commercially available tools to support timing analysis are effective for simple (single-core) electronics, but do not scale to meet novel multicore-specific certification requirements and recommendations.

  • Static timing analysis solutions [1] face a complexity wall and can neither effectively model the increasingly complex hardware nor efficiently deal with the structural and syntactical characteristics of exceptionally complex software functionalities.
  • Measurement-based solutions have reached a good level of penetration in the single-core analysis market (Rapita Systems’ RVS toolset being amongst the most successful ones). However, such tools cannot yet fully meet the challenges introduced by multicores. They typically focus on measurement scenarios determined by consolidated functional testing strategies, but lack a methodology grounded in hardware expertise that helps derive trustworthy timing bounds for tasks running on multicores, with the necessary supporting evidence and an adequate level of traceability.

To our knowledge, no commercial tool is available in the market, other than the one being matured in MASTECS, that is capable of analyzing the timing of software on multicore platforms, with strong focus on applicable safety standards and emerging certification requirements.

Interference analysis and control in action

The key to understanding interference is a structured test methodology, using hardware and software experts to produce evidence about multicore timing behavior. A specialized technology from BSC (known as multicore micro-benchmark technology or MμBT, commercialized by Rapita as RapiDaemons) lets system designers analyze and quantify the effects of interference in a multicore-based application by creating additional interference scenarios to stress-test different parts of the multicore processor.

Micro-benchmarks, at the heart of MμBT, are carefully crafted pieces of code that operate at the lowest interface between hardware and software to stress a specific shared resource. They expose the impact of interference channels on software timing: deployed alongside an application, they put configurable and quantifiable pressure on it. Each micro-benchmark is designed to exhibit a single, clearly defined behavior with an anticipated effect on a specific hardware resource, while generating as little contention as possible on other interference channels. Micro-benchmark key features include the following:

  1. They put quantifiable pressure on a specific shared resource.
  2. Their behavior can be verified via event monitors.
  3. They capture specific timing-related requirements, e.g., whether the mitigation actions you put in place to master contention are effective.
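To make the idea of quantifiable pressure on one resource concrete, here is a minimal sketch in Python. It is a model only: real micro-benchmarks are low-level code tuned to a specific processor, and the cache geometry, the function names and the direct-mapped cache below are all assumptions chosen for illustration. The model shows how a strided read pattern over a buffer larger than a cache turns every access into a miss, so the number of requests sent to the next level of the memory hierarchy is known in advance.

```python
def strided_addresses(buffer_bytes, stride_bytes, passes):
    """Yield the byte addresses a strided-read micro-benchmark would touch."""
    for _ in range(passes):
        for addr in range(0, buffer_bytes, stride_bytes):
            yield addr


def count_misses(addresses, cache_bytes, line_bytes):
    """Count accesses and misses in a toy direct-mapped cache model."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines            # one stored tag per cache line
    accesses = misses = 0
    for addr in addresses:
        block = addr // line_bytes
        index, tag = block % num_lines, block // num_lines
        accesses += 1
        if tags[index] != tag:           # cold miss or conflict miss
            misses += 1
            tags[index] = tag
    return accesses, misses


# Hypothetical geometry: 32 KiB cache with 64-byte lines.
CACHE, LINE = 32 * 1024, 64

# Buffer twice the cache size: every access misses and reaches the next level.
stress = count_misses(strided_addresses(2 * CACHE, LINE, passes=4), CACHE, LINE)
# Buffer equal to the cache size: only the first pass misses.
quiet = count_misses(strided_addresses(CACHE, LINE, passes=4), CACHE, LINE)

print(stress)  # (4096, 4096)
print(quiet)   # (2048, 512)
```

Changing the buffer size or the number of passes changes the pressure in a predictable way, which is the property that lets a test state in advance how much load a micro-benchmark places on a resource.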

Figure 1: Use of micro-benchmarks in interference analysis. (Source: Authors)

A wide range of micro-benchmarks have been developed to have specific roles, including matching a desired level of interference, maximizing interference on the resource, or simply being very sensitive to contention (‘victims’).

In analyzing the effects of interference, the use of MμBT is supported by a task contention model (TCM) that provides early estimates of the contention delay tasks can suffer. The software automation and testing tools RapiTest and RapiTime, developed by Rapita, are used to write tests and run them on the embedded target.
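As an illustration of the kind of early estimate a contention model can give, the sketch below computes a simple additive bound from access counts. It is not the TCM used in the project, whose details are not described here. It assumes a single contender core and an arbiter under which each access of the task under analysis is delayed by at most one contender access; the resource names, access counts and per-access delays are invented for the example.

```python
def contention_bound(task_accesses, contender_accesses, max_delay_cycles):
    """Upper-bound the contention delay (in cycles) a task can suffer.

    Assumption: on each shared resource, every access of the task is delayed
    by at most one contender access, and each delay costs at most
    max_delay_cycles[resource].
    """
    bound = 0
    for resource, n_task in task_accesses.items():
        n_conflicts = min(n_task, contender_accesses.get(resource, 0))
        bound += n_conflicts * max_delay_cycles[resource]
    return bound


# Hypothetical access counts, e.g. read from hardware event monitors.
task = {"bus": 1000, "dram": 200}
contender = {"bus": 400, "dram": 5000}
delay = {"bus": 10, "dram": 50}          # worst-case cycles lost per conflict

bound = contention_bound(task, contender, delay)
print(bound)  # 14000
```

The bound is pessimistic by construction, which suits an early estimate. Measurements against micro-benchmarks can then show how close real contention comes to it.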

Design methodology

By following a seven-step test design process along the standard software ‘V’ development process (Figure 2), engineers can more fully understand the impact of interference.

  1. Multicore processor critical configuration setting, interference channel and event monitor analysis. Hardware experts help identify critical configuration settings to set the framework in which interference channels are also identified along with mitigation measures. The identification of hardware event monitors is also instrumental to provide a means of verification for all following steps.
  2. Identify timing requirements. Help the end user to identify their specific needs, timing requirements, risks and safety issues for the system. For example, verify the performance of any hardware isolation approach to minimize interference.
  3. Test case design. Develop specific test cases (description of a test) to verify the set of hypotheses supporting the user requirements, including defining the MμBT items that will be required to provide evidence in the interference channel analysis. This involves execution in isolation (no interference) and execution against micro-benchmarks to assess the application’s execution time and hardware sensitivity to interference under different quantifiable stress scenarios.
  4. Implementation of test procedures. This step builds the test procedures, which consist of a test framework, micro-benchmarks and measurement probes to record/trace the results. It is currently a manual process and is to be automated in MASTECS.
  5. Evidence gathering (testing). The test procedures are executed on the platform to gather test evidence. Currently involving some manual work, this will be automated in MASTECS using the RapiTest automation framework to execute those tests and link them back to verification requirements.
  6. Results analysis. Technical experts review the test results to check whether they satisfy the verification requirements. For example, Figure 3 shows a RapiTime screenshot of the execution times reported for different functions of a program.
  7. Validate results and generate documentation. Final review of requirements, generation of documentation and qualification results to support the safety argument of the system. The customer can use the full set of reports and analysis artefacts directly for the certification of software running on multicore.
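The comparison behind steps 3 to 6, execution in isolation versus execution against micro-benchmarks, can be sketched as follows. The function, the scenario names, the timing samples and the deadline are all hypothetical, and an observed maximum is test evidence rather than a proven worst-case bound.

```python
def analyze(isolation_us, scenarios_us, deadline_us):
    """Summarize measured execution times for each interference scenario."""
    baseline = max(isolation_us)         # worst time observed with no interference
    report = {}
    for name, samples in scenarios_us.items():
        worst = max(samples)
        report[name] = {
            "worst_us": worst,
            "slowdown": worst / baseline,
            "meets_deadline": worst <= deadline_us,
        }
    return report


# Hypothetical measurements in microseconds.
isolation = [100, 102, 101]
scenarios = {
    "l2_stress": [120, 153, 140],
    "dram_stress": [180, 204, 190],
}

report = analyze(isolation, scenarios, deadline_us=200)
for name, row in report.items():
    print(name, row["worst_us"], row["slowdown"], row["meets_deadline"])
# l2_stress 153 1.5 True
# dram_stress 204 2.0 False
```

A scenario that misses the deadline, like the second one here, is the kind of result that sends the analysis back to the hardware experts to revisit configuration settings or mitigation measures.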

Figure 2: MTA steps in the V-model software development process. (Source: Authors)

Hardware expertise and the timing analysis process

Injecting hardware (multicore) expertise is a key trait of the proposed MTA approach and essential to its success on modern, complex multicores. During early software development stages:

  1. Hardware experts identify multicore configurations (critical configuration settings in avionics jargon), which play a key role in determining the software’s functional and timing behavior and largely determine how much contention tasks generate for one another. As an illustrative example, current processors implement isolation and segregation mechanisms that, if properly deployed, can heavily reduce contention.
  2. Multicore experts play a key role in identifying the resources in which task contention can arise (referred to as interference channels in avionics). Driving an appropriate MTA process depends on the ability of hardware experts to navigate multi-thousand-page processor technical reference manuals and to put the right questions to chip vendors about information missing from those manuals.
  3. Once interference channels are identified, hardware experts identify the event monitors that can track the activity tasks generate on those interference channels, as a proxy metric to bound the contention that tasks can suffer. The correctness of those event monitors must also be verified [2], for which a specialized set of micro-benchmarks has been designed.
  4. Finally, hardware experts work hand in hand with timing analysis experts to derive, from user requirements, high-level and low-level requirements and specific tests to validate the hypotheses supporting the user requirements. Each test instantiates one or several micro-benchmark programs designed by hardware experts to put the desired level of load on the target (set of) interference channel(s).
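The event monitor verification mentioned in step 3 can be pictured as a comparison between a count known by construction and the value a monitor reports. The sketch below assumes a micro-benchmark that performs an exactly known number of accesses; the event names, the readings and the 1% tolerance are hypothetical and not taken from any specific processor.

```python
def verify_monitor(expected_events, measured_events, tolerance=0.01):
    """Return True if a monitor reading is within a relative tolerance of the known count."""
    if expected_events <= 0:
        raise ValueError("expected_events must be positive")
    error = abs(measured_events - expected_events) / expected_events
    return error <= tolerance


# A hypothetical micro-benchmark performs exactly 1,000,000 loads that miss in L1.
EXPECTED = 1_000_000
readings = {"L1D_MISS": 1_000_450, "BUS_READ": 1_250_000}   # hypothetical monitors

verdicts = {name: verify_monitor(EXPECTED, value) for name, value in readings.items()}
print(verdicts)  # {'L1D_MISS': True, 'BUS_READ': False}
```

A monitor that fails the check is not necessarily faulty. It may count something other than what its documentation suggests, and clarifying that is a question for the manual or the chip vendor.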

During late design stages:

  1. Hardware experts contribute to the analysis of test results to assess whether they confirm or reject hypotheses.
  2. Hardware experts also contribute to establishing new hypotheses and the corresponding tests in case they are needed based on the results obtained in the previous step.

Figure 3: Analyzing results (RapiTime). (Source: Authors)

The bigger picture

The seven-step test design process is only one part of a wider multicore verification methodology shown earlier in Figure 2. This methodology, which will continue to be matured as part of the MASTECS project, is designed to achieve full traceability, from comprehensive evidence and results back to the corresponding requirements and designs. The methodology is designed to meet the objectives defined in CAST-32A, the key guidance document issued by aerospace certification authorities. It is also specifically aligned with ISO 26262, the safety standard for the automotive sector, which advocates freedom from interference.
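Traceability of this kind can be pictured as links from each piece of evidence back to a requirement. The sketch below is a minimal illustration, not the format used by any tool named in this article; the record type, the function and the identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """One test result linked to the requirement it verifies."""
    test_id: str
    requirement_id: str
    passed: bool


def untraced_requirements(requirement_ids, evidence):
    """Return requirements with no passing evidence, in their original order."""
    covered = {item.requirement_id for item in evidence if item.passed}
    return [req for req in requirement_ids if req not in covered]


requirements = ["REQ-1", "REQ-2", "REQ-3"]
evidence = [
    Evidence("T1", "REQ-1", True),
    Evidence("T2", "REQ-2", False),      # test ran but failed
    Evidence("T3", "REQ-1", True),
]

missing = untraced_requirements(requirements, evidence)
print(missing)  # ['REQ-2', 'REQ-3']
```

Both a failed test and a requirement with no test at all appear as gaps, which is the information a reviewer needs before the evidence can support a safety argument.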

CAST-32A was published by the Certification Authorities Software Team (CAST) in 2016, and identifies factors that impact the safety, performance and integrity of airborne software systems executing on multicore processors. If you want to use multicore hardware in an avionics system, this is the go-to document. It provides objectives intended to guide the production of safe multicore avionics systems, including objectives related to identifying and bounding the impact of interference channels. EASA and FAA are working on an adaptation of the multicore generic CRI into a common AMC/AC material (AMC 20-193). It is expected to be published “later this year” [3].

Expertise cannot be automated

Interference effects are complex. To unravel their mysteries, you need experts who understand both the components of the multicore architecture, and the scheduling and resource allocation systems in the software. Collaboration between hardware and software experts will be a central feature of the MASTECS project as it continues into the future. But while collaboration leads to great strides in software tooling and automation, it’s important to remember that you can’t automate every step of a validation process – especially not when multicore timing analysis is involved.

You need experienced engineers who know the systems in detail. For example, during the early stages, multicore experts can identify the processor configurations (also known as hardware critical configuration settings) that determine the software’s functional and timing behavior, as well as the potential interference channels. When it comes to analyzing test results, nothing beats the input of an experienced human expert to revisit and evaluate the original assumptions made about the platform, and use their knowledge to feed into a new testing cycle.

References

[1] Reinhard Wilhelm. Mixed Feelings about Mixed Criticality. Workshop on Worst-Case Execution Time Analysis, 2018.

[2] Enrico Mezzetti, Leonidas Kosmidis, Jaume Abella, Francisco J. Cazorla. High-Integrity Performance Monitoring Units in Automotive Chips for Reliable Timing V&V. IEEE Micro 38(1): 56-65 (2018).

[3] https://www.aviationtoday.com/2020/02/28/easa-and-faa-to-issue-further-guidance-on-multicore-certification-this-year/


Dr. Francisco J. Cazorla (BSC) is the leader of the Computer Architecture / CAOS group at BSC and the technical coordinator of MASTECS. His main area of research is multicore processors for embedded critical systems, where he has led several EU-funded and privately funded research projects and has a long track record of research publications.
Dr. Enrico Mezzetti (BSC) is a senior researcher in the Computer Architecture / CAOS group. His research is mainly focused on industrial-quality techniques for the timing verification of embedded real-time systems. On the same topic, he has been contributing to several EU and collaborative industrial projects.
Dr. Ian Broster (RPT) is a founder and the General Manager of Rapita Systems Ltd. He earned his PhD in 2003 at the Real-time Systems group of the University of York for work on the timing analysis of real-time communication and has been involved in significant real-time research projects involving scheduling analysis, predictable multi-core, real-time communication, fault tolerance, testing and verification. His focus today is the transfer of research technologies to practical industrial uses in the domain of reliable embedded systems.
Dr. Juan Valverde (RTRC) is a Staff Research Scientist at United Technologies Research Centre Ireland Ltd, a position he has held since 2015. At UTRC Ireland, Valverde works as part of the Networks and Embedded Systems group focused on Heterogeneous Computing, Real-Time Systems, IoT, and Hardware Verification activities. Prior to UTC, Valverde was a Staff Researcher at the Centre of Industrial Electronics in Madrid where he collaborated in several research projects (EU and Spain national research programs and company funded programs). Valverde has published 17 peer-reviewed technical articles and 1 book primarily in the fields of Reconfigurable Computing and Edge Computing for IoT. Valverde holds a Ph.D. in Microelectronics from the Technical University of Madrid (Spain).
Stefania Botta (MAR) has a degree in Computer Science. At Marelli’s Powertrain business unit, she is part of the “Software Tools and Methodologies” team. Her main activity is System and Software Architecture Design and Model Based Design definition; she also holds the role of Process Quality System Representative of Product and Process Development.
Dr. Jaume Abella (BSC) is a senior researcher at BSC, leading activities on hardware design, hardware and software testing, and statistical analysis for safety-relevant multicore systems. Jaume’s track record includes transferring technology to several commercial products, serving as BSC’s PI in several EU-funded projects, coadvising several PhD and master students, and publishing 100+ research articles.
Dr. Christos Evripidou (RPT) is the Technical Lead of Rapita Systems’ UK Multicore Timing Analysis team. He earned his EngD (Doctor of Engineering in Large-Scale Complex IT Systems) in 2018 at the University of York for his work on scheduling for mixed-criticality hypervisor systems in the automotive domain. Christos is actively working on the refinement of tooling and processes for performing timing analysis, satisfying DO-178C and CAST-32A objectives.
Dr. Javier Mora de Sambricio (RTRC) is a Senior Research Scientist at United Technologies Research Centre Ireland Ltd, a position he has held since 2018. At UTRC Ireland, Mora works as part of the Networks and Embedded Systems group focused on Heterogeneous Computing, Real-Time Systems, and Digital Hardware Design activities. Prior to UTC, Mora was a Ph.D. student and FPI grant holder at the Centre of Industrial Electronics in Madrid where he collaborated in several research projects. Mora has published 3 journal articles and multiple conference papers in the fields of parallel processing architectures and reconfigurable hardware. Mora holds a Ph.D. in Industrial Electronics from the Technical University of Madrid (Spain).

About MASTECS project partners 

Barcelona Supercomputing Center (BSC) is the National Supercomputing Facility in Spain. The mission of the BSC is to research, develop and manage information technologies in order to facilitate scientific progress. The BSC not only strives to become a first-class research center in supercomputing, but also in scientific fields that demand high performance computing. BSC brings to MASTECS more than 15 years of multicore hardware expertise in critical embedded domains, along with its Multicore micro(μ)Benchmark Technology (MμBT), developed over more than 10 years. MμBT exploits a set of engineered benchmarks for stress-testing applications on multicores. MμBT will be enhanced during MASTECS by adding formal documentation, enhancing the library to support more interference channels and processors, and building formal test cases for certification. BSC has also created a spin-off company, Maspatechnologies, that enables unrestricted commercial exploitation of this technology, providing a solid exploitation path for MμBT and therefore for the MTA technology.

Rapita Systems Ltd (RPT) is an industrial leader in aerospace and automotive software verification. Renowned for its customer service and innovative high-quality tools for software testing, Rapita has a strong history of bringing new high-tech research to industrial and commercial success. Founded in 2004 as a spin-off from an E.C. FP5 project, Rapita has grown to around 45 staff in York, UK. Acquired by Danlaw Inc in 2016, Rapita continues in its mission to serve the aero/auto safety-critical markets with the latest technology. Rapita’s flagship product, RVS (a set of tools), is now widely used across the aerospace market.

Marelli Europe (MAR): Founded in the 1900s, Magneti Marelli became known as a pioneer within the motor industry for its contribution towards smart and sustainable mobility. During its 100-year history, it served customers from its base in Italy, growing operations across Europe, North and South America, India and China to become a leading player in the field of Lighting, Electronics, Powertrain and Motorsport. In 2019, MARELLI was officially formed. The union of two industrial giants, Magneti Marelli and Calsonic Kansei, was recognized as bringing together outstanding industrial expertise and unique heritage. The new Marelli is a key supplier of advanced automotive components. Its focus in MASTECS will be on a Powertrain Electronic Control Unit based on the AURIX™ TriCore™ TC397 multicore microcontroller, managed by real-time embedded software developed in compliance with the AUTOSAR™ architecture and ISO 26262 safety requirements.

Raytheon Technologies Research Centre (RTRC) is the central research organization of Raytheon Technologies (RTX), a $70B multi-national corporation with multiple operating units focused on serving the aerospace and defence industry. The main business units of Raytheon Technologies are Collins Aerospace, Pratt and Whitney, Intelligence and Space, and Missiles and Defence. RTRC’s researchers focus on the big picture, investigating the “what ifs” of power distribution, advanced propulsion, energy storage, engine efficiency, autonomy and human-machine interaction in innovative aircraft systems.

 
