As the quantity of industrial equipment controlled by electronics grows, so do concerns over the equipment failing and causing personal harm and property damage. Safety functions are built into equipment to prevent functional failure and ensure that if a system does fail, it fails in a nonharmful way. Examples of safety systems in industrial equipment include train brakes, sensors monitoring hazards to air quality or the physical environment, assembly line assistance robots, and distributed control in process automation equipment, just to name a few. These systems often include field programmable gate arrays (FPGAs) that, when supported by safety data packages for calculating failure rates, can play a pivotal role in streamlining safety assessments. When these devices are also flash-based and therefore immune to single event upsets (SEUs), FPGAs enable safety system developers to dramatically simplify their designs.
Standards and Regulations
Concerns over functional safety spawned the growth of functional safety standards and regulations. In 1998, the International Electrotechnical Commission (IEC) published “IEC 61508 Functional Safety of Electrical/Electronic/Programmable Electronic (E/E/PE) Systems,” establishing the global functional safety standard for industrial equipment. While it is impossible to produce electronic equipment that never fails, it is possible to take steps to reduce the risk of a harmful failure. The idea behind the safety standard is to lower the probability of a harmful failure in the system to an acceptable level. To do this, functional safety standards place requirements on the design, the development processes and techniques, and the measures applied to the equipment under control (EUC) and its control system. To be compliant with IEC 61508, functional safety systems must not exceed the failure rates set by the standard for the safety integrity level (SIL) of the system.
While it is possible for a company to proclaim that its product meets IEC 61508 requirements, many customers require certification by a neutral third party such as Germany’s TÜV Rheinland Group, a leading provider of technical services worldwide. Founded in 1872 and headquartered in Cologne, Germany, the group’s mission and guiding principle is to achieve sustained development of safety and quality in order to meet the challenges arising from the interaction between humans, technology and the environment. Today TÜV enjoys the highest reputation in certifying IEC 61508 devices and software design tools. Its certification ensures the highest acceptance of devices by machine builders and plant owners worldwide.
Functional Safety System Life Cycle and V-Model
To meet the requirements in IEC 61508, the safety lifecycle illustrated in Figure 1 must be rigorously followed. The EUC and the EUC control system are typically composed of multiple subsystems with many standard and customized parts. The system, including its safety requirements, is defined during early development, and the requirements are then flowed down to the various subsystems.
Figure 1. Safety Lifecycle (Source: Microsemi)
As the requirements and architecture are decomposed into smaller and smaller units, verification and validation plans are created for use in the corresponding stage in the buildup of the system. This is the V-model of development, and it is required by IEC 61508. Figure 2 shows a generic V-model.
Figure 2. Generic V-model (Source: Microsemi)
One or more of these parts in the functional safety system may be an FPGA. An FPGA that is to be part of an IEC 61508 qualified system is also required to have followed a V-model of development. One example is the Microsemi FPGA V-model, which is qualified by TÜV Rheinland as meeting the development life cycle requirements of IEC 61508.
High Integrity, Self-Test and Fail-Safe States
Software alone cannot provide safety assurance, because its correct execution depends on the system hardware. Similarly, hardware alone cannot satisfy the safety requirements. Functional safety is not achieved simply by a great design; many factors influence the correct operation of a system, including failure rates, usage, production and programming. The functional safety requirements of a system cover all aspects of its components, both software and hardware, which must have high integrity, self-test mechanisms and a fail-safe state. In the end, the designers arrive at a failure probability for the design, which is categorized by its SIL. Most common for industrial designs is SIL3, which specifies one failure in a minimum of 11,000 years. System architects as well as hardware and software engineers must prove the integrity of the system, implement its self-tests and define the fail-safe states to comply with this SIL.
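To make the relationship between a SIL and a failure interval concrete, here is a minimal sketch in Python. It assumes the high-demand/continuous-mode target bands of IEC 61508, expressed as dangerous failures per hour; the function names are illustrative and not taken from any tool. Under these assumptions, the most stringent end of the SIL3 band (10^-8 per hour) works out to roughly one failure in 11,400 years, which appears to be the basis for the 11,000-year figure quoted above.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 h, leap years ignored

# Assumed IEC 61508 high-demand / continuous-mode bands, in dangerous
# failures per hour: (lower bound inclusive, upper bound exclusive).
PFH_BANDS = {
    4: (1e-9, 1e-8),
    3: (1e-8, 1e-7),
    2: (1e-7, 1e-6),
    1: (1e-6, 1e-5),
}


def sil_for_pfh(pfh):
    """Return the SIL whose band contains pfh, or None if it is in no band."""
    for sil, (low, high) in PFH_BANDS.items():
        if low <= pfh < high:
            return sil
    return None


def years_per_failure(pfh):
    """Convert a failure rate per hour into mean years between failures."""
    return 1.0 / (pfh * HOURS_PER_YEAR)


if __name__ == "__main__":
    for rate in (1e-7, 5e-8, 1e-8):
        print(rate, sil_for_pfh(rate), round(years_per_failure(rate)))
```

The conversion is a plain reciprocal, so it treats the failure rate as constant over the lifetime of the equipment.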
To assess the integrity of a system, designers must consider several factors, such as FIT (failures in time) rates, the particular elements whose failure could cause a system failure, and common cause factors such as a power supply failure that could take down an entire system. To assist designers in calculating these rates, FPGA manufacturers offer safety data packages, which often provide “proven-in-use” calculations based on actual operating hours of their FPGAs in the field. In addition, proven-in-use calculations are also provided for several software components.
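As a rough illustration of how FIT rates feed into such an assessment, the sketch below sums component FIT values and converts the total into a failure rate per hour. It relies on the usual definition of 1 FIT as one failure per 10^9 device-hours and assumes a simple series model in which any component failure fails the system. The component names and FIT values are hypothetical placeholders, not figures from any safety data package.

```python
FIT_HOURS = 1e9  # 1 FIT = one failure per 10^9 device-hours


def total_fit(component_fits):
    """Sum component FIT rates (series model: any failure fails the system)."""
    return sum(component_fits.values())


def failures_per_hour(fit):
    """Convert a FIT rate into failures per hour."""
    return fit / FIT_HOURS


def mtbf_hours(fit):
    """Mean time between failures in hours for a constant failure rate."""
    return FIT_HOURS / fit


# Hypothetical example values, for illustration only.
components = {"fpga": 20.0, "power_supply": 50.0, "sensor_interface": 30.0}

if __name__ == "__main__":
    fit = total_fit(components)
    print(fit, failures_per_hour(fit), mtbf_hours(fit))
```

A real assessment would also separate safe from dangerous failures and account for diagnostics and common cause effects, which this sketch deliberately leaves out.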
One area in which flash-based FPGAs have a major advantage over SRAM FPGAs is that their configuration memory is immune to SEUs. Only FPGAs that use flash cells to configure the logic elements (LEs) directly are immune. FPGAs that keep flash on the side and load an SRAM at power-up are still subject to SEU effects. Table 1 compares the failure rates of SRAM-based FPGAs with those of flash-based alternatives, which let system designers reduce failure rates because the configuration SEU rate for these devices is zero.
Table 1. Functional Failure Rates Based on Studies by Microsemi Corp. and Third Parties (Source: Microsemi)
Self-tests must be built into a system to detect systematic or stochastic failures such as a bit flip. Typically, every component is analyzed to see how far the failure of an element influences the component's behavior. These self-tests yield a probability that a failure will be detected. Memories are tested for potential failure points such as stuck memory cells (static failures) and bit flips (stochastic failures). The safety concept must define the correct test strategy as well as the respective reactive measures.
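The idea of testing memory for stuck cells can be sketched with a simple pattern test. The Python below simulates a byte-wide memory with optional stuck-at faults and writes two complementary patterns (0x55 and 0xAA) so that every bit is exercised as both 0 and 1. The `FaultyMemory` class and `pattern_test` function are hypothetical names used only for this illustration; this is not a certified test algorithm, and it overwrites the memory contents.

```python
class FaultyMemory:
    """Simulated byte-wide memory with optional stuck-at-1 / stuck-at-0 bits."""

    def __init__(self, size, stuck_high=None, stuck_low=None):
        self.cells = [0] * size
        self.stuck_high = stuck_high or {}  # address -> mask of bits forced to 1
        self.stuck_low = stuck_low or {}    # address -> mask of bits forced to 0

    def write(self, addr, value):
        value |= self.stuck_high.get(addr, 0)
        value &= ~self.stuck_low.get(addr, 0)
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        return self.cells[addr]


def pattern_test(mem, size):
    """Return the sorted addresses where a written pattern did not read back."""
    failed = set()
    for pattern in (0x55, 0xAA):  # complementary patterns cover 0 and 1 per bit
        for addr in range(size):
            mem.write(addr, pattern)
        for addr in range(size):
            if mem.read(addr) != pattern:
                failed.add(addr)
    return sorted(failed)


if __name__ == "__main__":
    mem = FaultyMemory(16, stuck_high={3: 0x01}, stuck_low={5: 0x80})
    print(pattern_test(mem, 16))
```

A test like this only targets static failures. Stochastic bit flips that occur between test runs are normally caught by other measures such as parity, error-correcting codes or checksums over the stored data.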
All FPGAs can experience register flips, but additional effort is required for SRAM-based devices. Their SEU-prone fabric can suffer routing or logic failures, which requires designers to implement triple modular redundancy (TMR)-based designs. Safety-sensitive logic must be triplicated, with majority-voter logic at the outputs. When flash-based FPGAs are used, this TMR logic is often not required because the fabric is immune to SEUs. Implementing TMR in an SRAM FPGA increases both the logic used by a design and its power consumption, because the triplicated logic and the voters must all run in the design.
Example of a TMR Implementation
Figure 3. Example of a TMR circuit (Source: Microsemi)
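The majority-voter logic used in TMR can be modeled in a few lines of Python. This is a behavioral sketch only, not the circuit from Figure 3: three copies of the same logic produce a result, and a bitwise two-out-of-three vote masks an upset in any single copy. The function names and the example values are illustrative assumptions.

```python
def majority_vote(a, b, c):
    """Bitwise two-out-of-three vote: each output bit follows the majority."""
    return (a & b) | (a & c) | (b & c)


def replicas_disagree(a, b, c):
    """True if any replica differs, so the upset can be reported or scrubbed."""
    return not (a == b == c)


if __name__ == "__main__":
    correct = 0b10110010
    upset = correct ^ 0b00001000  # one replica suffers a single bit flip
    voted = majority_vote(correct, upset, correct)
    print(bin(voted), replicas_disagree(correct, upset, correct))
```

The vote only masks a fault confined to one replica. If the same bit is corrupted in two replicas, the voter outputs the wrong value, which is one reason TMR designs usually report disagreements rather than silently masking them.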
In case of failure, or if safety-related data is corrupted, the system must change to a safe state. It is simplest to define the safe state as power off, but for many systems downtime is not tolerated. If a power supply has a brownout condition, an SRAM FPGA is likely to be affected, and this often requires a restart of the entire system. Here again a flash-based FPGA benefits from its non-volatile configuration, which makes it instant-on. This capability can enable a system to respond more quickly to a failure and could potentially allow the system to support a slower safe state instead of a full power off. The instant-on behavior of the flash-based FPGA may also enable lower power modes to be supported. A system could run in a lower power idle mode, and when needed the FPGA could quickly power up and initialize the system for full operation.
Designing functional safety systems demands particular attention to detail and step-by-step design flows. With flash-based FPGAs now certified to IEC 61508, engineers can rely on this technology and its supporting software and safety data packages to streamline safety assessments. By delivering a unique combination of SEU immunity and instant-on capability while eliminating the need for TMR logic, flash-based FPGAs are a compelling choice for developing greatly simplified functional safety designs.
Ted Marena is the director of FPGA/SOC marketing at Microsemi. He has over 20 years’ experience in FPGAs. Previously, Marena held roles in business development and in product and strategic marketing. He was awarded Innovator of the Year in February 2014 when he worked for Lattice Semiconductor. Prior to joining Lattice Semiconductor, Marena spent nearly four years as a hardware design engineer at Wang Computers, designing boards for Ethernet, ISDN, RS232 and T1/E1, where he honed the skills necessary to effectively drive SOCs and FPGAs. Marena holds a Bachelor of Science in electrical engineering, magna cum laude, from the University of Connecticut and an MBA from Bentley College’s Elkin B. McCallum Graduate School of Business.