When I was in the sixth grade, I was a member of my school's Safety Patrol. It was my responsibility to ensure that younger children got on and off the school bus safely. “Safeties” wear bright orange sashes and help other kids cross streets adjacent to their bus stops. This is just one measure in a complex web of overlapping steps taken to protect the most vulnerable members of our communities.
As children and adults alike increasingly place their lives in the hands of computer hardware and software, we need to add layers of safety there as well. No software bug or hardware glitch (or combination of the two) can ever be allowed to bring down an aircraft, whether hundreds of passengers are on board or just a pilot. The failure of many other systems must be similarly prevented. But software and hardware do fail, perhaps inevitably. As engineers, we use system partitioning, redundancy, protection mechanisms, and other techniques to contain and work around failures when they do occur.
As software's role in safety-critical systems continues to expand, I expect we'll see a rapid increase in the number of civil lawsuits filed against companies that design and manufacture embedded systems (adding several new levels of meaning to the phrase project post mortem). Indeed, anecdotal evidence suggests that lawsuits of this sort are already on the rise. With most of the action in hush-hush settlements outside the courtroom, though, the media hasn't yet noticed the trend.
One organization that has definitely taken notice of the hazards posed by software in products is Underwriters Laboratories. An independent, not-for-profit product safety certification and ANSI-accredited standards organization, UL initiated a “Standard for Software in Programmable Components” in 1994. The resulting ANSI/UL-1998 standard addresses “the detailed safety-related characteristics of specific software in a product.”
Other organizations with standards in this area include the U.S. Department of Defense, IEEE, FDA, and ARINC. Many of these focus a significant amount of attention on the software development process. But issues of functional safety, or the device's ability to achieve or maintain a safe state in the face of failure, are also often addressed.
In addition to these factors, it may also be beneficial to use an operating system that's been designed with safety-critical systems in mind. Above all else, an RTOS should not compromise the stability of the system. But an operating system can do much more than that, even helping to reduce the risks inherent in your application code. As you'll see in the article “Safety-Critical Operating Systems”, keeping software tasks from overwriting each other's data and stacks is merely the beginning.
Ultimately, the key to designing safety-critical systems is to include multiple layers of protection. The hardware, the operating system, and your application software must each do everything they can to prevent catastrophe, even if the fault itself lies elsewhere.