Software Standards Compliance 101: Implementing a programming standard
In the early 1990s, a formal investigation was conducted into a series of fatal accidents involving the Therac-25 radiotherapy machine. Led by Nancy Leveson, then of the University of Washington, the investigation produced a set of recommendations for building safety-critical software in an objective, repeatable manner. Since then, industries as disparate as aerospace, automotive and industrial control have codified their practices and processes for creating safety- and/or security-critical systems into industry standards.
Although subtly different in wording and emphasis, the standards across industries follow a similar approach to ensuring the development of safe and/or secure systems. This common approach includes ten phases:
- Perform a system safety or security assessment
- Determine a target system failure rate
- Use the system target failure rate to determine the appropriate level of development rigor
- Use a formal requirements capture process
- Create software that adheres to an appropriate coding standard
- Trace all code back to its source requirements
- Develop all software and system test cases based on requirements
- Trace test cases to requirements
- Use coverage analysis to assess test completeness against both requirements and code
- For certification, collect and collate the process artifacts required to demonstrate that an appropriate level of rigor has been maintained
Phase 5 is the focus of this article. A key element of creating safe and secure systems is building quality and security into the software as it is written. This is done by adopting an enforceable coding standard, which has the additional benefit of reducing development cost, especially when it comes to meeting corporate and/or industry requirements for software quality, safety and security.
Incorporating appropriate coding standards into software development reduces defects by building quality into the software as it is being written. The challenge in identifying a solution hinges on the fact that quality means different things in different applications; it is the consequences of software failure that define the level of quality needed. For example, Toyota recently paid a huge settlement over the "unintended acceleration" lawsuit decided against it because its software failure resulted in multiple deaths. In contrast, a software security failure typically results only in lost money: Cloudflare recently announced that remediating the OpenSSL "Heartbleed" vulnerability cost it many millions of dollars.
Companies adopt and apply coding standards to mitigate these types of software defects as the software is being written. Coding standards encapsulate a governing organization's accumulated expertise in programming a given language for a given domain, and provide guidance for the creation of good code.
At their most basic, programming standards help to ensure the consistency of new code, and that the code output from one developer can be read by any other developer in the organization, facilitating code reviews and downstream maintenance. This is especially beneficial during the defect identification and isolation process, assisting in the reduction of latent defects and therefore lowering overall software costs. Programming standards can also help organizations avoid areas of significant programming risk by improving the overall quality of the code and avoiding the “undefined behaviors” associated with certain language semantics.
How Do the Errors Get Started?
One of the most revered and widely used programming languages today, C was originally designed as a lightweight language with a small footprint. However, C's flexibility carries inherent risks. Multiple versions of the international C language standard have been ratified, the latest in 2011 (C11). All versions identify specific language semantics referred to as "undefined behavior," described in C11 as:
Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
Not only does the list of undefined behavior semantics change from version to version, the implementation also changes from compiler to compiler. Many programmers use these undefined behaviors to optimize their application for a specific platform where the code is compiled with a particular version of the compiler. However, if you compile the code for another platform or using a different compiler, these optimizations may result in defects.
Now add C++, Ada, Java, and the multitude of other languages used in projects today, and the areas of risk multiply. Not all of these languages exhibit the same kinds of undefined behavior that C does, but all carry risks of their own. A well-chosen coding standard encapsulates the knowledge needed to manage those risks for a given language and domain.
Given that most safety-critical devices are designed to be fail-safe (i.e., when failures do occur, they are detected and the system responds in a manner that ensures the safety of its users), there is no room for these kinds of "undefined behaviors." Security-critical applications have no room for them either, since such failures are often the very means by which system vulnerabilities are exploited. While the risks of building a safety-critical device are not identical to those of building a secure one, the lines are blurring as more and more devices become network-enabled. Bottom line: safe software cannot coexist with undefined behavior.