Static vs. dynamic analysis for secure code development: Part 1
Editor’s Note: In this article, excerpted from Embedded System Security by David and Mike Kleidermacher, the authors evaluate the strengths and weaknesses of static and dynamic code analysis in the development of secure C or C++ code.
Use of static analysis should be a required part of every security-conscious software organization’s development process. Which static analyzer should an organization use? The best answer to this question is that a development organization should use multiple tools from different vendors.
Empirical use within government software safety and security evaluation teams has demonstrated that a surprising majority of software flaws caught by one static analyzer will not be caught by another tool, and vice versa. Many forms of full-program static analysis are inherently intractable, requiring carefully tuned heuristic algorithms to provide high-quality results.
The best coverage for software flaw detection via static analysis requires multiple tools from multiple vendors to be used in concert. Beyond accuracy, static analyzers also differ widely in execution time.
Static source code analysis basics
Static source code analyzers attempt to find code sequences that, when executed, could result in buffer overflows, resource leaks, or many other security and reliability problems. Source code analyzers are effective at locating a significant class of flaws that are not detected by compilers during standard builds and often go undetected during runtime testing as well.
Most static source code analyzers use the same type of compiler front end that is used to compile code. In fact, ideally, a static source code analyzer should be integrated with the everyday compiler to maximize use and reduce complexity of the tool chain. In addition, integrated checking enables source code parsing to be performed only once instead of twice. The use of a compiler front end is only natural because the analyzer takes advantage of preexisting compiler dataflow algorithms to perform its bug-finding mission.
A typical compiler will issue warnings and errors for some basic potential code problems, such as violations of the language standard or use of implementation-defined constructs. In contrast, a static source code analyzer performs a full program analysis, finding bugs caused by complex interactions between pieces of code that may not even be in the same source file (Figure 3.1 below).
The analyzer determines potential execution paths through code, including paths into and across subroutine calls, and how the values of program objects (such as standalone variables or fields within aggregates) could change across these paths. The objects could reside in memory or in machine registers.
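As an illustration of a path that crosses a subroutine call, consider the sketch below. The names (struct setting, find_setting, setting_value) are hypothetical and not taken from the book. The lookup routine can return NULL on one path, and the caller is only safe because it checks for that value; the comment marks what a flawed caller would look like. In a real project the two routines could live in different source files, which is the situation where a per-file compiler warning is least likely to help.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical configuration table, used only for illustration. */
struct setting {
    const char *name;
    int value;
};

static const struct setting settings[] = {
    { "baud", 9600 },
    { "retries", 3 },
};

/* Returns a pointer to the matching entry, or NULL if no entry matches.
 * The NULL return is the path an analyzer follows into the caller. */
const struct setting *find_setting(const char *name)
{
    assert(name != NULL); /* precondition: callers pass a valid string */
    for (size_t i = 0; i < sizeof settings / sizeof settings[0]; i++) {
        if (strcmp(settings[i].name, name) == 0) {
            return &settings[i];
        }
    }
    return NULL;
}

/* Returns the value of the named setting, or fallback if it is absent. */
int setting_value(const char *name, int fallback)
{
    const struct setting *s = find_setting(name);

    /* A flawed version would simply do "return s->value;" here.
     * That compiles cleanly, but dereferences NULL whenever the
     * lookup fails, which is the kind of cross-call flaw described above. */
    if (s == NULL) {
        return fallback;
    }
    return s->value;
}
```

A call such as setting_value("timeout", 30) takes the NULL path and returns the fallback instead of faulting.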
The analyzer looks for many types of flaws, including bugs in code that would normally compile without error or warning. The following are some of the more common errors that a modern static source code analyzer will detect:
- Potential NULL pointer dereferences
- Access beyond an allocated area, otherwise known as a buffer overflow
- Writes to potentially read-only memory
- Reads of potentially uninitialized objects
- Resource leaks (e.g., memory leaks and file descriptor leaks)
- Use of memory that has already been deallocated
- Out-of-scope memory usage (e.g., returning the address of an automatic variable from a subroutine)
- Failure to set a return value from a subroutine
- Buffer and array underflows
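To make a few of the listed flaw classes concrete, here is a minimal sketch of a routine written to avoid them, with comments marking where a naive version would go wrong. The function name dup_bounded and its parameters are assumptions made for this example only; they do not come from the book.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of at most
 * max_len characters of src, or NULL on failure. */
char *dup_bounded(const char *src, size_t max_len)
{
    /* Potential NULL pointer dereference: a naive version would call
     * strlen(src) without this check. */
    if (src == NULL) {
        return NULL;
    }

    size_t len = strlen(src);
    if (len > max_len) {
        len = max_len;
    }
    assert(len <= max_len); /* invariant after clamping */

    /* Buffer overflow: allocating only len bytes would leave no room
     * for the terminating '\0' written below. */
    char *dst = malloc(len + 1);

    /* Potential NULL pointer dereference: malloc can fail. */
    if (dst == NULL) {
        return NULL;
    }

    memcpy(dst, src, len);
    dst[len] = '\0';

    /* Resource leak: the caller owns dst and must pass it to free().
     * Out-of-scope memory usage: returning the address of a local
     * array instead of heap memory would be flagged as well. */
    return dst;
}
```

A caller would typically write char *p = dup_bounded(name, 31); check p against NULL, use it, and then call free(p).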
The static analyzer also knows how many of the standard runtime library functions behave. For example, the analyzer knows that subroutines such as free should be passed pointers to memory allocated by subroutines such as malloc. The analyzer uses this information to detect errors in code that calls or uses the result of a call to these functions. The analyzer can also be taught about properties of user-defined subroutines.
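The malloc and free pairing rule can be illustrated with a routine that uses either a stack buffer or a heap buffer depending on input size. This is a sketch under assumed names (count_spaces, SMALL_BUF), not code from the book; the scratch copy stands in for processing that needs a private buffer. The comment near the end shows the call an analyzer with library knowledge would be expected to flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SMALL_BUF 16 /* hypothetical threshold for using the stack */

/* Counts the spaces in text after copying it into a scratch buffer.
 * Returns -1 if a heap allocation fails. */
int count_spaces(const char *text)
{
    assert(text != NULL); /* precondition */

    char small[SMALL_BUF];
    size_t len = strlen(text);
    char *buf = small;
    int on_heap = 0;

    if (len + 1 > sizeof small) {
        buf = malloc(len + 1);
        if (buf == NULL) {
            return -1;
        }
        on_heap = 1;
    }
    memcpy(buf, text, len + 1);

    int spaces = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == ' ') {
            spaces++;
        }
    }

    /* Flawed version: calling free(buf) unconditionally. On the short
     * input path buf points at the automatic array "small", which was
     * never returned by malloc. */
    if (on_heap) {
        free(buf);
    }
    return spaces;
}
```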
For example, if a custom memory allocation system is used, the analyzer can be taught to look for misuses of this system. By teaching the analyzer about properties of subroutines, users can reduce the number of false positives. A false positive is a potential flaw identified by the analyzer that could not actually occur during program execution. Of course, one of the major design goals of a static source code analyzer is to minimize the number of false positives so that developers can minimize the time spent reviewing them.
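How an analyzer is taught about a custom allocator depends on the tool, and the text does not name a specific mechanism. As one example of the idea, GCC 11 and later accept a malloc attribute that names the matching deallocator, which its diagnostics and its -fanalyzer option can use to check allocation and release pairs. The sketch below assumes that attribute and guards it with a macro so the code still compiles elsewhere; the pool allocator itself (pool_alloc, pool_free, POOL_ALLOC_FN and the size constants) is hypothetical, and alignment of the returned blocks is ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC 11 or later. Other compilers see an empty macro. */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 11)
#define POOL_ALLOC_FN(dealloc) __attribute__((malloc(dealloc, 1)))
#else
#define POOL_ALLOC_FN(dealloc)
#endif

#define POOL_BLOCKS 4      /* hypothetical pool dimensions */
#define POOL_BLOCK_SIZE 32

/* The deallocator must be declared before the attribute refers to it. */
void pool_free(void *block);

/* Tells the tool that blocks from pool_alloc are released by passing
 * them as argument 1 of pool_free. */
POOL_ALLOC_FN(pool_free) void *pool_alloc(void);

static unsigned char pool_storage[POOL_BLOCKS][POOL_BLOCK_SIZE];
static int pool_in_use[POOL_BLOCKS];

/* Returns a free block, or NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    for (int i = 0; i < POOL_BLOCKS; i++) {
        if (!pool_in_use[i]) {
            pool_in_use[i] = 1;
            return pool_storage[i];
        }
    }
    return NULL;
}

/* Releases a block obtained from pool_alloc. NULL and pointers that
 * do not belong to the pool are ignored. */
void pool_free(void *block)
{
    for (int i = 0; i < POOL_BLOCKS; i++) {
        if (block == (void *)pool_storage[i]) {
            assert(pool_in_use[i]); /* catches a double free in debug builds */
            pool_in_use[i] = 0;
            return;
        }
    }
}
```

With the relationship declared, a tool that honors the attribute has enough information to report a block that is released twice, released with the wrong routine, or never released, rather than treating pool_alloc as an opaque function.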
If an analyzer generates too many false positives, it will become irrelevant because engineers will ignore the output. A modern static source code analyzer is much better at limiting false positives than traditional UNIX analyzers such as lint. However, since a static analyzer is not able to understand complete program semantics, it is not possible to totally eliminate false positives. In some cases, a flaw found by the analyzer may not result in a fatal program fault, but could point to a questionable construct that should be fixed to improve code clarity. A good example of this is a write to a variable that is never subsequently read.
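The write that is never read can be shown in a few lines. Both functions below are hypothetical examples written for this article, and both return the same results; the first contains the questionable construct, and the second is one possible cleanup.

```c
#include <assert.h>

/* Version an analyzer might flag: the initial value stored in result
 * is overwritten on every path before it is read. The program still
 * behaves correctly, but the dead store obscures the logic. */
int clamp_percent_flagged(int value)
{
    int result = value; /* dead store */

    if (value < 0) {
        result = 0;
    } else if (value > 100) {
        result = 100;
    } else {
        result = value;
    }
    return result;
}

/* Cleaned-up version: result is written exactly once. */
int clamp_percent(int value)
{
    int result = (value < 0) ? 0 : (value > 100) ? 100 : value;

    assert(result >= 0 && result <= 100); /* postcondition */
    return result;
}
```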