Static vs. dynamic analysis for secure code development: Part 2

Editor’s Note: In this second article in a series, excerpted from Embedded Systems Security by David and Mike Kleidermacher, the authors evaluate the strengths and weaknesses of dynamic code analysis in the development of secure C or C++ code.

A secure development process should employ dynamic code analysis in addition to static code analysis. A simple example demonstrates the need. The following code will be flagged as an error by a static source code analyzer:

int *getval(void)
{
  return 0;
}
void foo(void)
{
  int *b = getval();
  *b = 0;
}

The pointer b is initialized by the return value from a function call that obviously returns a NULL pointer. Then b, the NULL pointer, is dereferenced. However, the following similar code may not be flagged as an error by a static source code analyzer:

int fd;
int *getval(void)
{
  int *tmp;
  read(fd, &tmp, sizeof(tmp));
  return tmp;
}
void foo(void)
{
  int *b = getval();
  *b = 0;
}

In this example, b is also initialized by the return value from a function call. However, the source code provides no indication of potential return values from the function call. In particular, the return value is read from a file. While the file may well contain an invalid pointer, causing this program to crash, many static analyzers will adopt a conservative approach (to minimize false positives) and will not assume anything specific about the externally read data.

Dynamic analysis uses code instrumentation or a simulation environment to perform checks of the code as it executes. For example, an instrumented program will have a check prior to the dereference of b which validates that b is not NULL. Or a simulator can validate all memory references to check for writes to address 0.
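As a sketch of what such instrumentation conceptually looks like, the compiler can route each pointer dereference through a check. The helper name, message format, and use of exit below are illustrative assumptions, not the actual code any particular compiler emits:

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative stand-in for compiler-inserted NULL checks: every pointer
 * dereference is funneled through this helper before the access occurs. */
static int *checked_deref(int *p, const char *file, int line)
{
    if (p == NULL) {
        fprintf(stderr, "Nil pointer dereference on line %d in file %s\n",
                line, file);
        exit(EXIT_FAILURE);  /* halt at the fault's source */
    }
    return p;
}

int *getval(void)
{
    return 0;  /* returns NULL, as in the first example */
}

void foo(void)
{
    int *b = getval();
    *checked_deref(b, __FILE__, __LINE__) = 0;  /* the check fires here */
}
```

In real tool chains the check is inlined by the compiler and the diagnostic routine lives in the automatically linked support library; the effect is the same.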

Some compilers have dynamic code analysis instrumentation available as a standard option. The development process should require that these checks be enabled at appropriate stages of development, testing, and integration. 

For example, the Green Hills Software compiler has the option -check=memory, which enables the maximum amount of dynamic analysis instrumentation for various forms of memory errors, including NULL pointer dereferences.

The instrumented code performs the check and then calls a diagnostic function, provided by a library that is automatically linked to the program when this option is used. The diagnostic function informs the user that a fault occurred, as well as the type and location of the error within the source code, as follows:

> gcc myfile.c -check=memory
> ./a.out
Nil pointer dereference on line 15 in file myfile.c

This is one example in which the program likely would have crashed even if dynamic analysis were not enabled, helping the developer locate the problem. However, many other kinds of failures are far more insidious, leading to subtle corruptions that may go completely unnoticed or cause a downstream failure that is extremely difficult to trace back to its root cause.

Dynamic analysis detects the fault at its source, turning a thorny bug into a trivial one. Let’s examine a few other examples of dynamic code analysis controls that developers should use during development and testing.

Buffer Overflow
There are many forms of buffer overflow errors, many of which will not be caught by static analysis because the amount of data being written to a buffer is unknown at build time. The following is a simple example:

int an_array[10];
void a_func(int index)
{
  an_array[index] = 0;
}

If the parameter passed to a_func is a value read from a file or message queue by a caller of a_func, most static analyzers will conservatively ignore this array reference. However, if index turns out to be a value greater than nine (or negative), a dynamic analyzer will catch the fault, as shown here:

> gcc myfile.c -check=bounds
> ./a.out
Array index out of bounds on line 50 in file myfile.c
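Conceptually, bounds instrumentation rewrites each array store so that the index is validated against the array’s known extent before the access. The following sketch is illustrative; the helper name and message are assumptions, not any compiler’s generated code:

```c
#include <stdio.h>
#include <stdlib.h>

int an_array[10];

/* Illustrative stand-in for compiler bounds instrumentation: reject any
 * index outside [0, bound) before the store takes place. */
static int checked_index(int index, int bound, int line)
{
    if (index < 0 || index >= bound) {
        fprintf(stderr, "Array index out of bounds on line %d\n", line);
        exit(EXIT_FAILURE);
    }
    return index;
}

void a_func(int index)
{
    /* The compiler knows an_array has 10 elements and inserts the check. */
    an_array[checked_index(index, 10, __LINE__)] = 0;
}
```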

Assignment Bounds
The C and C++ programming languages (especially C) suffer from a lack of the strong, compile-time-enforced type safety that languages such as Ada and C# provide. However, quality coding standards as well as the use of static and dynamic analysis can provide reasonable compensation for these language limitations. Integer overflow is one risk of weak typing, as shown in the following example:

void assign(unsigned int p)
{
  static volatile unsigned short s;
  s = p;
}
void myfunc(void)
{
  assign(65536);
}

This code fragment is perfectly legal ANSI C; the assignment of p to s is defined to truncate p’s value to fit s’s type. In typical implementations, an unsigned short integer occupies 16 bits of storage, allowing values in the range of 0 to 65,535. However, in the example, a value just beyond this range is passed as the parameter p, clearly a programming error.

Yet standard compilers will not emit even a warning on the preceding code sequence. Dynamic analysis can detect assignments of values that are out of range for a type, even if the values are read externally (e.g., from a file). The analyzer build command and output for the preceding example may look as follows:

> gcc myfile.c -check=assignbound
> ./a.out
Assignment out of bounds on line 57 in file myfile.c
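Conceptually, assignment-bounds instrumentation wraps each narrowing assignment with a range check against the destination type’s limits. A minimal illustrative sketch follows; the helper name and diagnostic text are assumptions:

```c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

/* Illustrative stand-in for assignment-bounds instrumentation: report a
 * source value that cannot be represented in the destination type instead
 * of silently truncating, as the language otherwise permits. */
static unsigned short checked_ushort(unsigned int v, int line)
{
    if (v > USHRT_MAX) {  /* typically 65,535 */
        fprintf(stderr, "Assignment out of bounds on line %d\n", line);
        exit(EXIT_FAILURE);
    }
    return (unsigned short)v;
}

void assign(unsigned int p)
{
    static volatile unsigned short s;
    s = checked_ushort(p, __LINE__);  /* assign(65536) would fault here */
}
```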

Missing Case
Most imperative programming languages, such as C, C++, C#, and Java, have a switch/case selection control equivalent. It is perfectly legal in these languages to have a switch statement whose case arms do not cover all possible values of the control expression type. For example:

typedef enum { red, yellow, blue, green } colors;
void carcolors(colors c)
{
  switch (c) {
    case red:
      printf("red\n");
      break;
    case yellow:
      printf("yellow\n");
      break;
    case blue:
      printf("blue\n");
      break;
  }
}

Despite the legality of the preceding code, some compilers and static analyzers will emit a diagnostic complaining of the lack of a case to handle the value green for switch control c. For example, the open source GCC compiler will emit a warning when passed the -Wall option, which enables some checks beyond the language standard, as shown below:

> gcc myfile.c -Wall
myfile.c: In function ‘carcolors’:
myfile.c:64: warning: enumeration value ‘green’ not handled in switch

Some programmers will include a default case arm as a matter of habit to ensure that all possible values of the control variable are handled and avoid such warnings. However, this approach is not always a good idea. A catchall case can lead to unintended consequences in which the default handling is not appropriate for all inputs. 

For this reason, some high assurance coding standards eschew the use of the default case whenever practical and instead promote the use of explicit cases for all expected control values.

In the preceding example, the programmer may know that the cars can be only red, yellow, and blue (no green cars). But what if some day green cars are invented? Will the software be updated to reflect this new reality?

The preceding carcolors function will compile and execute, but the lack of green handling could have unintended consequences. Once again, in such cases dynamic analysis can be used as a code-quality enforcement mechanism. If a switch statement is passed a value for its control variable that matches no existing case, the dynamic analyzer will generate a runtime exception:

> gcc myfile.c -check=switch
> ./a.out
Case/switch index out of bounds on line 7 in file myfile.c
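Conceptually, the instrumentation behaves as if a hidden arm had been added that raises a diagnostic for any unmatched control value, without the programmer having written a catchall default. An illustrative sketch, with assumed names and message text:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { red, yellow, blue, green } colors;

/* Illustrative stand-in for switch instrumentation: the default arm here
 * models the analyzer's hidden runtime check, not a programmer-written
 * catchall with application-level default handling. */
const char *color_name(colors c)
{
    switch (c) {
    case red:    return "red";
    case yellow: return "yellow";
    case blue:   return "blue";
    default:
        /* color_name(green) would land here and halt the program */
        fprintf(stderr, "Case/switch index out of bounds\n");
        abort();
    }
}
```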

Stack Overflow
MISRA C rule 15.2 prohibits recursion to avoid runtime stack overflow, but static detection of cycles in a complicated program’s call graph can be difficult due to indirect function calls. Furthermore, programs devoid of recursion can also suffer from stack overflow simply due to a long function call sequence and/or excessive automatic storage usage.

Detecting stack overflow is critical both for reliability and security of embedded systems. Embedded systems are often memory constrained, requiring system designers to carefully allocate and minimize stack usage for all processes and threads. 

Stack overflows may manifest themselves in subtle corruptions that are difficult to track down during development and testing. Overflow vulnerabilities that go undetected during product development may cause fielded programs to crash. Attackers who become aware of stack overflow vulnerabilities can use them to subvert execution in numerous ways. 

For example, a stack overflow triggered by crafted input to one thread may overwrite the data in a second thread, causing it to crash or execute malware.

Whenever possible, a static analysis tool should be used to check for the largest potential runtime stack memory requirements for a program or for all threads in a multi-threaded program.

Your tool chain provider should include a tool for this purpose. However, because of the aforementioned indirect function call dilemma, maximum potential runtime stack memory requirement cannot always be computed statically.

Virtual memory-capable embedded operating systems could employ guard pages to detect stack overflow at runtime. For developers not using a virtual memory operating system, a second option for dynamic analysis of stack overflow is to instrument the program with overflow checks in the prologue of each function call. 

This feature is available in some compilers but may not be appropriate for multi-threaded applications. Building a program that overflows its stack would generate an appropriate runtime error, halting execution when the stack pointer first exceeds the bounds of the allocated runtime stack:

> gcc myfile.c -check=stack
> ./a.out
Stack overflow

If no documented dynamic stack overflow detection option exists in a tool chain or operating system, a developer should consider the following do-it-yourself method that works reasonably well.

Most operating systems have a hook for executing a developer-defined function call on every system context-switch as well as a means of reading each thread’s stack pointer and the location of the thread’s allocated stack segment. 

The context-switch function can simply compare the stack pointer of the thread about to be executed with the thread’s runtime stack bounds. 
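A sketch of that comparison, assuming a downward-growing stack and a hypothetical thread structure (the field names and the hook registration mechanism are assumptions; the real API varies by operating system):

```c
#include <stdint.h>

/* Hypothetical per-thread bookkeeping an RTOS typically exposes:
 * the allocated stack segment and the stack pointer saved at the
 * thread's last switch-out. */
typedef struct {
    uintptr_t stack_base;  /* lowest address of the allocated stack */
    uintptr_t stack_size;  /* size of the stack segment in bytes */
    uintptr_t saved_sp;    /* stack pointer saved at last context switch */
} thread_t;

/* Called from the context-switch hook for the thread about to run.
 * Returns nonzero if the saved stack pointer has escaped the allocated
 * segment; a real hook would raise an alarm or write an audit record. */
int stack_overflowed(const thread_t *t)
{
    return t->saved_sp < t->stack_base ||
           t->saved_sp > t->stack_base + t->stack_size;
}
```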

On most computers, stacks grow downward to lower addresses, so a comparison that shows a stack pointer below the bottom of its allocated stack segment would generate an alarm, audit record, and so on. Readers should consult operating system documentation for the common context-switch hook feature.

Memory Leaks
One of the major reliability benefits touted by the Java language is its avoidance of programmer-controlled dynamic heap memory allocation by using automatic garbage collection. 

However, many embedded applications use dynamic memory allocation and suffer from vulnerabilities due to improper memory management errors. Many such errors can be prevented via dynamic code analysis.

Memory leaks are one class of memory management error. A memory leak occurs when a function allocates memory but never releases it. If the function is called sporadically, then the loss of memory may be gradual, escaping detection until a system is field deployed.

Furthermore, attackers who are aware of a leaking function can focus their attention on causing the function to be executed, draining the system of memory resources and forcing a system failure.

A search of the memory leak vulnerabilities in the NIST’s National Vulnerability Database uncovers numerous instances in commercial products, including security appliances. For example, CVE-2010-2836 is a recent high-severity security vulnerability identified in the SSL virtual private network (VPN) feature of Cisco’s network appliance operating system called IOS. The vulnerability enables remote attackers to cause a denial of service via memory exhaustion by improperly disconnecting SSL sessions.

Memory leak detection is a form of dynamic analysis that eliminates such leak vulnerabilities. Leak detection works by comparing a program’s pointer references to the memory management library’s outstanding allocations. A program’s pointer references may reside in memory-resident data variables, runtime automatic stack storage, or CPU registers.

The memory leak detector, therefore, is usually offered as a tightly integrated feature of the developer tool chain (compiler, runtime libraries). Memory leaks can occur at any time during a program’s lifetime. The runtime library can perform its memory leak detection algorithm at sensible call points (such as when memory is allocated or released).

In addition, the user can add explicit calls to the memory leak detection algorithm as a sanity check at regular intervals in time or at specific points in the application code. Leak detection can be performed during debugging, during testing, or even in a fielded product.
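A minimal sketch of the allocation-database side of such a detector follows. It only tracks outstanding allocations; a real detector also scans data, stack, and registers for live pointer references before declaring a leak. All names here, including the fixed table size, are illustrative assumptions:

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_ALLOCS 256  /* fixed-size table keeps the sketch simple */

static void *live[MAX_ALLOCS];  /* database of outstanding allocations */
static int nlive;

/* Allocate and record the pointer in the live-allocation database. */
void *tracked_malloc(size_t n)
{
    void *p = malloc(n);
    if (p && nlive < MAX_ALLOCS)
        live[nlive++] = p;
    return p;
}

/* Release memory and remove it from the database. */
void tracked_free(void *p)
{
    for (int i = 0; i < nlive; i++) {
        if (live[i] == p) {
            live[i] = live[--nlive];  /* swap-remove the entry */
            free(p);
            return;
        }
    }
}

/* Analogous in spirit to a library leak-check entry point: report every
 * allocation still outstanding and return how many there are. */
int tracked_findleaks(void)
{
    for (int i = 0; i < nlive; i++)
        fprintf(stderr, "Unreferenced memory adr=%p\n", live[i]);
    return nlive;
}
```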

Ideally, the memory management library is able to record an execution call stack within its allocation database. When the leak detection algorithm identifies a leak, the call stack can be reported to the developer, making it easy to identify the specific allocation that has been left hanging. A static source code analyzer should detect the simple memory leak error shown below:

#include <stdio.h>
#include <stdlib.h>

void leak(void)
{
  char *buf = malloc(100);
  sprintf(buf, "some stuff\n");
  printf(buf);
}
int main()
{
  leak();
  __malloc_findleaks(); // call the leak detector
}

However, as with the other cases discussed here, many forms of leaks are beyond the insight of static analysis and require dynamic leak detection. In the preceding example, the leak function allocates memory pointed to by a local variable and never deallocates the memory.

Upon return from the function, therefore, this memory is leaked. A call to the runtime library’s leak detector will report the leak as shown below:

> gcc myfile.c -check=memory
> ./a.out
Unreferenced memory adr=0x18d40 allocated at 0x103f4 called from 0x1043c then 0x15f18
then 0x10028

When integrated with the software development environment, the leak detection report’s call stack addresses are mapped to actual source code locations, enabling the developer to more easily locate and understand the leak source (Figure 3.4 below).

 

Figure 3.4: Memory leak detection integrated into software development environment.

For the precise name of the leak detection application programming interface (API) and the build-time options used to enable leak detection, readers should consult their tool chain supplier.

Other Dynamic Memory Allocation Errors
With programmatic control of memory allocation and deallocation, there are many more ways for developers to shoot themselves in the foot. The following simple function shows a couple more examples:

void badalloc(void)
{
  char *buf = malloc(100);
  char localbuf[100];
  free(buf);
  free(localbuf);
  free(buf);
}

The first call to free(buf) is fine; it references a valid allocated buffer. However, the second call to free(buf) is invalid, since buf has already been deallocated. 

The call to free(localbuf) is also invalid because localbuf is a local buffer, not allocated using a corresponding dynamic memory allocation call such as malloc or calloc. Similar errors in C++ occur with the operators new and delete. 
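The same allocation-tracking idea can catch these invalid deallocations at runtime: an instrumented free consults a table of live allocations and rejects any pointer not found there, whether it was already freed or never heap-allocated at all. The following sketch uses illustrative names; real instrumented allocators keep richer per-allocation metadata:

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 128

static void *live_ptrs[MAX_LIVE];  /* currently allocated pointers */
static int live_count;

/* Allocate and record the pointer so later frees can be validated. */
void *checked_malloc(size_t n)
{
    void *p = malloc(n);
    if (p && live_count < MAX_LIVE)
        live_ptrs[live_count++] = p;
    return p;
}

/* Returns 0 on a valid free; -1 (with a diagnostic) for a double free
 * or a pointer that was never heap-allocated, such as localbuf above. */
int checked_free(void *p)
{
    for (int i = 0; i < live_count; i++) {
        if (live_ptrs[i] == p) {
            live_ptrs[i] = live_ptrs[--live_count];  /* swap-remove */
            free(p);
            return 0;
        }
    }
    fprintf(stderr, "Attempt to free something not allocated adr=%p\n", p);
    return -1;
}
```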

Once again, static analysis can locate the errors in this example, but dynamic analysis will find other memory allocation errors that static checking cannot. For example, the following change will confuse many static analyzers:

char localbuf[100];
char *b = localbuf;
void badalloc(void)
{
  free(b);
}

Because the variable b is now globally defined, a static source code analyzer may assume less knowledge about what b may point to. Dynamic analysis detects the invalid deallocation during program execution:

> gcc myfile.c -check=memory
> ./a.out
Attempt to free something not allocated adr=0x18484

Figure 3.5: A dynamic analysis error stops the program at the offending line in the debugger, making it easy for the developer to locate and fix common security vulnerabilities.

If dynamic analysis is integrated into the debugger, the preceding failure is even easier for the developer to detect and correct. As shown in Figure 3.5 above, the debugger is automatically halted when the memory deallocation error occurs, pointing the developer to the exact offending line of code.

It goes without saying that software managers should strongly weigh the diagnostic capability of a compiler and tool chain when selecting such an important tool.

Read Part 1

David Kleidermacher , Chief Technology Officer of Green Hills Software, joined the company in 1991 and is responsible for technology strategy, platform planning, and solutions design. He is an authority in systems software and security, including secure operating systems, virtualization technology, and the application of high robustness security engineering principles to solve computing infrastructure problems. Mr. Kleidermacher earned his bachelor of science in computer science from Cornell University.

This article is excerpted from Embedded Systems Security  by David and Mike Kleidermacher, used with permission from Newnes, a division of Elsevier. Copyright 2012. All rights reserved. For more information on this title and other similar books, visit www.newnespress.com.
