Using static analysis to detect coding errors in open source security-critical server applications

Editor’s Note: In this excerpt from their book Embedded Systems Security, the authors analyze three popular, security-critical open source applications – Apache, OpenSSL, and sendmail – and demonstrate how static analysis of the underlying C code can find bugs that are often overlooked during manual inspection.

Many would argue that the code quality of some popular open source applications is expected to be relatively high. As one person put it, “By sharing source code, open source developers make software more robust. Programs get used and tested in a wider variety of contexts than one programmer could generate, and bugs get uncovered that otherwise would not be found.”[1]

Unfortunately, in a complex software application (such as Apache), it is simply not feasible for all flaws to be found by manual inspection. To help demonstrate the types of coding errors that can be efficiently detected and prevented using static source code analysis, we consider a case study of three popular, security-critical open source applications – Apache, OpenSSL, and sendmail – that were analyzed using Green Hills Software’s DoubleCheck static source code analyzer.

Apache is an open source hypertext transfer protocol (HTTP) server, the most popular in the world, powering a majority of the websites on the Internet. Given the ubiquity of Apache and the world’s dependence on the Internet, the reliability and security of Apache represent an important concern for all of us. A serious flaw in Apache could cause widespread inconvenience, financial loss, or worse. The Apache web server consists of approximately 200,000 lines of code, 80,000 individual executable statements, and 2,000 functions.

OpenSSL is an open source implementation of Secure Sockets Layer (SSL) and Transport Layer Security (TLS) as well as a comprehensive cryptographic algorithm library. TLS is the modern reimplementation of SSL, although SSL is often used as a general term covering both protocols.

SSL forms the basis of much of the secure communication on the Internet. For example, SSL is what enables users to send private credit card information securely from their browsers to an online merchant’s remote server. In addition to being intimately involved with data communication, OpenSSL contains implementations of a variety of cryptographic algorithms used to secure the data in transit. OpenSSL is available for Windows, but it is best known as the standard SSL implementation for Linux and UNIX worldwide.

In addition, because of its liberal licensing terms (not GPL), OpenSSL has been used as a basis for a number of commercial offerings. Like Apache, OpenSSL is a keystone of worldwide secure Internet communication. Flaws in this software could have widespread deleterious consequences. OpenSSL consists of approximately 175,000 lines of code, 85,000 individual executable statements, and 5,000 functions.

Sendmail is among the most popular electronic mail server software used on the Internet, although its use is in decline. It has been the de facto electronic mail transfer agent for UNIX (and subsequently, Linux) systems since the early 1980s. Given the dependence on electronic mail, the stability and security of sendmail are certainly an important concern for many.

The name sendmail might lead one to think that this application is not very complicated. Anyone who has ever tried to configure a sendmail server knows otherwise. In fact, sendmail consists of approximately 70,000 lines of code, 32,000 individual executable statements, and 750 functions.

Output of a Static Source Code Analyzer
Many leading source code analyzers generate an intuitive set of web pages, powered by an integrated web server. The developer can browse high-level summaries of the different flaws found by the analyzer and then click on hyperlinks to investigate specific problems.

Within a specific problem display, the error is displayed inline with the surrounding code, making it easy to understand. Function names and other objects are hyperlinked for convenient browsing of the source code. Since the web pages are running under a web server, the results can easily be shared and browsed by any member of the development team.

The following sections provide examples of actual flaws in Apache, OpenSSL, and sendmail that were discovered by DoubleCheck. The results are grouped by error type, with one or more examples of each error type per section.

Potential NULL Pointer Access
By far the most common flaw found by the analyzer in all three suites under testing was potential NULL pointer access. Many cases involved calls to memory allocation subroutines that were followed by accesses of the returned pointer without first checking for a NULL return. This is a robustness issue. Ideally, all memory allocation failures are handled gracefully. If there is temporary memory exhaustion, service may falter but not terminate. This is of particular importance to server programs such as Apache and sendmail. Algorithms can be introduced that prevent denial of service in overload conditions such as that caused by a malicious attack.

The Apache web server, sendmail, and OpenSSL all make profligate use of C runtime library dynamic memory allocation. Unlike Java, which performs automatic garbage collection, dynamic memory allocation using the standard C runtime requires that the application itself handle potential memory exhaustion errors. If a memory allocation fails and returns a NULL pointer, a subsequent unguarded reference of the pointer is all but guaranteed to cause a fatal crash.
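The guarded pattern can be sketched as follows. This is a minimal illustration, not code from Apache, sendmail, or OpenSSL: `struct table`, `table_create`, and `table_destroy` are hypothetical names. The point is only that each allocation result is tested before its first use and that failure is reported to the caller.

```c
#include <stdlib.h>

/* Hypothetical record type, used only for illustration. */
struct table {
    size_t count;
    int   *slots;
};

/*
 * Allocate a table with `count` zeroed slots.
 * Returns NULL if memory cannot be obtained, so the caller can degrade
 * gracefully (for example, reject one request) instead of crashing.
 */
struct table *table_create(size_t count)
{
    struct table *t = calloc(1, sizeof *t);
    if (t == NULL)
        return NULL;                 /* allocation failed: report it */

    t->slots = calloc(count, sizeof *t->slots);
    if (t->slots == NULL && count != 0) {
        free(t);                     /* do not leak the outer object */
        return NULL;
    }
    t->count = count;
    return t;
}

/* Release a table; passing NULL is allowed and does nothing. */
void table_destroy(struct table *t)
{
    if (t != NULL) {
        free(t->slots);
        free(t);
    }
}
```

A caller would test the result of `table_create` and take its own recovery path on NULL, which is the step missing in the examples that follow.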

In the Apache source file scoreboard.c, we have the following memory allocation statement:

ap_scoreboard_image = calloc(1,sizeof(scoreboard) + server_limit * sizeof(worker_score *) + server_limit * lb_limit * sizeof(lb_score *));

Clearly, the size of this memory allocation could be substantial. It would be a good idea to make sure that the allocation succeeds before referencing the contents of ap_scoreboard_image. However, soon after the allocation statement, we have this use:

ap_scoreboard_image->global = (global_score *)more_storage;

The de-reference is unguarded, making the application susceptible to a fatal crash. Another example from Apache can be found in the file mod_auth_digest.c:

entry = client_list->table[idx];
prev = NULL;
while (entry->next) { /* find last entry */
    prev = entry;
    entry = entry->next;
}

The variable entry is unconditionally dereferenced at the beginning of the loop. This alone would not cause the analyzer to report an error. At this point in the execution path, the analyzer has no specific evidence or hint that entry could be NULL or otherwise invalid. However, the following statement occurs after the loop:

if (entry) {

}

By checking for a NULL entry pointer, the programmer has indicated that entry could be NULL. Tracing backward, the analyzer now sees that the previous dereference to entry at the top of the loop is a possible NULL reference.

The following similar example was detected in the sendmail application, in the file queue.c, where the code unconditionally dereferences the pointer variable tempqfp:

errno = sm_io_error(tempqfp);

sm_io_error is a macro that resolves to a read of the tempqfp->f_flags field. Later in the same function, we have this NULL check:

if (tempqfp != NULL)
   sm_io_close(tempqfp, SM_TIME_DEFAULT);

In addition, there are no intervening writes to tempqfp after the previously noted dereference. The NULL check, of course, implies that tempqfp could be NULL; if that were ever the case, the code would fault. If the pointer can never in practice be NULL, then the extra check is unnecessary and misleading. What may seem harmless sloppiness can translate into catastrophic failure under certain conditions.

In sendmail, there are many other examples of unguarded pointer dereferences that are either preceded or followed by NULL checks.

The final example in this category comes from OpenSSL, in file ssl_lib.c:

if (s->handshake_func == 0) {
    SSLerr(SSL_F_SSL_SHUTDOWN, SSL_R_UNINITIALIZED);
}

Shortly thereafter, we have a NULL check of the pointer s:

if ((s != NULL) && !SSL_in_init(s))

Again, the programmer is telling us that s could be NULL, yet the preceding dereference is not guarded.
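The common thread in these examples is that the NULL test and the dereference appear in the wrong order. A minimal sketch of the consistent ordering, using a hypothetical `struct node` and `list_last` function that are not taken from any of the three code bases, might look like this:

```c
#include <stddef.h>

/* Hypothetical list node, used only for illustration. */
struct node {
    struct node *next;
    int value;
};

/*
 * Return the last node of the list, or NULL for an empty list.
 * The NULL test comes before the first dereference, so the guard and
 * the access agree about whether `head` may be NULL.
 */
struct node *list_last(struct node *head)
{
    if (head == NULL)
        return NULL;
    while (head->next != NULL)
        head = head->next;
    return head;
}
```

If the pointer genuinely can never be NULL, the alternative is to drop the check entirely, so that the code does not advertise a condition it fails to handle.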

Buffer Underflow
A buffer underflow is defined as an attempt to access memory before an allocated buffer or array. Similar to buffer overflow, buffer underflows cause insidious problems due to the unexpected corruption of memory. The following flaw in file queue.c in sendmail was discovered by static analysis:

if ((qd == -1 || qg == -1) &&
   type != 120)
  …
else {
  switch (type) {
  …
  case 120:
    if (bitset(QP_SUBXF,
      Queue[qg]->qg_qpaths[qd].qp_subdirs))
        …
  }

}

As you can see, the if statement implies that it is possible for qd or qg to be -1 when type is 120. But in the subsequent switch statement, always executed when type is 120, the Queue array is unconditionally indexed through the variable qg. If qg were -1, this would be an underflow. The program was not studied exhaustively to determine whether qg can indeed be -1 when type is 120 and hence reach the fault. However, if qg can’t be -1 when type is 120, then the initial if check is incorrect, misleading, and/or unnecessary.
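One way to rule out this class of underflow is to validate the index at the point of use rather than relying on an earlier, type-dependent check. The sketch below is illustrative only: `group_flags`, `NUM_GROUPS`, and `group_flag_get` are hypothetical and do not correspond to sendmail's actual data structures.

```c
#include <stddef.h>

#define NUM_GROUPS 4   /* hypothetical table size */

/* Hypothetical per-group flags, used only for illustration. */
static int group_flags[NUM_GROUPS] = { 1, 0, 1, 0 };

/*
 * Look up the flag for queue group `qg`.
 * Returns 0 on success and stores the flag in *out; returns -1 if the
 * index is negative (for example a -1 "not found" sentinel) or past
 * the end of the table, so the array is never indexed out of bounds.
 */
int group_flag_get(int qg, int *out)
{
    if (out == NULL || qg < 0 || qg >= NUM_GROUPS)
        return -1;
    *out = group_flags[qg];
    return 0;
}
```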

Another example of buffer underflow is found in file ssl_lib.c in OpenSSL:

p = buf;
sk = s->session->ciphers;
for (i = 0; i < sk_SSL_CIPHER_num(sk); i++) {
  …
  *(p++) = ':';
}
p[-1] = '\0';

The analyzer informs us that the underflow occurs when this code is called from file s_server.c. From a look at the call site in s_server.c, it is clear that the analyzer has detected that buf points to the beginning of a statically allocated buffer. Therefore, in the ssl_lib.c code, if there are no ciphers in the cipher stack sk, then the access p[-1] is an underflow. This demonstrates the need for an inter-module analysis, since there would be no way of knowing what buf referenced without examining the caller.

If it is the case that the number of ciphers cannot actually be 0 in practice, then the for loop should be converted to a do loop to make it clear that the loop must always be executed at least once (ensuring that p[-1] does not underflow).

Another problem is a potential buffer overflow. No check is made in the ssl_lib.c code to ensure that the number of ciphers does not exceed the size of the buf parameter. Instead of relying on convention, a better programming practice would be to pass in the length of buf and then add code to check that overflow does not occur.
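Both concerns, the empty list and the unchecked length, can be addressed in one routine. The following is a sketch under stated assumptions rather than a patch for OpenSSL: `join_names` is a hypothetical function that takes plain C strings instead of a cipher stack, receives the buffer length explicitly, and writes the separator before each item after the first so that no trailing character ever needs to be overwritten.

```c
#include <stddef.h>
#include <string.h>

/*
 * Join `n` strings with ':' into buf, which holds `buflen` bytes.
 * Returns buf on success, or NULL if an argument is missing or buf is
 * too small for the joined text plus its terminator.
 * An empty list yields an empty string rather than a write to buf[-1].
 */
char *join_names(const char *const *names, size_t n, char *buf, size_t buflen)
{
    size_t used = 0;

    if (buf == NULL || buflen == 0 || (names == NULL && n > 0))
        return NULL;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]);
        size_t sep = (i > 0) ? 1 : 0;   /* separator precedes items 2..n */

        /* Need room for separator, name, and the final '\0'. */
        if (len + sep >= buflen - used)
            return NULL;
        if (sep)
            buf[used++] = ':';
        memcpy(buf + used, names[i], len);
        used += len;
    }
    buf[used] = '\0';
    return buf;
}
```

With an 8-byte buffer, joining "AES" and "RC4" fits exactly; the same call with a 7-byte buffer returns NULL instead of writing past the end.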

Resource Leak
In file speed.c in OpenSSL:

fds=malloc(multi*sizeof *fds);

fds is a local pointer and is never used to free the allocated memory prior to return from the subroutine. Furthermore, fds is not saved in another variable where it could be later freed. Clearly, this is a memory leak. A simple denial-of-service attack on OpenSSL would be to invoke or cause to be invoked the speed command until all of memory is exhausted.
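A common C idiom for avoiding this kind of leak is to route every exit path after the allocation through a single cleanup point. The sketch below illustrates the idiom only: `run_workers` and `open_worker` are hypothetical stand-ins and are not the code in speed.c.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-worker setup; fails for any index >= fail_at. */
static int open_worker(size_t index, size_t fail_at)
{
    return (index < fail_at) ? 0 : -1;
}

/*
 * Set up `multi` workers using a temporary heap array.
 * Returns 0 on success and -1 on any failure. Every path taken after
 * a successful malloc passes through the free() at `cleanup`.
 */
int run_workers(size_t multi, size_t fail_at)
{
    int rc = -1;
    int *fds;

    /* Reject empty requests and sizes that would overflow size_t. */
    if (multi == 0 || multi > SIZE_MAX / sizeof *fds)
        return -1;

    fds = malloc(multi * sizeof *fds);
    if (fds == NULL)
        return -1;                  /* nothing to free yet */

    for (size_t i = 0; i < multi; i++) {
        fds[i] = open_worker(i, fail_at);
        if (fds[i] < 0)
            goto cleanup;           /* error path still frees fds */
    }
    rc = 0;

cleanup:
    free(fds);
    return rc;
}
```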

In addition to this case study, other commercial static code analyzers have been used successfully on large open source applications, including the Linux operating system, to locate numerous latent security vulnerabilities.

Numerous mechanisms are available to help in the struggle to improve software quality, including improved testing and design paradigms. But automated source code analyzers are one of the most promising technologies.

References:
1. DiBona C., Ockman S., Stone M.; Editors. Voices from the open source revolution. Sebastopol, Ca.; O’Reilly; 1999.

David Kleidermacher, Chief Technology Officer of Green Hills Software, joined the company in 1991 and is responsible for technology strategy, platform planning, and solutions design. He is an authority in systems software and security, including secure operating systems, virtualization technology, and the application of high robustness security engineering principles to solve computing infrastructure problems. Mr. Kleidermacher earned his bachelor of science in computer science from Cornell University.

This article is excerpted from Embedded Systems Security, by David and Mike Kleidermacher, and is used with permission from Newnes, a division of Elsevier. Copyright 2012. All rights reserved.
