Using static analysis to make open source Web applications more secure

David and Mike Kleidermacher, Green Hills Software

November 25, 2013

Editor’s Note: In this excerpt from their book Embedded Systems Security, the authors demonstrate how static analysis can be used to find and eliminate coding errors. As their case study, they take three popular safety-critical open source applications (Apache, OpenSSL, and sendmail) and analyze them using Green Hills’ DoubleCheck analyzer.

The Apache open source hypertext transfer protocol (HTTP) server is the most popular web server in the world, powering a majority of the websites on the Internet. Given the ubiquity of Apache and the world’s dependence on the Internet, the reliability and security of Apache represent an important concern for all of us. A serious flaw in Apache could cause widespread inconvenience, financial loss, or worse. The Apache web server consists of approximately 200,000 lines of code, 80,000 individual executable statements, and 2,000 functions.

OpenSSL is an open source implementation of Secure Sockets Layer (SSL) and Transport Layer Security (TLS) as well as a comprehensive cryptographic algorithm library. TLS is the modern reimplementation of SSL, although SSL is often used as a general term covering both protocols.

SSL forms the basis of much of the secure communication on the Internet. For example, SSL is what enables users to send private credit card information securely from their browsers to an online merchant’s remote server. In addition to being intimately involved with data communication, OpenSSL contains implementations of a variety of cryptographic algorithms used to secure the data in transit.

OpenSSL is available for Windows, but it is best known as the standard SSL implementation for Linux and UNIX worldwide. In addition, because of its liberal licensing terms (not GPL), OpenSSL has been used as a basis for a number of commercial offerings. Like Apache, OpenSSL is a keystone of worldwide secure Internet communication.

Flaws in this software could have widespread deleterious consequences. OpenSSL consists of approximately 175,000 lines of code, 85,000 individual executable statements, and 5,000 functions.

Although its use is in decline, sendmail is among the most popular electronic mail server software packages on the Internet. Sendmail has been the de facto electronic mail transfer agent for UNIX (and subsequently, Linux) systems since the early 1980s.

Given the dependence on electronic mail, the stability and security of sendmail are certainly an important concern for many. The name sendmail might lead one to think that this application is not very complicated. Anyone who has ever tried to configure a sendmail server knows otherwise. Sendmail consists of approximately 70,000 lines of code, 32,000 individual executable statements, and 750 functions.

Output of a Static Source Code Analyzer
Many leading source code analyzers generate an intuitive set of web pages, powered by an integrated web server. The developer can browse high-level summaries of the different flaws found by the analyzer and then click on hyperlinks to investigate specific problems.

Within a specific problem display, the error is displayed inline with the surrounding code, making it easy to understand. Function names and other objects are hyperlinked for convenient browsing of the source code. Since the web pages are running under a web server, the results can easily be shared and browsed by any member of the development team.

The following sections provide examples of actual flaws in Apache, OpenSSL, and sendmail that were discovered by DoubleCheck. The results are grouped by error type, with one or more examples of each error type per section:

  1. potential NULL pointer access;
  2. buffer underflow; and
  3. resource leaks.

Potential NULL Pointer Access
By far the most common flaw found by the analyzer in all three code bases under test was potential NULL pointer access. Many cases involved calls to memory allocation subroutines that were followed by accesses of the returned pointer without first checking for a NULL return.

This is a robustness issue. Ideally, all memory allocation failures are handled gracefully. If there is temporary memory exhaustion, service may falter but not terminate. This is of particular importance to server programs such as Apache and sendmail. Algorithms can be introduced that prevent denial of service in overload conditions such as that caused by a malicious attack.

The Apache web server, sendmail, and OpenSSL all make profligate use of C runtime library dynamic memory allocation. Unlike Java, which performs automatic garbage collection, dynamic memory allocation using the standard C runtime requires that the application itself handle potential memory exhaustion errors. If a memory allocation fails and returns a NULL pointer, a subsequent unguarded reference of the pointer is all but guaranteed to cause a fatal crash.
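
To illustrate the guarded pattern, here is a minimal sketch in C. The type table_t and the functions table_create and table_destroy are hypothetical and do not come from Apache, OpenSSL, or sendmail; the only point is that each allocation result is tested before it is used.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a large shared table; not a real Apache type. */
typedef struct {
    size_t slot_count;
    int   *slots;
} table_t;

/*
 * Allocate a table with slot_count zeroed slots.
 * Returns NULL on allocation failure instead of dereferencing a NULL
 * result, so the caller can shed load or retry rather than crash.
 */
table_t *table_create(size_t slot_count)
{
    table_t *t = calloc(1, sizeof(*t));
    if (t == NULL) {
        return NULL;            /* guard: t is never touched past here */
    }

    t->slots = calloc(slot_count, sizeof(*t->slots));
    if (t->slots == NULL && slot_count != 0) {
        free(t);                /* undo the partial allocation */
        return NULL;
    }

    t->slot_count = slot_count;
    return t;
}

/* Release a table; accepts NULL so error paths can call it freely. */
void table_destroy(table_t *t)
{
    if (t != NULL) {
        free(t->slots);
        free(t);
    }
}
```

A caller would test the result of table_create and, on NULL, reject or defer the request rather than continue with an invalid pointer.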

In the Apache source file scoreboard.c, we have the following memory allocation statement:

ap_scoreboard_image =
  calloc(1,sizeof(scoreboard) + server_limit *
  sizeof(worker_score *) + server_limit *
  lb_limit * sizeof(lb_score *));

Clearly, the size of this memory allocation could be substantial. It would be a good idea to make sure that the allocation succeeds before referencing the contents of ap_scoreboard_image. However, soon after the allocation statement, we have this use:

ap_scoreboard_image->global = (global_score

The dereference is unguarded, making the application susceptible to a fatal crash. Another example from Apache can be found in the file mod_auth_digest.c:

entry = client_list->table[idx];
prev = NULL;

while (entry->next) { /* find last entry */
    prev = entry;
    entry = entry->next;
}

The variable entry is unconditionally dereferenced at the beginning of the loop. This alone would not cause the analyzer to report an error. At this point in the execution path, the analyzer has no specific evidence or hint that entry could be NULL or otherwise invalid. However, the following statement occurs after the loop:

if (entry) {

By checking for a NULL entry pointer, the programmer has indicated that entry could be NULL. Tracing backward, the analyzer now sees that the previous dereference to entry at the top of the loop is a possible NULL reference.
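
One way to remove this kind of inconsistency is to place the NULL test ahead of the first dereference. The sketch below assumes a hypothetical struct node and function list_last; it is not the mod_auth_digest.c code, only an illustration of a traversal whose check and loop agree about whether the pointer may be NULL.

```c
#include <stddef.h>

/* Hypothetical list node; the field names are illustrative only. */
struct node {
    int          key;
    struct node *next;
};

/*
 * Return the last node of a singly linked list, or NULL for an empty list.
 * The NULL test comes before the first dereference, so a later
 * "if (result)" check in the caller is consistent with the loop.
 */
struct node *list_last(struct node *head)
{
    if (head == NULL) {
        return NULL;
    }
    while (head->next != NULL) {
        head = head->next;
    }
    return head;
}
```

If the pointer truly can never be NULL at that point, the alternative is to drop the later check, so that the code states a single, consistent assumption.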

The following similar example was detected in the sendmail application, in the file queue.c, where the code unconditionally dereferences the pointer variable tempqfp:

errno = sm_io_error(tempqfp);

sm_io_error is a macro that resolves to a read of the tempqfp->f_flags field. Later in the same function, we have this NULL check:

if (tempqfp != NULL) sm_io_close(tempqfp,

In addition, there are no intervening writes to tempqfp after the previously noted dereference. The NULL check, of course, implies that tempqfp could be NULL; if that were ever the case, the code would fault. If the pointer can never in practice be NULL, then the extra check is unnecessary and misleading. What may seem harmless sloppiness can translate into catastrophic failure under certain conditions.
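
Accessor macros make this kind of flaw easy to miss, because the dereference does not appear at the call site. The following sketch uses a hypothetical struct file_handle, flag FH_ERROR_FLAG, and macro fh_error (none of them sendmail's real definitions) to show a field-reading macro and a wrapper that checks the pointer first.

```c
#include <stddef.h>

/* Hypothetical file handle; not sendmail's real file structure. */
struct file_handle {
    int flags;
};

#define FH_ERROR_FLAG 0x1

/*
 * Accessor macro in the style described above: it reads a field, so it
 * dereferences its argument even though no "->" is visible to the caller.
 */
#define fh_error(fp) ((fp)->flags & FH_ERROR_FLAG)

/*
 * Guarded wrapper: returns -1 when fp is NULL, otherwise 1 if the error
 * flag is set and 0 if it is clear.
 */
int fh_error_checked(const struct file_handle *fp)
{
    if (fp == NULL) {
        return -1;
    }
    return fh_error(fp) ? 1 : 0;
}
```

The choice of -1 as the NULL result is an assumption made for this example; a real code base would follow its own error conventions.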

In sendmail, there are many other examples of unguarded pointer dereferences that are either preceded or followed by NULL checks.

The final example in this category comes from OpenSSL, in file ssl_lib.c:

if (s->handshake_func == 0) {     

Shortly thereafter, we have a NULL check of the pointer s:

if ((s != NULL) && !SSL_in_init(s))

Again, the programmer is telling us that s could be NULL, yet the preceding dereference is not guarded.
