Practical Dynamic Taint Analysis for Countering Attacks on Web

Web applications play a critical role in many areas, such as financial transactions, commercial business, and cyber community services. As a consequence, they have also become attractive targets for attackers.

Most efforts have gone into detecting and preventing buffer overflow attacks. There has been less research on attacks that exploit input validation vulnerabilities, which have been shown to be a more significant problem in web applications. Input validation attacks are the No. 1 type of web attack.

Web applications usually take inputs from users through form fields, cookies, or other standard channels, and use these input data in further processing operations, such as querying databases, generating web pages, or executing commands. Because the input data come from remote users and may contain malicious values, they must be validated before use.

If a web application fails to do so, attackers can exploit the resulting vulnerabilities. Popular input validation attacks include SQL injection, cross-site scripting, and command injection. These attacks can cause serious problems, such as leaks of sensitive information and corruption of critical data.
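To make the SQL injection case concrete, here is a minimal sketch of how unvalidated input can change the structure of a query. The table, column names, and the `build_query` and `run` helpers are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import sqlite3


def build_query(username: str) -> str:
    # Deliberately vulnerable: the input is pasted into the SQL text unchecked.
    return "SELECT secret FROM users WHERE name = '" + username + "'"


def run(username: str) -> list:
    # Hypothetical in-memory database with two users.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    rows = conn.execute(build_query(username)).fetchall()
    conn.close()
    return rows


benign = run("alice")             # matches only alice's row
malicious = run("x' OR '1'='1")   # the injected OR clause matches every row
print(len(benign), len(malicious))
```

The second call returns every user's secret because the quote characters in the input end the string literal early and turn the rest of the input into SQL syntax.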

One way to defend against these attacks is to properly sanitize user input, for example by ensuring that untrusted input contains no special characters or SQL keywords. This can work, but it requires developers to perform the checks themselves.
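A hand-written check of this kind might look like the following sketch. The keyword list, the allowed character set, and the function name `is_acceptable` are assumptions made for illustration, not a recommended or complete filter.

```python
import re

# Hypothetical, deliberately small keyword list.
SQL_KEYWORDS = {"select", "union", "insert", "update", "delete", "drop", "or", "and"}

# Hypothetical whitelist: letters, digits, underscore, and space only.
SAFE_CHARS = re.compile(r"[A-Za-z0-9_ ]*")


def is_acceptable(value: str) -> bool:
    """Reject input with special characters or SQL keywords as whole words."""
    if not SAFE_CHARS.fullmatch(value):
        return False
    return not any(word in SQL_KEYWORDS for word in value.lower().split())


print(is_acceptable("alice"))          # plain identifier
print(is_acceptable("x' OR '1'='1"))   # contains quotes
print(is_acceptable("O'Brien"))        # legitimate name, but contains a quote
```

Note that the last input is a legitimate value that the filter still rejects, and that every input channel of the application has to call such a check for it to help.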

As illustrated by the buffer overflow problems, developers often do not have the time, know-how, or willingness to check their code for all security vulnerabilities. Even if they do introduce security checks in their code, these checks may be incomplete, still leaving the door open for attackers to sneak through. Consequently, it is important to develop techniques that can largely automate the process.

Several static analysis approaches have been proposed to address this problem. These approaches use type analysis or pointer analysis on the source code to statically examine whether a user input will be used in security-sensitive operations along some execution path in a program. Any such instance is reported as a possible vulnerability. Because static analysis always needs to make approximations, the results are usually not very accurate and the false alarm rate can be very high.

Furthermore, it is legitimate for web applications to use user inputs even in security-sensitive operations, as long as these inputs are properly sanitized. Reporting every flow from a user input to a security-sensitive operation as a vulnerability therefore makes the analysis results even less useful.

Recently, dynamic information flow tracking techniques have been proposed to detect control-flow hijacking attacks, such as buffer overflow and format string attacks. In this paper, we present a runtime information flow tracking technique to detect input validation attacks in web applications.

More specifically, our technique is a dynamic taint analysis: it tracks the flow of taint information from untrusted input into parts of the generated output (or commands). A unique benefit of our approach is that it can be applied to any web application development language whose interpreter is implemented in C. We demonstrate this by applying our technique to web applications that use PHP and bash scripts.
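The idea of tracking taint into parts of the output can be sketched as follows. This is only an illustration in Python, not the paper's implementation (which operates on C programs); the per-character granularity, the `Tainted` type, and the policy in `is_attack` are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Tainted:
    text: str
    taint: Tuple[bool, ...]  # one flag per character: True means untrusted


def trusted(s: str) -> Tainted:
    return Tainted(s, (False,) * len(s))


def untrusted(s: str) -> Tainted:
    return Tainted(s, (True,) * len(s))


def concat(*parts: Tainted) -> Tainted:
    # Taint propagates with the data: each character keeps its own flag.
    text = "".join(p.text for p in parts)
    taint = tuple(flag for p in parts for flag in p.taint)
    return Tainted(text, taint)


def tainted_spans(value: Tainted) -> List[str]:
    # Collect the maximal runs of tainted characters in the output.
    spans, current = [], []
    for ch, flag in zip(value.text, value.taint):
        if flag:
            current.append(ch)
        elif current:
            spans.append("".join(current))
            current = []
    if current:
        spans.append("".join(current))
    return spans


def is_attack(query: Tainted) -> bool:
    # Hypothetical policy: tainted characters may appear in the query,
    # but must not include quotes or statement separators.
    return any(flag and ch in "'\";" for ch, flag in zip(query.text, query.taint))


prefix = trusted("SELECT secret FROM users WHERE name = '")
ok = concat(prefix, untrusted("alice"), trusted("'"))
bad = concat(prefix, untrusted("x' OR '1'='1"), trusted("'"))
print(is_attack(ok), is_attack(bad))
```

Unlike a check that flags any flow from input to a sensitive operation, this sketch lets untrusted data reach the query and raises an alarm only when tainted characters land in positions that alter the command's structure.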

Our technique is implemented as a fully automatic source-to-source transformation. We present an experimental evaluation to establish the effectiveness of our approach, paying particular attention to its attack detection ability. The experiments show that our technique accurately detected all the attacks, with no false alarms.

To read this external content in full, download the complete paper from the author's online archives at Stony Brook University.
