Scrambled eggs

You have probably heard the term Easter egg: an intentionally undocumented message, joke, or capability that a program's developers insert as an added challenge to the user, or simply for fun. Easter eggs are commonly found in video games. The Linux application-packaging tool, apt-get, has this bovine egg:

     > apt-get moo
              (__)
              (oo)
        /------\/
       / |    ||
      *  /\---/\
         ~~   ~~
     ..."Have you mooed today?"...

Cute. Funny. But what if a developer aims to insert something malicious: a rotten egg? How can your organization be protected from this insider threat? How can you ensure that malware is not inserted by third-party middleware or the compiler used to build the software? Developers and users require something I call assured bit provenance: confidence that every single binary bit of production software originates from its corresponding known good version of source code.

This is a critical aspect of software security that many developers never consider. High assurance security and safety standards, such as DO-178B Level A (aircraft safety) and Common Criteria Evaluated Assurance Level 7 (IT security), require the ability to recreate the exact bits of the final production software from the configuration management system. Ancillary items, not just product source code, must be configuration managed. For example, any scripts used in the creation of production images must be strictly controlled. And the tool chain used to build the software must also be covered. Failure to rigorously control the entire development system can lead to serious vulnerabilities, both inadvertent and malicious.
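As a small illustration of the configuration-management discipline described above, a build system can record a cryptographic digest of every build input (product sources, build scripts, and tool-chain binaries alike) and refuse to trust a production image whose inputs no longer match. This is only a minimal sketch, not any particular product's mechanism; the file name and contents are hypothetical:

```python
import hashlib
import os
import tempfile

def digest(path):
    """SHA-256 of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def build_manifest(paths):
    """Record a digest for every build input: sources, scripts, tools."""
    return {p: digest(p) for p in paths}

def verify(manifest):
    """True only if every recorded input still hashes to its recorded value."""
    return all(digest(p) == h for p, h in manifest.items())

# Demo: snapshot one hypothetical "source file", then tamper with it.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "login.c")
with open(src, "w") as f:
    f.write("int login(void);\n")
manifest = build_manifest([src])
ok_before = verify(manifest)    # nothing changed: verification succeeds
with open(src, "a") as f:
    f.write("/* back door */\n")
ok_after = verify(manifest)     # file modified: verification fails
```

The same manifest idea extends naturally to the compiler and scripts: if any byte of any input changes, the digest mismatch is detected before the image is released.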

The Thompson Hack
One such subversion was performed by Ken Thompson and famously reported in his Turing Award lecture, "Reflections on Trusting Trust." Thompson inserted a back door into UNIX in a most clever way, and it makes for a good story. Thompson's modification caused the UNIX login password verification to match on a string of Thompson's choosing in addition to performing the normal database validation. In essence, Thompson changed the UNIX login program, which used to look something like this:

    int login(unsigned int uid, char *password)
    {
        if (strcmp(pass_dbase(uid), password) == 0)
            return true;  // password match, login ok
        else
            return false; // password mismatch, login fail
    }

into something that looks like this:

    int login(unsigned int uid, char *password)
    {
        if (strcmp(pass_dbase(uid), password) == 0 ||
                strcmp("ken_thompson", password) == 0)
            return true;  // password match, login ok
        else
            return false; // password mismatch, login fail
    }

However, changing the UNIX source code would be too easy to detect, so Thompson modified the compiler to insert the back door. With compiler insertion, examination of the UNIX source code would not be sufficient to detect the bug. The compiler Trojan would be a code fragment that examines the internal syntax tree of the compiled program looking for the specific login password check code sequence and replacing it with the back door:

    if (!strcmp(function_name(), "login")) {
        if (OBJ_TYPE(obj) == IF_STATEMENT &&
                OBJ_TYPE(obj->left) == FUNCTION &&
                !strcmp(OBJ_NAME(obj->left), "strcmp")) {
            Object func = GET_ARG(1, obj->left);
            if (OBJ_TYPE(func) == FUNCTION &&
                    !strcmp(OBJ_NAME(func), "pass_dbase")) {
                // insert back door
                obj = MAKEOBJ(ORCMP, obj,
                          MAKEOBJ(FUNCTION, "strcmp",
                              MAKEOBJ(STRING, "ken_thompson"),
                              GET_ARG(2, obj->left)));
            }
        }
    }

If the compiler is configuration managed and/or peer reviewed, Thompson's change might be detected by inspection. But if the compiler source code is not under configuration management, or is very complicated, the Trojan could go unnoticed for some time. And who would think to question code committed by the esteemed Ken Thompson? One lesson is that those with the most trust can cause the most damage: another argument for enforcing a least-privilege mentality throughout the engineering department.

Assuming that the Trojan in the compiler might be detected, Thompson took his attack a step further and taught the compiler to add the Trojan into itself (two levels of indirection). In other words, the above compiler Trojan was inserted not into the source code of the compiler but rather into the object code of the compiler when compiling itself. The Trojan is now said to be self-reproducing. While this may sound sophisticated, it really is not difficult once you have a basic understanding of how the compiler works (which, of course, Ken Thompson did): simply locate the appropriate compiler phase and insert the above code fragment into the target's syntax tree when the target is the compiler.

There are ways to detect this more advanced attack. Code inspection is again an obvious method. Another approach is to build the compiler from source (under configuration management), build from the same source again with this new compiler, and require that the two built compiler binaries be bit-for-bit identical. This binary tool comparison method, called bootstrapping, is a cheap and effective way to detect some classes of development tool vulnerabilities. With Thompson's Trojan just inserted into the compiler source code, the first-generation binary contains the Trojan only as faithfully compiled source; when that binary compiles the same source again, the Trojan activates and injects an extra copy of itself, so the second-generation binary differs from the first and the bootstrap test fails. Of course, this approach only works if the compiler and the compiler's compiler have the same target processor back end; i.e., the compiler is self-hosting. Since most UNIX systems have self-hosting compilers, this generational test is effective.

However, to cover his tracks even further, Thompson removed the compiler’s Trojan source code, leaving only a subverted compiler binary which was installed as the default system compiler. Subsequent bootstrapping tests would fail to detect the subversion since both the first and second-generation compilers contain the subversion.
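The whole progression (Trojan in the compiler source, then Trojan surviving only in the binary) can be simulated abstractly. In this toy model, which is purely illustrative and not Thompson's actual code, a "binary" is just a digest of the code a compiler would emit, plus a flag marking subversion:

```python
import hashlib

def compile_with(compiler, source):
    """Toy compiler model: a 'binary' is (digest, is_trojaned).

    A subverted compiler recognizes compiler source and injects the
    Trojan into its output, reproducing itself."""
    emitted = source
    trojaned = "TROJAN" in source               # Trojan written in the source
    if compiler[1] and "compiler" in source:    # self-reproduction step
        emitted += "<injected Trojan>"
        trojaned = True
    return (hashlib.sha256(emitted.encode()).hexdigest(), trojaned)

trusted = ("trusted-binary", False)             # known-good default compiler
trojaned_src = "compiler source with TROJAN"
clean_src = "compiler source"

# Stage 1 of the attack: Trojan still present in the compiler source.
gen1 = compile_with(trusted, trojaned_src)
gen2 = compile_with(gen1, trojaned_src)
caught = gen1[0] != gen2[0]      # generations differ: bootstrap test fails

# Stage 2: Trojan source removed; the subverted binary is now the
# default compiler.
gen1 = compile_with(gen2, clean_src)
gen2 = compile_with(gen1, clean_src)
missed = gen1[0] == gen2[0] and gen1[1]   # bit-identical, yet both subverted
```

The simulation shows both halves of the story: the bootstrap test catches a Trojan that still lives in source, but once only the binary is subverted, every generation reproduces the Trojan identically and the comparison passes.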

This attack shows how sophisticated attackers can thwart even good development tool security. Ideally, we would like to formally prove correspondence between a tool’s source code and its resulting compiled object code. A practical alternative is to require a bootstrap every time a default tool chain component is replaced. Performing the bootstrap test for every new compiler will generate a chain of trust that would have prevented Thompson’s subversion if this process had been in place prior to his attack.  

Modified condition/decision coverage (MC/DC) validation of the UNIX login program would also have detected the Thompson hack, since the comparison against Thompson's back-door password string can never succeed during normal testing. However, a good defense-in-depth strategy should not assume that testing will find all forms of development tool subversion.
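The coverage argument can be sketched with a toy model. This Python stand-in for the C routine (all names illustrative) shows why the back-door condition is impossible to exercise from legitimate test vectors, which is exactly what a coverage analysis would flag:

```python
# Python stand-in for the subverted C login routine.
def login(uid, password, pass_db):
    cond_a = pass_db.get(uid) == password    # legitimate database check
    cond_b = password == "ken_thompson"      # inserted back door
    return cond_a or cond_b

# MC/DC requires each condition to independently flip the decision.
# For cond_b, that means a test case with cond_a False and cond_b True,
# i.e. a vector containing the back-door string itself.  A test suite
# drawn only from the legitimate password database cannot provide one.
pass_db = {1: "s3cret"}
suite = [(1, "s3cret"), (1, "wrong")]        # plausible legitimate suite

cond_b_ever_true = any(pw == "ken_thompson" for _, pw in suite)
mcdc_gap = not cond_b_ever_true              # coverage tool flags cond_b
```

The unreachable condition is the tell: structural coverage analysis reports that cond_b never independently affected the outcome, prompting an inspection of exactly the line that hides the back door.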

Of course, the configuration management system must be protected from tampering, either via remote network attack or by physical attack of the computers that house the configuration system. Some organizations may go as far as to require that active configuration management databases be kept in a secure vault, accessible only by authorized security personnel.

If your organization is not thinking about development security, now’s the time to start. Happy egg hunting!

Dave Kleidermacher is CTO of Green Hills Software. He writes about security issues, sharing his insights on techniques to improve the security of software for highly critical embedded systems.
