Are text strings a vulnerability in embedded software?

For many years, the security of desktop computers has been a concern. Once a machine is connected to the Internet, there is intrinsically a possibility for some kind of attack. Such infiltrations might be to steal data, damage the system or change its operation in some way. Means of protection are well known and widely applied. Embedded systems always seemed to be immune from such problems, as they were rarely networked, and their code was normally in ROM of some kind. Things have changed. A large proportion of modern systems are connected to the Internet and it is common practice for code to be copied into RAM and executed from there. This means that security is now an important embedded software design consideration.

The English language is a great communications tool. It is a very expressive language that enables communication with great precision and subtlety. However, in everyday speech, most of us are lazy and often use words without 100% accuracy. The example I have in mind here is the way that safety and security are used almost interchangeably, as if they were synonyms. I think that the best definition that I have heard goes something like this: safety is the process of protecting the world from the device; security is protecting the device from the world. Security is my topic for today.

If a system really needs to be totally bullet-proof, industrial-grade encryption is called for. This normally requires specific hardware support, which, whilst readily available, might be considered overkill for an application where such high security is not necessary. In such cases, there are other options and that is what I would like to explore.

If a hacker can gain access to a device’s memory contents, they can start to figure out what it does and how it does it. This is the first stage in altering its operation. Code may be disassembled and, hence, the logic can be revealed. Without encryption, there is little that can be done to prevent this. The next thing the hacker might do is look at a hex/ASCII dump of the data and see what they can find there that makes sense. They are looking for patterns and recognizable structures. This is where some precautions may be taken. Whilst encryption may not be an option, obfuscation is a possibility.

The goal of data obfuscation is to delay or deter the hacker by simply making the data less recognizable for what it is. Scanning through a memory dump, one of the easy things to spot is text strings. So, this is what I will focus on here.
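To show how little effort it takes to spot ordinary C strings in a dump, here is a minimal sketch of the kind of scan a hacker might run. The function name `find_text_run` and its parameters are my own invention for illustration; it simply looks for a run of printable ASCII bytes followed by a null byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: searches dump[start..size) for the first run of at
   least min_len printable ASCII bytes that is followed by a null byte.
   Returns the offset of the run and stores its length in *run_len, or
   returns size if no such run is found. */
size_t find_text_run(const unsigned char *dump, size_t size,
                     size_t start, size_t min_len, size_t *run_len)
{
    size_t i = start;

    assert(dump != NULL && run_len != NULL);
    while (i < size)
    {
        size_t begin = i;

        /* Advance over consecutive printable bytes. */
        while (i < size && isprint(dump[i]))
            i++;
        /* A run counts only if it is long enough and null-terminated. */
        if (i < size && dump[i] == 0 && i - begin >= min_len)
        {
            *run_len = i - begin;
            return begin;
        }
        i++;  /* skip the byte that ended the run */
    }
    return size;
}
```

A plain "Hello" followed by a null byte is found immediately by a scan like this, which is exactly the pattern that obfuscation aims to break up.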

In C/C++ code, text strings are normally just sequences of bytes containing ASCII codes, terminated by a null byte. That is very easy to spot, so I will change it. First, instead of the null terminator, the first byte of each string will be a length specifier, which limits each string to 255 characters. The characters of the string will have their data scrambled slightly, to make them less familiar looking – all I will do is swap the two nibbles of each byte. I need a utility program that takes the plain text strings and generates, for each one, the declaration of an array with the appropriate initialization. Here is the function at the heart of this utility:

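#include <stdio.h>
#include <string.h>
void scramble(int index, unsigned char *input)
{
    unsigned char *charpointer, character;
    size_t length = strlen((const char *)input);
    /* Emit the declaration; the first byte holds the string length. */
    printf("unsigned char string%d[%u] = {0x%02x, ", index,
           (unsigned)(length + 1), (unsigned)length);
    charpointer = input;
    while (*charpointer)
    {
        character = *charpointer++;
        /* Swap the two nibbles of the byte. */
        character = ((character & 0x0f) << 4) |
                    ((character & 0xf0) >> 4);
        printf("0x%02x", character);
        if (*charpointer)
            printf(", ");
    }
    printf("};  // \"%s\"\n", input);
}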

If I passed this function an index of 4 and a string “Hello world” (original eh?), the output would be:

unsigned char string4[12] = {0x0b, 0x84, 0x56, 0xc6, 0xc6, 0xf6, 0x02, 0x77, 0xf6, 0x27, 0xc6, 0x46};  // "Hello world"

I can copy and paste this into my code, then all I need to do is write a function to unscramble the text when I need to display it. Instead of giving each string an index number, I could give it an arbitrary name by replacing the index parameter with a string. Note that the generated code is somewhat self-documenting, as the comment shows the string in a readable form, but, of course, this only appears in the source code. If the hacker has access to your source code, then you are in sufficient trouble that I am unable to help further!

Here is some code to illustrate the unscrambling process:

#include <stdio.h>
int main(void)
{
    unsigned char temp, buffer[50];
    int count = string4[0], index = 0;
    while (index < count)
    {
        temp = string4[index+1];
        temp = ((temp & 0x0f) << 4) | ((temp & 0xf0) >> 4);
        buffer[index] = temp;
        index++;
    }
    buffer[index] = 0;
    printf("-%s-\n", buffer);
    return 0;
}
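In a real application, the unscrambling would more likely live in a reusable function than in main(). The following is only a sketch of how that might look; the name `unscramble`, its parameters and the convention of returning -1 on failure are my assumptions, not part of the original utility. It also checks the destination size, since a length byte can describe up to 255 characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decodes a length-prefixed, nibble-swapped string
   into out as a null-terminated C string. Returns the number of characters
   written, or -1 if out cannot hold the text plus its terminator. */
int unscramble(const unsigned char *scrambled, char *out, size_t out_size)
{
    size_t count, i;

    assert(scrambled != NULL && out != NULL);
    count = scrambled[0];            /* first byte is the length */
    if (out_size < count + 1)
        return -1;
    for (i = 0; i < count; i++)
    {
        unsigned char c = scrambled[i + 1];

        /* Swapping the nibbles again restores the original character. */
        out[i] = (char)(((c & 0x0f) << 4) | ((c & 0xf0) >> 4));
    }
    out[count] = '\0';
    return (int)count;
}
```

A call such as `unscramble(string4, buffer, sizeof buffer)` would then fill buffer with the readable text, ready to be displayed.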

The swapping of nibbles in each byte is one of many different ways that the scrambling can be done. Another possibility is to, say, left-rotate each character by three bits. Here is some code to do just that:

unsigned char leftrotate3(unsigned char c)
{
    c = (c << 3) | (c >> 5);
    return c;
}
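A rotation is only useful if it can be undone when the text is needed. As a sketch, here is a generalized pair of 8-bit rotate functions; the names `rotl8` and `rotr8` are hypothetical. Rotating right by the same number of bits reverses a left rotation, so a character scrambled with a three-bit left rotation would be recovered with `rotr8(c, 3)`.

```c
#include <assert.h>

/* Rotate an 8-bit value left by n bits, where 0 <= n < 8. */
unsigned char rotl8(unsigned char c, unsigned n)
{
    assert(n < 8);
    /* c is promoted to int, so the cast discards the bits shifted
       beyond the low byte. */
    return (unsigned char)((c << n) | (c >> (8 - n)));
}

/* Rotate an 8-bit value right by n bits: the inverse of rotl8()
   for the same n. */
unsigned char rotr8(unsigned char c, unsigned n)
{
    assert(n < 8);
    return (unsigned char)((c >> n) | (c << (8 - n)));
}
```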

The obfuscation techniques that I have outlined scramble the string on a character by character basis. It would be possible to do things to the whole string instead. For example, treat the string as a long sequence of bits and rotate it an arbitrary number to the left. I will leave the coding of this algorithm to the more enthusiastic reader.
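For readers who would like a starting point, here is one possible sketch of the whole-string rotation. It is written under my own assumptions: the function name `rotate_bits_left` is hypothetical, the string is treated as a bit sequence with the most significant bit of the first byte first, and the result is written to a separate buffer to keep the logic simple.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rotates the len-byte sequence in[] left by shift
   bits, treating it as one long bit string (most significant bit of in[0]
   first), and writes the result to out[]. in and out must not overlap. */
void rotate_bits_left(const unsigned char *in, unsigned char *out,
                      size_t len, size_t shift)
{
    size_t i, byte_shift;
    unsigned bit_shift;

    assert(in != NULL && out != NULL && in != out);
    if (len == 0)
        return;
    shift %= len * 8;                 /* a full turn changes nothing */
    byte_shift = shift / 8;
    bit_shift = (unsigned)(shift % 8);
    for (i = 0; i < len; i++)
    {
        /* Each output byte is assembled from two neighbouring input
           bytes, wrapping around at the end of the string. */
        unsigned char hi = in[(i + byte_shift) % len];
        unsigned char lo = in[(i + byte_shift + 1) % len];

        out[i] = (unsigned char)((hi << bit_shift) |
                                 (lo >> (8 - bit_shift)));
    }
}
```

To unscramble, the same function can be called again with a shift of `len * 8 - shift`, which completes the full turn and restores the original bytes.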

It is worth noting that a side-effect of gathering all the text strings in one place is that making different versions of the software for other languages is quite straightforward.

I must reiterate and emphasize that data obfuscation is far from bullet-proof and will, at best, slow down the serious hacker. If nothing else, the unscrambling code could be disassembled. The trick with this technique is to make the obfuscation a difficult trail to follow. If you really need greater security, you must look at full encryption.
