Decompiling the ARM architecture code

Serge Sourjko and Robert Krten

March 08, 2010

At UBM TechInsights, we're often tasked with proving patent infringement of a software algorithm as part of our IP Management Services. An embedded algorithm can be anything from a sensing technique in an appliance to motor control, a power management scheme, a navigation algorithm, UI control, or a file system on a higher-end embedded device, to name a few examples. Investigating a possible patent infringement is one of the few cases where reverse engineering software is legal in spite of any license agreement to the contrary.

An issue for projects of this nature is that most modern machine code is produced from C or C++, and the process of generating machine code by an optimizing compiler is very sophisticated. Therefore, looking at low-level (machine or assembly language) instructions is a cumbersome and error-prone way of ascertaining infringement.

Decompilation is the process of taking machine language instructions and translating them into a higher-level language representation. It is more typically used to analyze computer viruses and malware and, sometimes, to recover lost source code or build a compatible product. One popular example comes from Hex-Rays, which sells a very good decompiler for the i386 platform as a plug-in for its IDA Pro disassembler.

Our example for this article is based on one of the most popular assembly languages for high-volume, high-value consumer electronics and many other embedded devices: the ARM architecture. We found that the available decompilers for ARM produce poor-quality code, so we adapted and expanded the open source "Desquirr" decompiler for our needs.

Example

First, let's consider a simple C program:

    main (int argc, char **argv)
    {
        int i;
        int sum;

        sum = 0;
        for (i = 0; i < 1000; i++) {
            sum += i;
        }
        printf ("The sum of 0..999 is %d\n", sum);
    }

The program demonstrates assignments, mathematical operations, variables, a comparison operation and conditional flow.

Compilation and output using stock Desquirr

Compiling this program with gcc for an ARM32 target (with maximum optimization, -O3) yields the following:

    main:
            MOV     R1, #0
            MOV     R0, #0x3E4
            MOV     R3, R1
            ADD     R2, R0, #3
    loc_8478:
            ADD     R1, R1, R3
            ADD     R3, R3, #1
            CMP     R3, R2
            BLE     loc_8478
            LDR     R0, =aTheSumOf0__999
            B       .printf
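To make the listing easier to follow, here is a hand-written C transliteration of it. It is a reading aid, not the output of Desquirr or any other decompiler, and the function name sum_like_listing is our own invention. Each variable is named after the register it mirrors. Note that the compiler builds the loop bound 999 (0x3E7) as 0x3E4 + 3, because 0x3E7 does not fit the ARM data-processing immediate format (an 8-bit value rotated by an even amount), while 0x3E4 does. The test has also moved to the bottom of the loop, so the for loop has become a do-while.

```c
/* Hand transliteration of the ARM listing; one C statement per instruction.
   Variable names mirror the registers. Hypothetical helper, not tool output. */
int sum_like_listing(void)
{
    int r1 = 0;          /* MOV R1, #0      : sum = 0                  */
    int r0 = 0x3E4;      /* MOV R0, #0x3E4  : 996, an encodable value  */
    int r3 = r1;         /* MOV R3, R1      : i = 0                    */
    int r2 = r0 + 3;     /* ADD R2, R0, #3  : loop bound, 999          */

    do {                 /* loc_8478:                                  */
        r1 = r1 + r3;    /* ADD R1, R1, R3  : sum += i                 */
        r3 = r3 + 1;     /* ADD R3, R3, #1  : i++                      */
    } while (r3 <= r2);  /* CMP R3, R2 ; BLE loc_8478                  */

    /* In the listing, R1 (the sum) is already in place as the second
       printf argument; LDR loads the format string into R0 and the
       final B is a tail call to printf. Here we simply return the sum. */
    return r1;
}
```

Calling sum_like_listing() yields 499500, the same value the original for loop computes for i from 0 to 999.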
