Using your C-compiler to minimize code size.
Low-Level Code Compression. A common transformation on
the target code level is to find common sequences of instructions from
several functions, and break them out into subroutines. This
transformation can be very effective at shrinking the executable code
of a program, at the cost of performing more jumps (note that this
transformation only introduces machine-level subroutine calls and not
full-strength function calls). Experience shows a gain of 10 to 30% for
this transformation.
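The compiler applies this to machine instructions, but the idea can be sketched at source level. In the hypothetical example below, two functions end in the same clamp-and-scale sequence, which has been factored out into one helper; all names are invented for illustration and do not come from any particular compiler.

```c
#include <stdint.h>

/* Shared sequence, stored once. At the machine level the compiler would
   reach it with a plain subroutine call, not a full C function call. */
static uint8_t clamp_and_scale(int16_t v)
{
    if (v < 0)   v = 0;
    if (v > 255) v = 255;
    return (uint8_t)(v >> 1);
}

/* Both callers previously contained their own copy of the sequence above. */
uint8_t read_sensor_a(int16_t raw) { return clamp_and_scale((int16_t)(raw + 3)); }
uint8_t read_sensor_b(int16_t raw) { return clamp_and_scale((int16_t)(raw - 7)); }
```

The saving grows with the length of the shared sequence and the number of places it occurs, while each use pays for one extra call and return.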
Linker. The linker should be considered an integral
part of the compilation system, since there are some transformations
that are performed in the linker. The most basic embedded-systems
linker should remove all unused functions and variables from a program,
and only include the parts of the standard libraries that are actually
used.
The granularity at which program parts are discarded varies, from
files or library modules down to individual functions or even snippets
of code. The smaller the granularity, the better the linker.
Unfortunately, some linkers derived from desktop systems work on a
per-file basis, which produces unnecessarily large code.
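As one illustration of function-level granularity, the sketch below assumes a GCC and GNU ld toolchain, where `-ffunction-sections`, `-fdata-sections`, and `--gc-sections` let the linker drop individual unused functions. The file and function names are hypothetical, and other toolchains use different options or do this by default.

```c
/* Build sketch, assuming GCC and GNU ld:
 *   gcc -Os -ffunction-sections -fdata-sections -c util.c
 *   gcc -Wl,--gc-sections main.o util.o -o app
 * Each function gets its own section, so the linker can discard
 * the ones that are never referenced. */
#include <stdint.h>

/* Referenced by the application: kept in the final image. */
uint16_t checksum(const uint8_t *buf, uint16_t len)
{
    uint16_t sum = 0;
    for (uint16_t i = 0; i < len; i++)
        sum = (uint16_t)(sum + buf[i]);
    return sum;
}

/* Never referenced: a per-file linker keeps it because checksum() lives
   in the same file; a function-level linker can remove it. */
uint16_t checksum_inverted(const uint8_t *buf, uint16_t len)
{
    return (uint16_t)~checksum(buf, len);
}
```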
Some linkers also perform post-compilation transformations on the
program. Common transformations are the removal of unnecessary bank and
page switches (which cannot be done at compile time, since the exact
allocation of variable addresses is unknown at that point) and code
compression, as discussed above, extended to the entire program.
Controlling Compiler Optimization
A compiler can be instructed to compile a program with different goals,
usually speed or size. For each setting, a set of transformations has
been selected that tend to work towards the goal—maximal speed (minimal
execution time) or minimal size. The settings should be considered
approximate. To give better control, most compilers also allow
individual transformations to be enabled or disabled.
For size optimization, the compiler uses a combination of
transformations that tend to generate smaller code, but the result is
not guaranteed to be the smallest possible, because it depends on the
characteristics of the compiled program. As an example, function
inlining is more aggressive for speed optimization, and this makes some
programs smaller on the speed setting than on the size setting.
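A minimal sketch of why inlining can shrink code, using a hypothetical `device_t` type: the body of the accessor is a single load and mask, which on many targets is shorter than the argument setup, call, and return it replaces. Whether this holds depends on the target and compiler, so treat it as an illustration only.

```c
#include <stdint.h>

/* Hypothetical device descriptor, invented for this example. */
typedef struct { uint8_t flags; } device_t;

/* Tiny body: inlining it removes the call overhead at every use. */
static inline uint8_t device_ready(const device_t *d)
{
    return (uint8_t)(d->flags & 0x01u);
}

uint8_t both_ready(const device_t *a, const device_t *b)
{
    /* With inlining, this becomes two loads, two masks, and a test. */
    return (uint8_t)(device_ready(a) && device_ready(b));
}
```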
The example data below demonstrates this. The two programs
were compiled with the same version of the same compiler, using the
same memory and data model settings, but optimizing for speed or size:
Program 1 gets slightly smaller with speed optimization, while
program 2 gets considerably larger, an effect we traced to function
inlining happening to pay off on program 1. The conclusion is that one
should always try compiling a program with different optimization
settings and compare the results.
It is often worthwhile to use different compilation settings for
different files in a project: put the code that must run very quickly
into a separate file and compile that for minimal execution time
(maximum speed), and the rest of the code for minimal code size. This
will give a small program, which is still fast enough where it matters.
Some compilers allow different optimization settings for different
functions in the same source file using #pragma directives.
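As a sketch of per-function settings, the example below uses GCC's `#pragma GCC optimize` together with `push_options` and `pop_options`; the guard keeps other compilers from seeing pragmas they do not know. The function names are hypothetical, and other compilers spell their pragmas differently, so check your compiler's manual.

```c
#include <stdint.h>

/* Assumption: the file as a whole is built for size (for example -Os). */

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("O3")   /* speed setting for the hot function only */
#endif
uint32_t fast_sum(const uint16_t *samples, uint16_t n)
{
    uint32_t total = 0;
    for (uint16_t i = 0; i < n; i++)
        total += samples[i];
    return total;
}
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options       /* back to the command-line setting */
#endif

/* Not time-critical: compiled with the file-wide size setting. */
uint8_t parse_digit(char c)
{
    return (c >= '0' && c <= '9') ? (uint8_t)(c - '0') : 0xFFu;
}
```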
Memory Model. Embedded microcontrollers are usually
available in several variants, each with a different amount of program
and data memory. For smaller chips, the compiler can exploit the
limited amount of addressable memory to generate smaller code.
An 8-bit direct pointer uses less code memory than a 24-bit banked
pointer where software has to switch banks before each access. This
goes for code as well as data.
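The difference can be sketched with a small simulation. The bank layout, the `bank_register` variable, and the function names below are assumptions made for illustration and do not model any specific chip; the point is only that the banked access performs an extra bank-select step that the direct access does not need.

```c
#include <stdint.h>

#define BANK_SIZE  256u
#define BANK_COUNT 4u

static uint8_t memory[BANK_COUNT][BANK_SIZE]; /* simulated banked memory */
static uint8_t bank_register;                 /* simulated bank-select register */

/* Direct access: an 8-bit address into bank 0, no bank switch needed. */
uint8_t read_direct(uint8_t addr)
{
    return memory[0][addr];
}

/* Banked access: software selects the bank before every access. */
uint8_t read_banked(uint32_t addr)
{
    bank_register = (uint8_t)((addr / BANK_SIZE) % BANK_COUNT);
    return memory[bank_register][addr % BANK_SIZE];
}

void write_banked(uint32_t addr, uint8_t value)
{
    bank_register = (uint8_t)((addr / BANK_SIZE) % BANK_COUNT);
    memory[bank_register][addr % BANK_SIZE] = value;
}
```

On real hardware the bank-select write is extra instructions at every access site, which is where the code size difference comes from.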
For example, some Atmel AVR chips have a code area of only 8 KB,
which allows a short relative jump with an offset of +/- 4 KB to reach
all code memory, using wraparound to jump from high addresses to low
addresses. Taking advantage of this yields smaller and faster code.
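The wraparound arithmetic can be checked with a short sketch. It assumes a program space of 4096 16-bit words (8 KB) and the AVR relative-jump semantics PC = PC + k + 1 with a signed 12-bit word offset k; the function names are invented for this example.

```c
#include <stdint.h>

#define FLASH_WORDS 4096u  /* 8 KB of 16-bit instruction words */

/* Signed 12-bit word offset k such that (from + 1 + k), wrapped to the
   program space, equals `to`. Every pair of addresses has such a k. */
int16_t rjmp_offset(uint16_t from, uint16_t to)
{
    uint16_t diff = (uint16_t)((to - from - 1u) & (FLASH_WORDS - 1u));
    if (diff >= FLASH_WORDS / 2u)
        return (int16_t)((int32_t)diff - (int32_t)FLASH_WORDS);
    return (int16_t)diff;
}

/* Address reached by a relative jump at `from` with offset k. */
uint16_t rjmp_target(uint16_t from, int16_t k)
{
    return (uint16_t)((from + 1u + (uint16_t)k) & (FLASH_WORDS - 1u));
}
```

For instance, a jump from word 4000 to word 10 uses a small forward offset that wraps past the end of memory, rather than a long backward jump.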
The capacity of the target chip can be communicated to the compiler
using a memory model option. There are usually several different memory
models available, ranging from "small" up to "huge." In general,
function calls get more expensive as the amount of code allowed
increases, and data accesses and pointers get bigger and more expensive
as the amount of accessible data increases.
Make sure to use the smallest model that fits your target chip and
application—this might give you large savings in code size.
Next in Part 6: Writing compiler-friendly C-code
To read Part 1 in this series, go to Reentrancy, atomic variables and recursion.
To read Part 2 in this series, go to Asynchronous Hardware/Firmware
To read Part 3, go to Metastable States
To read Part 4, go to Dealing With Interrupt Latency
Jakob Engblom (jakob@virtutech.com) is technical marketing manager at
Virtutech. He has an MSc in computer science and a PhD in Computer
Systems from Uppsala University, and has worked with programming tools
and simulation tools for embedded and real-time systems since 1997. He
was a contributor of material to "The Firmware Handbook," edited by
Jack Ganssle, upon which this series of articles was based and printed
with permission from Newnes, a division of Elsevier. Copyright 2008.
For other publications by Jakob Engblom, see www.engbloms.se/jakob.html.