CMP EMBEDDED.COM


Back to the Basics - Practical Embedded Coding Tips: Part 5
Using your C-compiler to minimize code size.



Embedded.com
Low-Level Code Compression. A common transformation at the target-code level is to find instruction sequences that occur in several functions and break them out into subroutines. This transformation can be very effective at shrinking the executable code of a program, at the cost of performing more jumps (note that it only introduces machine-level subroutine calls, not full-strength function calls). Experience shows a size reduction of 10-30% from this transformation.
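As a rough source-level analogy only (the compiler works on instructions, not on C source, and the function names below are hypothetical), the effect resembles factoring a repeated sequence into one shared helper:

```c
#include <stdint.h>

/* Before: the same clamp sequence is emitted once per function. */
uint8_t led_duty_before(int16_t v)
{
    if (v < 0)   v = 0;
    if (v > 255) v = 255;
    return (uint8_t)v;
}

uint8_t fan_duty_before(int16_t v)
{
    v = (int16_t)(v / 2);
    if (v < 0)   v = 0;
    if (v > 255) v = 255;
    return (uint8_t)v;
}

/* After: a single copy of the shared sequence, reached by a call. */
static uint8_t clamp_u8(int16_t v)
{
    if (v < 0)   v = 0;
    if (v > 255) v = 255;
    return (uint8_t)v;
}

uint8_t led_duty_after(int16_t v) { return clamp_u8(v); }
uint8_t fan_duty_after(int16_t v) { return clamp_u8((int16_t)(v / 2)); }
```

The "after" version stores the clamp code once but pays for a call and return each time, which is the size-for-speed trade described above.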

Linker. The linker should be considered an integral part of the compilation system, since there are some transformations that are performed in the linker. The most basic embedded-systems linker should remove all unused functions and variables from a program, and only include the parts of the standard libraries that are actually used.

The granularity at which program parts are discarded varies, from files or library modules down to individual functions or even snippets of code. The finer the granularity, the better the linker. Unfortunately, some linkers derived from desktop systems work on a per-file basis, which gives unnecessarily large code.
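A minimal sketch of why granularity matters, using two hypothetical functions that share one source file:

```c
#include <stdint.h>

/* Called from the application, so the linker must keep it. */
uint16_t scale_adc(uint16_t raw)
{
    /* Convert a 10-bit reading to millivolts (5000 mV full scale). */
    return (uint16_t)((raw * 5000UL) / 1023UL);
}

/* Never referenced anywhere. A linker that discards individual functions
 * can drop it; a per-file linker keeps it, because scale_adc() lives in
 * the same object file and pulls the whole file in. */
uint16_t scale_adc_legacy(uint16_t raw)
{
    return (uint16_t)(raw * 5u);
}
```

With GCC and GNU ld, one way to get function-level discarding is to compile with -ffunction-sections -fdata-sections and link with -Wl,--gc-sections. Other toolchains do this differently, often by default, so check your linker's documentation.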

Some linkers also perform postcompilation transformations on the program. Common transformations are the removal of unnecessary bank and page switches (which cannot be done at compile time, since the exact allocation of variable addresses is not yet known) and code compression, as discussed above, extended to the entire program.

Controlling Compiler Optimization
A compiler can be instructed to compile a program with different goals, usually speed or size. For each setting, a set of transformations has been selected that tend to work towards the goal—maximal speed (minimal execution time) or minimal size. The settings should be considered approximate. To give better control, most compilers also allow individual transformations to be enabled or disabled.

For size optimization, the compiler uses a combination of transformations that tend to generate smaller code, but it might fail in some cases, due to the characteristics of the compiled program. As an example, the fact that function inlining is more aggressive for speed optimization makes some programs smaller on the speed setting than on the size setting.

The example data below demonstrates this. The two programs were compiled with the same version of the same compiler, using the same memory and data model settings, but optimizing for either speed or size:

Program 1 gets slightly smaller with speed optimization, while program 2 is considerably larger, an effect we traced to the fact that function inlining was lucky on program 1. The conclusion is that one should always try to compile a program with different optimization settings and see what happens.

It is often worthwhile to use different compilation settings for different files in a project: put the code that must run very quickly into a separate file and compile that for minimal execution time (maximum speed), and the rest of the code for minimal code size. This will give a small program, which is still fast enough where it matters. Some compilers allow different optimization settings for different functions in the same source file using #pragma directives.
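As one illustration of per-function settings, here is a sketch using GCC's function-specific option pragmas. This is GCC syntax only; other compilers spell the directive differently, and the function names are hypothetical. The preprocessor guard keeps the file compilable on compilers that lack these pragmas.

```c
#include <stddef.h>
#include <stdint.h>

/* Hot path: ask GCC to optimize this one function for speed. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("O3")
#endif
uint32_t checksum_fast(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options   /* restore the command-line settings */
#endif

/* Everything else in the file uses the command-line setting,
 * for example -Os when the project is built for minimal size. */
uint32_t header_checksum(const uint8_t *hdr)
{
    return checksum_fast(hdr, 4);
}
```

Whether this actually helps depends on the program, so compare the map file or size report before and after.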

Memory Model. Embedded microcontrollers are usually available in several variants, each with a different amount of program and data memory. On the smaller chips, the compiler can exploit the limited address range to generate smaller code.

An 8-bit direct pointer uses less code memory than a 24-bit banked pointer where software has to switch banks before each access. This goes for code as well as data.
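The difference can be sketched with a host-side simulation. The memory layout, the bank register, and all names here are assumptions for illustration, not any particular chip:

```c
#include <stdint.h>

#define NUM_BANKS 2u
#define BANK_SIZE 0x10000u                    /* 64 KB per bank */

static uint8_t memory[NUM_BANKS][BANK_SIZE];  /* simulated data memory */
static uint8_t bank_register;                 /* simulated bank-select register */

/* 8-bit direct pointer: one byte of address, no bank switch. */
typedef uint8_t direct_ptr;

uint8_t read_direct(direct_ptr p)             { return memory[0][p]; }
void    write_direct(direct_ptr p, uint8_t v) { memory[0][p] = v; }

/* 24-bit banked pointer: 8-bit bank number plus 16-bit offset. */
typedef struct {
    uint8_t  bank;
    uint16_t offset;
} banked_ptr;

static void select_bank(uint8_t bank)
{
    bank_register = (uint8_t)(bank % NUM_BANKS);
}

uint8_t read_banked(banked_ptr p)
{
    select_bank(p.bank);                      /* extra work on every access */
    return memory[bank_register][p.offset];
}

void write_banked(banked_ptr p, uint8_t v)
{
    select_bank(p.bank);
    memory[bank_register][p.offset] = v;
}
```

Every banked access carries the bank-select step and a three-byte pointer, while the direct access needs neither.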

For example, some Atmel AVR chips have a code area of only 8 KB, which allows a short relative jump with an offset of ±4 KB to reach all of code memory, using wraparound to jump from high addresses to low addresses. Taking advantage of this yields smaller and faster code.
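The wraparound arithmetic can be checked with a small host-side sketch. It assumes the AVR RJMP convention of a signed 12-bit word offset k applied as PC + 1 + k, and a 4096-word (8 KB) code area. The helper names are hypothetical.

```c
#define FLASH_WORDS 4096u   /* 8 KB of code = 4096 16-bit words */

/* Offset k in [-2048, 2047] such that a relative jump at word address pc
 * lands on target, with the program counter wrapping at FLASH_WORDS. */
int rjmp_offset(unsigned pc, unsigned target)
{
    unsigned diff = (target - (pc + 1u)) & (FLASH_WORDS - 1u);
    /* Reinterpret the 12-bit difference as a signed value. */
    return (diff >= FLASH_WORDS / 2u) ? (int)diff - (int)FLASH_WORDS
                                      : (int)diff;
}

/* Where a relative jump at pc with offset k actually lands. */
unsigned rjmp_target(unsigned pc, int k)
{
    return (pc + 1u + (unsigned)k) & (FLASH_WORDS - 1u);
}
```

For any pair of addresses inside the 4096-word area, rjmp_offset() returns a value that fits the 12-bit field, so the short jump always suffices. A jump from word 4000 to word 10, for instance, goes forward past the end of memory rather than backward.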

The capacity of the target chip can be communicated to the compiler using a memory model option. There are usually several different memory models available, ranging from "small" up to "huge." In general, function calls get more expensive as the amount of code allowed increases, and data accesses and pointers get bigger and more expensive as the amount of accessible data increases.

Make sure to use the smallest model that fits your target chip and application—this might give you large savings in code size.

Next in Part 6: Writing compiler-friendly C-code
To read Part 1 in this series, go to Reentrancy, atomic variables and recursion.
To read Part 2 in this series, go to Asynchronous Hardware/Firmware
To read Part 3, go to Metastable States
To read Part 4, go to Dealing With Interrupt Latency

Jakob Engblom (jakob@virtutech.com) is technical marketing manager at Virtutech. He has an MSc in computer science and a PhD in Computer Systems from Uppsala University, and has worked with programming tools and simulation tools for embedded and real-time systems since 1997.

He contributed material to "The Firmware Handbook," edited by Jack Ganssle, on which this series of articles is based; it is printed with permission from Newnes, a division of Elsevier. Copyright 2008. For other publications by Jakob Engblom, see www.engbloms.se/jakob.html.
