Back to the Basics - Practical Embedded Coding Tips: Part 1
Reentrancy, atomic variables and recursion
By Jakob Engblom
Embedded.com
(04/19/09, 12:16:00 PM EDT)
Virtually every embedded system uses interrupts; many support multitasking or multithreaded operations. These sorts of applications can expect the program's control flow to change contexts at just about any time. When that interrupt comes, the current operation is put on hold and another function or task starts running. What happens if functions and tasks share variables? Disaster surely looms if one routine corrupts the other's data.

By carefully controlling how data is shared, we create reentrant functions, those that allow multiple concurrent invocations that do not interfere with each other. The word "pure" is sometimes used interchangeably with "reentrant."

Reentrancy was originally invented for mainframes, in the days when memory was a valuable commodity. System operators noticed that a dozen or hundreds of identical copies of a few big programs would be in the computer's memory array at any time. At the University of Maryland, my old hacking grounds, the monster Univac 1108 had one of the early reentrant FORTRAN compilers.

It burned up a breathtaking (for those days) 32 kW of system memory, but being reentrant, it required only 32 k even if 50 users were running it. Everyone executed the same code, from the same set of addresses. Each person had his or her own data area, yet everyone running the compiler quite literally executed identical code. As the operating system changed contexts from user to user it swapped data areas so one person's work didn't affect any other. Share the code, but not the data.

In the embedded world a routine must satisfy the following conditions to be reentrant:

Rule # 1. It uses all shared variables in an atomic way, unless each is allocated to a specific instance of the function.
Rule # 2. It does not call nonreentrant functions.
Rule 3. It does not use the hardware in a nonatomic way.

Atomic Variables
Both the first and last rules use the word "atomic," which comes from the Greek word meaning "indivisible." In the computer world "atomic" means an operation that cannot be interrupted. Consider the assembly language instruction:

mov ax,bx

Since nothing short of a reset can stop or interrupt this instruction it's atomic. It will start and complete without any interference from other tasks or interrupts. The first part of Rule #1 requires the atomic use of shared variables. Suppose two functions each share the global variable "foobar." Function A contains:

temp=foobar;
temp+=1;
foobar=temp;

This code is not reentrant, because foobar is used nonatomically. That is, it takes three statements to change its value, not one. The foobar handling is not indivisible; an interrupt can come between these statements, switch context to the other function, which then may also try and change foobar.

Clearly there's a conflict, foobar will wind up with an incorrect value, the autopilot will crash, and hundreds of screaming people will wonder, "Why didn't they teach those developers about reentrancy?" Suppose, instead, function A looks like:

foobar+=1;

Now the operation is atomic, an interrupt will not suspend processing with foobar in a partially changed state, so the routine is reentrant.

Except . . . do you really know what your C compiler generates? On an x86 processor the code might look like:

movax,[foobar]
incax
mov[foobar],ax

which is clearly not atomic, and so not reentrant. The atomic version is:

inc[foobar]

The moral is to be wary of the compiler; assume it generates atomic code and you may find 60 Minutes knocking at your door.

The second part of the first reentrancy rule reads " . . . unless each is allocated to a specific instance of the function." This is an exception to the atomic rule that skirts the issue of shared variables.

An instance is a path through the code. There's no reason a single function can't be called from many other places. In a multitasking environment it's quite possible that several copies of the function may indeed be executing concurrently. (Suppose the routine is a driver that retrieves data from a queue; many different parts of the code may want queued data more or less simultaneously.) Each execution path is an "instance" of the code. Consider:

int foo;
void some_function(void){
foo++;
}

foo is a global variable whose scope exists beyond that of the function. Even if no other routine uses foo, some_function can trash the variable if more than one instance of it runs at any time. C and C++ can save us from this peril. Use automatic variables. That is, declare foo inside of the function. Then, each instance of the routine will use a new version of foo created from the stack, as follows:

void some_function(void){
int foo;
foo++;
}

Another option is to dynamically assign memory (using malloc), again so each incarnation uses a unique data area. The fundamental reentrancy problem is thus avoided, as it's impossible for multiple instances to stamp on a common version of the variable.

Two More Rules
The rest of the rules are very simple.

Rule #2 tells us a calling function inherits the reentrancy problems of the callee. That makes sense, if other code inside the function trashes shared variables; the system is going to crash.

Using a compiled language, though, there's an insidious problem. Are you sure—really sure—that the runtime package is reentrant? Obviously string operations and a lot of other complicated things use runtime calls to do the real work. An awful lot of compilers also generate runtime calls to do, for instance, long math, or even integer multiplications and divisions.

If a function must be reentrant, talk to the compiler vendor to ensure that the entire runtime package is pure. If you buy software packages (like a protocol stack) that may be called from several places, take similar precautions to ensure the purchased routines are also reentrant.

Rule #3 is a uniquely embedded caveat. Hardware looks a lot like a variable, if it takes more than a single I/O operation to handle a device, reentrancy problems can develop.

Consider Zilog's SCC serial controller. Accessing any of the device's internal registers requires two steps: first write the register's address to a port, then read or write the register from the same port, the same I/O address. If an interrupt comes between setting the port and accessing the register another function might take over and access the device. When control returns to the first function the register address you set will be incorrect.

Keeping Code Reentrant
What are our best options for eliminating nonreentrant code? The first rule of thumb is to avoid shared variables. Globals are the source of no end of debugging woes and failed code. Use automatic variables or dynamically allocated memory.

Yet globals are also the fastest way to pass data around. It's not entirely possible to eliminate them from real time systems. Therefore, when using a shared resource (variable or hardware) we must take a different sort of action.

The most common approach is to disable interrupts during nonreentrant code. With interrupts off the system suddenly becomes a single-process environment. There will be no context switches. Disable interrupts, do the nonreentrant work, and then turn interrupts back on.

Most times this sort of code looks like:

long i;
void do_something(void){
    disable_interrupts();
    i+=0x1234;
    enable_interrupts();
}

This solution does not work. If do_something() is a generic routine, perhaps called from many places, and is invoked with interrupts disabled, it returns after turning them back on. The machine's context is changed, probably in a very dangerous manner.

Don't use the old excuse "yeah, but I wrote the code and I'm careful. I'll call the routine only when I know that interrupts will be on." A future programmer probably does not know about this restriction, and may see do_something() as just the ticket needed to solve some other problem . . . perhaps when interrupts are off.

Better code looks like:

long i;
void do_something(void){
    push interrupt state;
    disable_interrupts();
    i+=0x1234;
    pop interrupt state;
}

Shutting interrupts down does increase system latency, reducing its ability to respond to external events in a timely manner. A kinder, gentler approach is to use a semaphore to indicate when a resource is busy. Semaphores are simple on-off state indicators whose processing is inherently atomic, often used as "in-use" flags to have routines idle when a shared resource is not available.

Nearly every commercial real time operating system includes semaphores; if this is your way of achieving reentrant code, by all means use an RTOS. Don't have an RTOS? Sometimes I see code that assigns an in-use flag to protect a shared resource, like this:

while (in_use);          //wait till resource free
in_use=TRUE;         //set resource busy
Do non-reentrant     stuff
in_use=FALSE;       //set resource available

If some other routine has access to the resource it sets in_use true, causing this routine to idle until in_use gets released. Seems elegant and simple, but it does not work. An interrupt that occurs after the while statement will preempt execution. This routine feels it now has exclusive access to the resource, yet hasn't had a chance to set in_use true.

Some other routine can now get access to the resource. Some processors have a test-and-set instruction, which acts like the in-use flag, but that is interrupt-safe. It'll always work. The instruction looks something like:

Tset variable  ;  if  (variable==0){
                        ;       variable=1;
                        ;       returns TRUE;}
                        ;  else {returns FALSE;}

If you're not so lucky to have a test-and-set, try the following:

loop:      mov     al,0             ; 0 means "in use"
              xchg    al,variable
             cmp      al,0
             je          loop             ; loop if in use

If al = 0, we swapped 0 with zero; nothing changed, but the code loops since someone else is using the resource. If al = 1, we put a 0 into the "in use" variable, marking the resource as busy. We fall out of the loop, now having control of the resource. It'll work every time.

Recursion
No discussion of reentrancy is complete without mentioning recursion, if only because there's so much confusion between the two. A function is recursive if it calls itself. That's a classic way to remove iteration from many sorts of algorithms.

Given enough stack space this is a perfectly valid, though tough to debug, way to write code. Since a recursive function calls itself, clearly it must be reentrant to avoid trashing its variables. So all recursive functions must be reentrant, but not all reentrant functions are recursive.

Next in Part 2: Asynchronous hardware/firmware.

Jakob Engblom (jakob@virtutech.com) is technical marketing manager at at Virtutech. He has a MSc in computer science and a PhD in Computer Systems from Uppsala University, and has worked with programming tools and simulation tools for embedded and real-time systems since 1997. 

He was a contributor of  material to "The Firmware Handbook," edited by Jack Ganssle, upon which this series of articles was based and printed with permission from Newnes, a division of Elsevier. Copyright 2008.  For other publications by Jakob Engblom, see www.engbloms.se/jakob.html.