Virtually every embedded system uses interrupts; many support
multitasking or multithreaded operations. These sorts of applications
can expect the program's control flow to change contexts at just about
any time. When that interrupt comes, the current operation is put on
hold and another function or task starts running. What happens if
functions and tasks share variables? Disaster surely looms if one
routine corrupts the other's data.
By carefully controlling how data is shared, we create reentrant
functions, those that allow multiple concurrent invocations that do not
interfere with each other. The word "pure" is sometimes used
interchangeably with "reentrant."
Reentrancy was originally invented for mainframes, in the days when
memory was a valuable commodity. System operators noticed that a dozen
or hundreds of identical copies of a few big programs would be in the
computer's memory array at any time. At the University of Maryland, my
old hacking grounds, the monster Univac 1108 had one of the early
reentrant FORTRAN compilers.
It burned up a breathtaking (for those days) 32 kW of system memory,
but being reentrant, it required only that 32 kW even if 50 users were
running it. Everyone executed the same code, from the same set of
addresses. Each person had his or her own data area, yet everyone
running the compiler quite literally executed identical code. As the
operating system changed contexts from user to user it swapped data
areas so one person's work didn't affect any other. Share the code, but
not the data.
In the embedded world a routine must satisfy the following
conditions to be reentrant:
Rule #1. It uses all shared variables in an atomic way, unless each is
allocated to a specific instance of the function.
Rule #2. It does not call nonreentrant functions.
Rule #3. It does not use the hardware in a nonatomic way.
Atomic Variables
Both the first and last rules use the word "atomic," which comes from
the Greek word meaning "indivisible." In the computer world "atomic"
means an operation that cannot be interrupted. Consider the assembly
language instruction:
mov ax,bx
Since nothing short of a reset can stop or interrupt this
instruction, it's atomic. It will start and complete without any
interference from other tasks or interrupts.
The first part of Rule #1 requires the atomic use of
shared variables. Suppose two functions each share the global variable
"foobar." Function A contains:
temp=foobar;
temp+=1;
foobar=temp;
This code is not reentrant, because foobar is used nonatomically.
That is, it takes three statements to change its value, not one. The
foobar handling is not indivisible; an interrupt can come between these
statements and switch context to the other function, which may then
also try to change foobar.
Clearly there's a conflict: foobar will wind up with an incorrect
value, the autopilot will crash, and hundreds of screaming people will
wonder, "Why didn't they teach those developers about reentrancy?"
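To make the failure concrete, here is a minimal sketch that forces the bad interleaving on purpose. The interrupt_hook callback and the function names are hypothetical stand-ins, not part of any real system; the hook simply runs at the spot where a real interrupt might land, between the read and the write-back.

```c
#include <stddef.h>

static int foobar = 0;              /* the shared global */

/* Hypothetical hook standing in for an interrupt that fires mid-update. */
typedef void (*interrupt_hook)(void);

/* Function B: the "other" routine that also changes foobar. */
static void function_b(void)
{
    foobar += 1;
}

/* Function A: the three-statement update, with an optional simulated
   interrupt between reading foobar and writing it back. */
static void function_a(interrupt_hook between_read_and_write)
{
    int temp = foobar;              /* read the shared value */
    if (between_read_and_write != NULL)
        between_read_and_write();   /* context switches away here */
    temp += 1;
    foobar = temp;                  /* write back a stale value plus one */
}

/* Runs A with B "interrupting" it and returns the final foobar. */
int lost_update_demo(void)
{
    foobar = 0;
    function_a(function_b);
    return foobar;
}
```

Two increments were requested, yet lost_update_demo() leaves foobar at 1, because function A writes back a value it read before function B ran.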
Suppose, instead, function A looks like:
foobar+=1;
Now the operation looks atomic: an interrupt will not suspend
processing with foobar in a partially changed state, so the routine is
reentrant.
Except . . . do you really know what your C compiler generates? On
an x86 processor the code might look like:
mov ax,[foobar]
inc ax
mov [foobar],ax
which is clearly not atomic, and so not reentrant. The atomic
version is:
inc [foobar]
The moral is to be wary of the compiler; assume it generates atomic
code and you may find 60 Minutes knocking at your door.
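One way to stop guessing about what the compiler emits is to ask for atomicity explicitly. The sketch below assumes a C11 toolchain that provides <stdatomic.h>, which the discussion above does not presume; on small targets without it, the usual fallback is to disable interrupts briefly around the update. The helper foobar_value is a hypothetical name added for illustration. Whether the increment compiles to a single instruction or to a short locked sequence depends on the target, so it's still worth reading the listing.

```c
#include <stdatomic.h>

/* Shared counter. Atomic objects with static storage start out zeroed
   and in a valid state. */
static atomic_int foobar;

/* Function A: the increment is one indivisible read-modify-write as far
   as the C language is concerned. */
void function_a(void)
{
    atomic_fetch_add(&foobar, 1);
}

/* Hypothetical helper: reads the current value atomically. */
int foobar_value(void)
{
    return atomic_load(&foobar);
}
```

With this version no caller can observe or overwrite a half-finished update, regardless of how the compiler chooses to implement the addition.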
The second part of the first reentrancy rule reads " . . . unless
each is allocated to a specific instance of the function." This is an
exception to the atomic rule that skirts the issue of shared variables.
An instance is a path through the code. There's no
reason a single function can't be called from many other places. In a
multitasking environment it's quite possible that several copies of the
function may indeed be executing concurrently. (Suppose the routine is
a driver that retrieves data from a queue; many different parts of the
code may want queued data more or less simultaneously.) Each execution
path is an "instance" of the code. Consider:
int foo;
void some_function(void){
    foo++;
}
foo is a global variable whose scope exists beyond that of the
function. Even if no other routine uses foo, some_function can trash
the variable if more than one instance of it runs at any time. C and
C++ can save us from this peril. Use automatic variables. That is,
declare foo inside of the function. Then, each instance of the routine
will use a new version of foo created from the stack, as follows:
void some_function(void){
    int foo = 0;   /* automatic: each instance gets its own copy on the stack */
    foo++;
}
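A slightly larger sketch shows why this works. Here a second instance of the routine runs while the first is only halfway done, yet neither disturbs the other, because each has its own foo. The names sum_to, nested_call, second_instance, and nested_demo are hypothetical, and the callback merely simulates a preempting task.

```c
#include <stddef.h>

/* Hypothetical callback standing in for a task that preempts the caller. */
typedef int (*nested_call)(void);

/* Sums 1..n in an automatic variable. Optionally lets another instance
   of this same function run when the loop reaches i == n / 2. */
int sum_to(int n, nested_call interrupt_midway)
{
    int foo = 0;                        /* lives in this call's stack frame */
    for (int i = 1; i <= n; i++) {
        foo += i;
        if (i == n / 2 && interrupt_midway != NULL)
            (void)interrupt_midway();   /* second instance runs here */
    }
    return foo;
}

/* The "other task": a complete, independent instance of sum_to. */
static int second_instance(void)
{
    return sum_to(4, NULL);
}

/* First instance is interrupted by the second partway through. */
int nested_demo(void)
{
    return sum_to(10, second_instance);
}
```

nested_demo() still returns 55, the correct sum of 1 through 10, even though second_instance() computed its own total of 10 in the middle. Had foo been a global, the inner call would have reset and polluted the outer call's running sum.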
Another option is to dynamically allocate memory (using malloc), again
so each incarnation uses a unique data area. The fundamental reentrancy
problem is thus avoided, as it's impossible for multiple instances to
stamp on a common version of the variable.
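As a rough illustration of the malloc approach, the sketch below gives every call its own buffer instead of sharing a static one. The function make_label and its label format are hypothetical. It also assumes the C library's malloc is itself safe to call from the contexts involved, which is not guaranteed on every embedded toolchain and would otherwise run afoul of Rule #2.

```c
#include <stdio.h>
#include <stdlib.h>

#define LABEL_SIZE 32

/* Hypothetical helper: builds a label in storage owned by this call
   alone. Returns NULL if allocation fails. The caller must free() the
   result. */
char *make_label(int channel)
{
    char *buf = malloc(LABEL_SIZE);     /* unique data area per instance */
    if (buf == NULL)
        return NULL;
    snprintf(buf, LABEL_SIZE, "channel-%d", channel);
    return buf;
}
```

Two callers that each invoke make_label receive distinct buffers, so one cannot overwrite the other's text. The price is that someone must remember to free each buffer, and heap use inside an interrupt handler is usually best avoided altogether.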