A pox on globals

October 02, 2006

Jack Ganssle

Globals are the Sirens of embedded systems programming. Don't get sucked in if you don't want to lose your ship or your sanity.

If God didn't want us to use global variables, he wouldn't have invented them. Rather than disappoint God, use as many globals as possible.

This must have been the philosophy of a developer I know who wrote a program over 100K lines long that sported a nifty 5,000 global variables. Five thousand. The effect: he was the only person in the universe who could maintain the code. Yet he constantly complained about being "stuck on the project's maintenance."

Then he quit.

Regular readers of this column know I'm obsessed with global variables, the scourge that plunges so many systems into disaster. Globals are seductive; they leer at us as potential resources, crooning "just put one in here, how bad can it be?" Like a teenager mismanaging a credit card, that first careless use too often leads to another and another, until, as Thomas McGuane wrote, "The night wrote a check the morning couldn't cash."

But "globals" is a term that refers to much more than just variables. Any shared resource is a potential for disaster. That's why we all write device drivers to handle peripherals, layering a level of insulation between their real-world grittiness and the rest of our code.

You do religiously use device drivers, don't you? I read a lot of C; it's astonishing how developers sprinkle thoughtless input/output instructions throughout the code like Johnny Appleseed tossing seeds into the wind.

The problem
Globals break the important principle of information hiding. Anyone can completely comprehend a small system with 10 variables and a couple of hundred lines of code. Scale that by an order of magnitude or three, and you'll soon get swamped in managing implementation details. Is user_input a char or an int? It's defined in some header, somewhere. When thousands of variables are always in scope, it doesn't take much of a brain glitch or typo to enter set_data instead of data_set, which may refer to an entirely unrelated variable.

Next, globals can be unexpectedly stepped on by anyone: other developers, tasks, functions, and interrupt service routines (ISRs). Debugging is confounded by the sheer difficulty of tracking down the errant routine. Everyone is reading and writing that variable anyway; how can you isolate the one bad access out of a million, especially using the typically crummy breakpoint resources offered by most bit-wiggling BDMs?

Globals lead to strong coupling, a fundamental no-no in computer science. Extreme Programming's belief that "everything changes all of the time" rings true. When a global's type, size, or meaning changes it's likely the software team will have to track down and change every reference to that variable. That's hardly the basis of highly productive development.

Multitasking systems and those handling interrupts suffer from severe reentrancy problems when globals pepper the code. An 8-bit processor might have to generate several instructions to do a simple 16-bit integer assignment. Inevitably an interrupt will occur between those instructions. If the ISR or another task then tries to read or set that variable, the Four Horsemen of the Apocalypse will ride through the door. Reentrancy problems aren't particularly reproducible so that once-a-week crash, which is quite impossible to track using most debugging tools, will keep you working plenty of late hours.
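To make the hazard concrete, here is a minimal host-side sketch that simulates it. It assumes a hypothetical 8-bit target that stores a 16-bit value one byte at a time, and it models the interrupt as an ordinary function call placed between the two byte stores. The names shared_lo, shared_hi, fake_isr, and write16_with_interrupt are invented for illustration.

```c
#include <stdint.h>

/* Simulated 16-bit global, held as two bytes the way an 8-bit CPU sees it. */
static volatile uint8_t shared_lo;
static volatile uint8_t shared_hi;

/* Value observed by the simulated ISR. */
static uint16_t isr_snapshot;

/* Stand-in for an ISR that reads the shared 16-bit variable. */
static void fake_isr(void)
{
    isr_snapshot = (uint16_t)(((uint16_t)shared_hi << 8) | shared_lo);
}

/* Non-atomic 16-bit store: the "interrupt" fires between the byte writes. */
void write16_with_interrupt(uint16_t value)
{
    shared_lo = (uint8_t)(value & 0xFFu);
    fake_isr();                        /* interrupt arrives here */
    shared_hi = (uint8_t)(value >> 8);
}

/* Reassemble the 16-bit value from its two bytes. */
uint16_t read16(void)
{
    return (uint16_t)(((uint16_t)shared_hi << 8) | shared_lo);
}

/* What the simulated ISR saw during the most recent store. */
uint16_t last_isr_snapshot(void)
{
    return isr_snapshot;
}
```

Writing 0x00FF and then 0x0100 leaves the variable correct afterward, but the simulated ISR observes 0x0000 during the second store, a value the variable never legitimately held.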

Or then there's the clever team member who thoughtlessly adds recursion to a routine that manipulates a global. If such recursion is needed, changing the variable to emulate a stack-based automatic may mean ripping up vast amounts of other code that shares the same global.
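A small sketch shows how recursion and a global collide. The function and variable names are hypothetical; the point is only that every invocation shares one copy of scratch, while a stack-based automatic gives each invocation its own.

```c
/* Global scratch variable shared by every invocation (hypothetical name). */
static int scratch;

/* Broken: each recursive call overwrites scratch before the caller uses it. */
int factorial_with_global(int n)
{
    scratch = n;
    if (n <= 1) {
        return 1;
    }
    int rest = factorial_with_global(n - 1);
    return scratch * rest;   /* scratch no longer holds this call's n */
}

/* Fixed: n lives on the stack, so each invocation has its own copy. */
int factorial_with_local(int n)
{
    if (n <= 1) {
        return 1;
    }
    return n * factorial_with_local(n - 1);
}
```

With an argument of 4, the global version returns 1 because the innermost call leaves scratch at 1, while the stack-based version returns 24.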

Globals destroy reuse. The close coupling between big parts of the code means everything is undocumentably interdependent. The software isn't a collection of packages; it's a web of ineffably interconnected parts.

Finally, globals are addictive. Late at night, tired, crashing from the über-caffeinated drink consumed hours ago, we sneak one in. Yeah, that's poor design, but it's just this once. That lack of discipline cracks open the door of chaos. It's a bit easier next time to add yet another global; after all, the software already has this mess in it. Iterated, this dysfunctional behavior becomes habitual.

Why do we stop at a red light on a deserted street at 3 a.m.? The threat of a cop hiding behind a billboard is a deterrent, as is the ever-expanding network of red-light cameras. But breaking the rules leads to rule-breaking. Bad habits replace the good ones far too easily, so a mature driver carefully stops to avoid practicing dangerous behavior. In exceptional circumstances, of course, (the kid is bleeding) we toss the rules and speed to the hospital.

The same holds true in software development. We don't use globals as a lazy alternative to good design, but in some exceptional conditions there are no alternatives.

Alternatives to globals
Encapsulation is the anti-global pattern. Shared resources of all stripes should cower behind the protection of a driver. The resource—be it a global variable or an I/O device—is private to that driver.

A function like void variable_set(int data) sets the global (in this case for an int), and a corresponding int variable_get() reads the data. Clearly, in C at least, the global is still not really local to a single function; it's filescopic to both driver routines (more on this later) so both of these functions can access it.
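As a minimal sketch of that pair of driver routines, assuming they live alone in their own source file (the file name variable.c is my invention):

```c
/* variable.c: the only file that can see the data (file name is an assumption). */

/* static gives the variable file scope: other files cannot name it. */
static int variable;

/* The single entry point for writing the data. */
void variable_set(int data)
{
    variable = data;
}

/* The single entry point for reading the data. */
int variable_get(void)
{
    return variable;
}
```

Code elsewhere calls variable_set(42) and later variable_get(); any attempt to reference variable directly from another file fails at link time.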

The immediate benefit is that the variable is hidden. Only those two functions can read or change it. It's impossible to accidentally choose the wrong variable name while programming. Errant code can't stomp on the now non-global.

But some additional perks are to be had from this sort of encapsulation.

In embedded systems it's often impossible, due to memory or CPU cycle constraints, to range-check variables. When one is global then every access to that variable requires a check, which quickly burns ROM space and programmer patience. The result is, again, we form the habit of range-checking nothing. Have you seen the picture of the parking meter displaying a total due of 8.1 E+6 dollars? Or the electronic billboard showing a 505 degree temperature? Ariane 5 was lost, to the tune of several hundred million dollars, in part because of unchecked variables whose values were insane. I bet those developers wish they had checked the range of critical variables.

An encapsulated variable requires but a single test, one if statement, to toss an exception if the data is whacky. If CPU cycles are in short supply it might be possible to eliminate even that overhead with a compile-time switch that at least traps such errors at debug time.
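Here is one way that single test might look. C has no exceptions, so this sketch counts the failure and returns an error code instead; that return value, the limits, and the RANGE_CHECKS switch name are all assumptions for illustration rather than anything from a real project.

```c
/* Limits below are illustrative assumptions. */
#define VARIABLE_MIN 0
#define VARIABLE_MAX 1000

/* Define RANGE_CHECKS as 0 to compile the test out of a release build. */
#ifndef RANGE_CHECKS
#define RANGE_CHECKS 1
#endif

static int variable;            /* the encapsulated data */
static unsigned range_errors;   /* count of rejected writes */

/* Returns 0 on success, -1 if the value was out of range and rejected. */
int variable_set(int data)
{
#if RANGE_CHECKS
    if (data < VARIABLE_MIN || data > VARIABLE_MAX) {
        range_errors++;         /* a real system might log or trap here */
        return -1;
    }
#endif
    variable = data;
    return 0;
}

int variable_get(void)
{
    return variable;
}

unsigned variable_range_errors(void)
{
    return range_errors;
}
```

With checks enabled, a call such as variable_set(505000) is rejected and the previous value survives. Building with RANGE_CHECKS set to 0 removes the if statement entirely.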

Encapsulation yields another cool debugging trick. Use a #define to override the call to variable_set (data) as follows:

#define variable_set(data) \
    variable_set_debug(data, __FILE__, __LINE__)

and modify the driver routine to stuff the extra two parameters into a log file, a circular buffer, or a display. Or only save that data if there's an out-of-range error. This little bit of extra information tracks the error to its source.

Add code to the encapsulated driver to protect variables subject to reentrancy corruption. For instance:
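int variable_get(void){
  int temp;
  push_interrupt_state;   /* placeholder: save the current interrupt-enable state */
  disable_interrupts;     /* placeholder: mask interrupts */
  temp=variable;          /* read the shared data while nothing can preempt us */
  pop_interrupt_state;    /* placeholder: restore the saved state */
  return temp;
}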
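The three interrupt-control lines in that routine stand in for processor-specific operations. As a rough host-side sketch, the same sequence can be simulated with an ordinary flag in place of the CPU's interrupt-enable bit. The flag and the sim_ helper names are hypothetical, and a real port would replace them with whatever the target and compiler provide.

```c
#include <stdbool.h>

/* Host-side simulation: on a real target these would be CPU-specific
   operations such as reading a status register or masking interrupts. */
static bool interrupts_enabled = true;   /* simulated interrupt-enable flag */

static int variable = 7;                 /* the protected, file-scope data */

int variable_get(void)
{
    int temp;
    bool saved = interrupts_enabled;  /* "push" the interrupt state */
    interrupts_enabled = false;       /* disable interrupts */
    temp = variable;                  /* the protected access */
    interrupts_enabled = saved;       /* "pop": restore, never blindly enable */
    return temp;
}

/* Helpers so a caller or test can model an enclosing critical section. */
void sim_set_interrupts(bool on)
{
    interrupts_enabled = on;
}

bool sim_interrupts_enabled(void)
{
    return interrupts_enabled;
}
```

Calling variable_get() with the simulated flag already cleared leaves it cleared on return, which is the whole reason for saving and restoring the state rather than unconditionally re-enabling.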
Turning interrupts off locks the system down until the code extracts the no-longer-global variable from memory. Notice the code to push and pop the interrupt state; there's no guarantee that this routine won't be called with interrupts already disabled. The additional two lines preserve the system's context.

An RTOS offers better reentrancy-protection mechanisms like semaphores. If using Micrium's μC/OS-II, for instance, use the operating-system calls OSSemPend and OSSemPost to acquire and release semaphores. Other RTOSes have similar mechanisms.

I mentioned that the ex-global is not really private to a single function. Consider a more complicated example, like handling receive data from a UART, which requires three data structures and four functions:
  • UART_buffer
    A circular buffer that stores data from the UART
  • UART_start_ptr
    The pointer to the beginning of data in the circular buffer
  • UART_end_ptr
    Pointer to the end of the buffer
  • UART_init()
    Sets up the device's hardware and initializes the data structures
  • UART_rd_isr()
    The ISR for incoming data
  • UART_char_avail()
    Tests the buffer to see if a character is available
  • UART_get()
    Retrieves a character from the buffer if one is available

One file—UART.C—contains these functions (though I'd also add the functions needed to send data to the device to create a complete UART handler) and nothing else. Define the filescopic data structures using the static keyword to keep them invisible outside the file. Only this small hunk of code has access to the data. Though this approach does create variables that are not encapsulated by functions, it incurs less overhead than a more rigorous adoption of encapsulation would, and carries few perils. Once debugged, the rest of the system only sees the driver entry points so cannot muck with the data.
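A skeleton of such a file might look like the following. It is a host-side sketch under several assumptions: the buffer size is arbitrary, the hardware receive register is replaced by a simulated variable, and sim_uart_receive is a hypothetical test hook that plays the part of the hardware. A real target would also need to consider whether pointer updates are atomic and protect them if not.

```c
#include <stdint.h>
#include <stdbool.h>

#define UART_BUF_SIZE 16u   /* assumed size; one slot is always kept empty */

/* File-scope data: static hides these names from every other file. */
static uint8_t UART_buffer[UART_BUF_SIZE];
static uint8_t * volatile UART_start_ptr;   /* oldest unread byte */
static uint8_t * volatile UART_end_ptr;     /* next free slot */

/* Simulated receive data register; real hardware would supply this. */
static volatile uint8_t sim_rx_register;

/* Advance a buffer pointer by one slot, wrapping at the end of the array. */
static uint8_t *advance(uint8_t *p)
{
    p++;
    return (p == UART_buffer + UART_BUF_SIZE) ? UART_buffer : p;
}

void UART_init(void)
{
    /* Hardware setup would go here on a real target. */
    UART_start_ptr = UART_buffer;
    UART_end_ptr = UART_buffer;
}

/* Receive ISR: store the incoming byte, dropping it if the buffer is full. */
void UART_rd_isr(void)
{
    uint8_t *next = advance(UART_end_ptr);
    if (next != UART_start_ptr) {
        *UART_end_ptr = sim_rx_register;
        UART_end_ptr = next;
    }
}

bool UART_char_avail(void)
{
    return UART_start_ptr != UART_end_ptr;
}

/* Returns the next byte, or -1 if none is available. */
int UART_get(void)
{
    if (!UART_char_avail()) {
        return -1;
    }
    uint8_t c = *UART_start_ptr;
    UART_start_ptr = advance(UART_start_ptr);
    return c;
}

/* Test hook standing in for the hardware: deliver one byte, fire the ISR. */
void sim_uart_receive(uint8_t byte)
{
    sim_rx_register = byte;
    UART_rd_isr();
}
```

The rest of the system sees only the UART_ entry points. The buffer and both pointers cannot be named, let alone corrupted, from any other file.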

Note that the file that handles the UART is rather small. It's a package that can be reused in other applications.

Wrinkles
Encapsulation isn't free. It consumes memory and CPU cycles. Terribly resource-constrained applications might not be able to use it at all.

Even in a system with plenty of headroom there's nothing faster for passing data around than a global. It's not always practical to eliminate them altogether. But their use (and worse, their abuse) does lead to less reliable code. My rule, embodied in my firmware standard, is "no global variables! But . . . if you really need one, get approval from the team lead." In other words, globals are a useful asset managed very, very carefully.

When a developer asks you for permission to use a global, ask him or her, "Have you profiled the code? What makes you think you need those clock cycles? Give me solid technical justification for making this decision."

Sometimes it's truly painful to use encapsulation. I've seen people generate horribly contorted code to avoid globals. Use common sense; strive for a clean design that's maintainable.

Encapsulation has its own yin and yang. Debugging is harder. You're at a breakpoint with nearly all the evidence at hand needed to isolate a problem, except for the value of the encapsulated system's status! What now?

It's not too hard to reference the link map, find the variable's address, and look at the hex. But it's hard to convert a floating-point number's hex representation to human-speak. One alternative is to have a debug mode such that the encapsulated variable_set() function stores a copy, which no other code accesses, in a real global somewhere. Set this inside the driver's reentrancy-protection code so interrupt corruption is impossible.
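A sketch of that debug-mode copy, assuming a float variable and a compile-time switch I've called DEBUG_SHADOW (both the switch and the shadow variable's name are hypothetical):

```c
/* Set DEBUG_SHADOW to 0 to drop the copy from a release build (assumed switch). */
#ifndef DEBUG_SHADOW
#define DEBUG_SHADOW 1
#endif

static float variable;   /* the encapsulated data */

#if DEBUG_SHADOW
/* Real global, written only here, so a debugger can display it by name. */
volatile float debug_variable_shadow;
#endif

void variable_set(float data)
{
    /* On a real target the driver's reentrancy protection would wrap
       both stores so the copy can never disagree with the original. */
    variable = data;
#if DEBUG_SHADOW
    debug_variable_shadow = data;
#endif
}

float variable_get(void)
{
    return variable;
}
```

At a breakpoint the debugger can show debug_variable_shadow as a formatted float, with no trip to the link map required. Nothing else in the system should ever read or write it.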

The other side of the story
A couple of thousand words ago I mentioned a program that had 5,000 globals. When it came time to do a major port and cleanup of the code we made a rule: no globals. But this system was a big real-time system pushing a lot of data around very fast; globals solved some difficult technical problems.

But at the end of the port there were only five global variables. That's a lot more tractable than dealing with 5,000.

Jack Ganssle (jack@ganssle.com) is a lecturer and consultant specializing in embedded systems' development issues.

Reader Response
I agree that global types are a recipe for broken code. Avoiding them is still highly dependent on system resources. I work with a Microchip micro whose memory resources might be called 'constrained'. Typical variants that we use have 64 bytes of RAM and an 8-element stack. The fact that usable C compilers even exist for these parts was a surprise. Globals are required in order to make the system fit the part. This does not invalidate your basic premise, but does point out that there are uP/applications that just cannot afford 'proper' encapsulation.

- Walter Greene
Software Engineer
Kidde-Fenwal
Ashland, MA


I use the technique:
typedef struct
{
   UINT16 type;  // TYPE_FOO
   // body
} FOO;

FOO global_foo;  // has to live somewhere

void action( FOO *foo );  // declared before its first use in task()

void task()
{
   memset( &global_foo, 0, sizeof(FOO) );
   global_foo.type = TYPE_FOO;
   // init rest of body

   action( &global_foo );
}

void action( FOO *foo )
{
   assert( foo );
   assert( foo->type == TYPE_FOO );
   // sanity check the foo fields needed here

   // use fields in foo-> as needed

   other_action( foo );  // same checks
}
While this does not do the exact same encapsulation of each variable (although it could), the original var (global_foo) could live anywhere - in the original task stack, or as a global. It really depends on its size - it may be too big to live on the stack. However, each function that needs it ONLY refers to it via a pointer that gets checked in each call of the function. This leads to fairly easy re-use, because each function deals with its calling parameters. If you always name the pointer in a consistent manner, cut and paste errors are reduced or eliminated, i.e.:

void bar( FOO *fp )
{
   assert( fp );
   assert( fp->type == TYPE_FOO );

   baz( fp->bletch );
}

You can cut/paste the baz() call and not worry that you are pointing to the wrong thing.

I have used this technique for many years. You can also take code with globals, put them into a struct, and do the same thing. The compiler will let you know where the references to the globals are.

- mr bandit
nerd
albuquerque, nm


I run into quite a few recent grads who only know the world of PC programming, and to them globals are strictly "evil" because using them in a multithreaded environment yields unpredictable results - so for them only locals are allowed ANYWHERE! (That extreme isn't right EITHER, since not all of us are writing exclusively multithreaded apps YET) And I know you mentioned debugging too, but for some unknown reason if you assign a variable on the stack with almost any popular debugger, it NEVER seems to go "into scope" so you HAVE to use globals SOMEHOW if you ever want to figure out why the code isn't working right. So I wonder if the folks who write multithreaded projects will ever deliver functional, debugged code on time? Anyone know the answer?

- Jeffrey Lawton


I enjoyed your article on globals. There's nothing to dispute there; well said.

One thing I have always found somewhat amusing is that hardware registers are often treated as global variables. While many developers encapsulate the accesses to registers within drivers, it is not uncommon to encounter coding of the form

MyRegister = value;

I have found value in additional encapsulation along the lines of

write_register(MyRegister, value);

etc because it allows handling several cases important to bringing up new silicon:

1) Register accessors can be used to perform critical sections etc, but can also be used for poor man's debugging, logging all write accesses to the register; flipping the poor man's scope-debug pin, etc.

2) In cases where a particular register was broken in some way, such as one must wait N time units after writing to it before reading it back will give the correct value, the accessor can manage the timing details transparently to the remainder of the code.

3) In some parts the registers can be mapped to different address spaces. For example, on one wayback 8051 project the MCU registers (those in addition to the 8051 core) could be mapped into SFR or XDATA space. That affected their addresses, and had to be handled separately. Thus something like

#ifdef USE_SFR_REGISTERS
#define sfr_decl(sfr, addr) this way
#else
#define sfr_decl(sfr, addr) that way
#endif
sfr_decl(MyRegister, 0x80);

became very useful as no code using the registers needed to know.

4) The hardware team can autogenerate header files with register definitions and accessor macros ready made for the developer to use. This has been very convenient and efficient in practice.

It's a little more work, and using the accessors does make the code arguably harder to read in some cases. But it has saved my bacon on more than one occasion. Mileage may vary. Void where prohibited.

- Trenton Henry


The fact that you even have to write an article on this subject proves that the people who are teaching prospective programmers are not doing a good enough job. This should be one of the first golden rules drilled into the minds of these students.

The use of volatile application-global variables completely circumvents the principle of least privilege.

I have to disagree that they ever have to be used. Besides the fact that they make the code almost impossible to maintain and understand, they also create direct dependencies that render the code un-reusable.

Truly re-usable code must be designed with complete abstraction in mind. This again is something that is rarely taught to today's students.

The use of indirect dependencies (registration via constant pointers, references, etc.) should always be the first choice to enforce these important principles, if access speed is found to be a performance issue. This will always promote the abstraction required for truly re-usable software.

This is true for both embedded and PC based applications.

To say that the use of volatile application global variables is due to a lack of discipline is the understatement of the year.

- Chris Spurrier


I have seen and heard much ado about the evils of globals over the years and find a lot of hot air in most everyone's arguments. Yes, globals can and are abused much like the goto statement. But judiciously used, globals can be effective. By judicious use, I mean to follow the cardinal rule of variable modification which, after my two paragraph diatribe, I will state.

I love OOD/OOA, but in truth you want to know what it really is? OOD is the academically acceptable way to manage globals. You think that by encapsulating the variable in a class, it is no longer global? Well what about those six methods that all have access to it? Any one of them can change its value. And if those six methods comprise say 200 lines of code, is that any better than a traditional global?

In the section titled, "ALTERNATIVES TO GLOBALS", you cite the better programming technique of using a function call to set and get the global variable. Is the function global? If so, then anyone and everyone can modify the variable just as easily as if they had direct access to the global variable. That is not an improvement.

Everyone needs to abide by one SUPREME CARDINAL RULE governing the modification of variables, whether they be global, local, or class encapsulated. The cardinal rule reads thusly, "A variable should be modified in one and only one place in the code." It is the violation of this rule which leads to complications and unforeseen behavior in systems. You could make every single variable in a system global, but if you follow the cardinal rule you will still have a well-understood system. The system is well-understood because modification of the global is confined and the logic for its modification can be readily analyzed. Follow the cardinal rule! That is the most important overlooked programming rule in the past century. This cardinal rule should be the end objective of all object-oriented programming. This cardinal rule should be the purpose of and justification for data encapsulation.

- Phil Gillaspy


The "variable_set()/variable_get()" encapsulation is kind of mandatory in our company. Additionally, we always create a macro for every interface function and all external calls use those macros:

/* Example */

extern int variable_get(void);

#define VAR_GET variable_get()

If later in the development we need to save CPU cycles, we can replace the function's name by the corresponding variable's name without needing to change our C code (only the headers need to be changed):

/* Example */

extern const int variable;

#define VAR_GET variable

The example above shows that we also intentionally add the "const" modifier to prevent external code from writing (if possible).

- Rudolpho Mller


I encourage Chris Spurrier to read the letter from Walter Greene, then come back with a convincing reason to abandon global variables in small systems. I design a variety of projects using the Atmel AVR devices, most of which have relatively small amounts of RAM, and are not designed to access external RAM (even if I wanted to). I use many global variables, because I don't have a lot of memory to spare. My source is typically much less than 1K lines of C code, and when compiled, takes 2k to 10k of ROM. Almost all of my variables ARE global, because so many of them have to be shared by the various functions, many of which are interrupt-driven. It does take some careful crafting to deal with this, but in general, things work out very well. I, too, am dissatisfied with the available debuggers that so often give me a "not in scope" error when trying to track a local variable. Perhaps more work needs to be done in this area. At any rate, to make blanket statements such as, "To say that the use of volatile application variables is due to a lack of discipline is the understatement of the year" shows that the writer has perhaps lost sight of the fact that many, many projects have to use resources that are limited, and that we do the best we can with what we have. It isn't a matter of being undisciplined - it's a matter of working with available hardware and performance constraints.

- Dave Telling
