Bail, return, jump, or . . . throw?

March 01, 2007

Dan Saks

The common techniques for handling run-time errors in C leave something to be desired, like maybe exception handling.

The exception handling machinery in C++ is designed to deal with program errors, such as a resource allocation failure or a value out of range. C++ exception handling provides a way to decouple error reporting from error handling. However, it's not designed to handle asynchronous events such as hardware interrupts.

C++ exception handling is designed to address the limitations of error handling in C. In this installment, I'll look at some of the more common techniques for handling run-time errors in C programs and show you why these techniques leave something to be desired.

Error reporting via return values
Many C functions report failures through their function return values or arguments. For example, in the Standard C library:

  • malloc returns a null pointer if it fails to allocate memory.
  • strtoul returns ULONG_MAX and stores the value ERANGE in errno if the converted value can't be represented as an unsigned long.
  • printf returns a negative value if it can't format and print every operand specified in its format list.

(The macro ULONG_MAX is defined in the standard header <limits.h>. Macros ERANGE and errno are defined in <errno.h>.)

If you want your C code to be reliable, you should write it so that it checks the return values from calls to all such functions. In some cases, adding code to check the return value isn't too burdensome. For example, a typical call to malloc such as:

p = malloc(sizeof(T));
becomes:
p = malloc(sizeof(T));
if (p == NULL)
    // cope with the failure
    
In other cases, writing a proper check is a bit tricky. For example, a call to strtoul such as:
n = strtoul(s, &e, 10);
becomes:
errno = 0;    // library functions never reset errno to zero
n = strtoul(s, &e, 10);
if (n == ULONG_MAX
    && errno == ERANGE)
    // deal with the overflow
Having detected an error, you then have to decide what to do about it.
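
As a minimal sketch of the strtoul check, here is a small wrapper that also rejects input with no digits or with trailing characters. The name parse_ulong and its return codes are my own assumptions for illustration, not part of the Standard C library:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

// Hypothetical helper: returns 0 and stores the result in *out on success,
// -1 if s has no digits or has trailing characters, -2 on overflow.
int parse_ulong(char const *s, unsigned long *out)
    {
    char *e;
    unsigned long n;

    errno = 0;                  // strtoul never clears errno itself
    n = strtoul(s, &e, 10);
    if (e == s || *e != '\0')
        return -1;              // nothing converted, or junk after the digits
    if (n == ULONG_MAX && errno == ERANGE)
        return -2;              // value doesn't fit in an unsigned long
    *out = n;
    return 0;
    }
```

Clearing errno before the call matters because a stale ERANGE left over from an earlier call could otherwise make a legitimate ULONG_MAX look like an overflow.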

Bailing
Some errors, such as a value out of range, might be the result of erroneous user input. If the input is interactive, the program can just prod the user for a more acceptable value. With other errors, such as a resource allocation failure, the system may have little choice other than to shut down.

The most abrupt way to bail out is by calling the Standard C abort function, as in:

if (something really bad happened)
    abort();
    
Calling abort terminates program execution with no promise of cleaning anything up. Calling the Standard C exit function is not quite as rude:
if (something really bad happened)
    exit(EXIT_FAILURE);
Calling exit closes all open files after flushing any unwritten buffered data, removes temporary files, and returns an integer-valued exit status to the operating system. The standard header <stdlib.h> defines the macro EXIT_FAILURE as the value indicating unsuccessful termination.

You can use the Standard C atexit function to customize exit to perform additional actions at program termination. For example, calling:
atexit(turn_gizmo_off);
"registers" the turn_gizmo_off function so that a subsequent call to exit will invoke:
turn_gizmo_off();
as it terminates the program. The C standard requires every implementation to let you register at least 32 functions. Some implementations allow more.
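
A minimal sketch of registering more than one shutdown action might look like the following. The function close_log and the bodies of both handlers are placeholders I've assumed for illustration; only atexit, exit, and puts come from the Standard C library:

```c
#include <stdio.h>
#include <stdlib.h>

// Placeholder shutdown actions; real ones would be platform specific.
void turn_gizmo_off(void)
    {
    puts("gizmo off");
    }

void close_log(void)
    {
    puts("log closed");
    }

// Returns 0 if both handlers were registered, -1 otherwise.
int register_shutdown_actions(void)
    {
    // atexit returns zero on success and nonzero on failure
    if (atexit(turn_gizmo_off) != 0)
        return -1;
    if (atexit(close_log) != 0)
        return -1;
    return 0;
    }
```

A later call to exit invokes the registered functions in the reverse of the order in which they were registered, so here close_log would run before turn_gizmo_off. Calling abort does not invoke them at all.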

Embedded systems being as diverse as they are, I suspect that some don't support either abort or exit. In those systems, you must use some other platform-specific function(s) to shut things down.

More commonly, complete shutdown is not the appropriate response to an error. Rather than shut down, the system should transition to a "safe" state, whatever that is, and continue running. Here again, the details of that transition are platform specific.

Returning
Some of the code in any embedded system is clearly application specific. Many systems contain a good chunk of application-independent code as well. The application-independent code could be from a library shipped with the compiler or operating system, from a third-party library, or from something developed in-house.

When an application-specific function detects an error, it can respond on the spot with a specific action, as in:
if (something really bad happened)
    take_me_some_place_safe();
In contrast, when an application-independent function detects an error, it can't respond on its own because it doesn't know how the application wants to respond. (If it did know, it wouldn't be application independent.) Rather than respond at the point where the error was detected, an application-independent function can only announce that the error has occurred and leave the error handling to some other function further up the call chain. The announcement might appear as a return value, an argument passed by address, a global object, or some combination of these. As I described earlier, this is what most Standard C library functions do.

Although conceptually simple, returning error indicators can quickly become cumbersome. For example, suppose your application contains a chain of calls in which main calls f, which calls g, which calls h. Ignoring any concern for error handling, the code would be as shown in Listing 1.
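
A minimal sketch of that call chain, with no error handling at all, might look like this. It's my reconstruction from the description above rather than the original listing, and the function app stands in for main so the fragment can be compiled alongside other code:

```c
void h(void)
    {
    // do h
    }

void g(void)
    {
    h();
    // do the rest of g
    }

void f(void)
    {
    g();
    // do the rest of f
    }

int app(void)       // stands in for main
    {
    f();
    // do the rest of main
    return 0;
    }
```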



Now, suppose reality intrudes and function h has to check for a condition it can't handle. In that case, you might rewrite h so that it has a non-void return type, such as int, and appropriate return statements for error and normal returns. The function might look like:
int h(void)
    {
    if (something really bad happened)
        return -1;
    // do h
    return 0;
    }
Now g is responsible for heeding the return value of h and acting accordingly. However, more often than not, functions in the middle of a call chain, such as g and f, aren't in a position to handle the error. In that case, all they can do is look for error values coming from the functions they call and return them up the call chain. This means you must rewrite both f and g to have non-void return types along with appropriate return statements, as in:
int g(void)
    {
    int status;
    if ((status = h()) != 0)
        return status;
    // do the rest of g
    return 0;
    }

int f(void)
    {
    int status;
    if ((status = g()) != 0)
        return status;
    // do the rest of f
    return 0;
    }
Finally, the buck stops with main:
int main()
    {
    if (f() != 0)
        {
        // handle the error
        }
    // do the rest of main
    return 0;
    }
This approach--returning error codes via return values or arguments--effectively decouples error detection from error handling, but the costs can be high. Passing the error codes back up the call chain increases the size of both the source code and object code and slows execution. It's been a while since I've used this approach to any extent, but my recollection is that the last time I did, it increased the non-comment source lines in my application by 15 to 20%, with a comparable increase in the object code. Other programmers have told me they've experienced increases to the tune of 30 to 40%.

This technique also increases coding effort and reduces readability. It's usually difficult to be sure that your code checks for all possible errors. Static analyzers, such as Lint, can tell you when you've ignored a function's return value, but as far as I know, they can't tell you when you've ignored the value of an argument passed by address. The consistent application of this technique can easily break down when the current maintainer of the code hands it off to a less experienced one.

Jumping
We could eliminate much of the error reporting code from the middle layers of the call chain by transferring control directly from the error-detection point to the error-handling point. Some languages let you do this with a non-local goto. If you could do this in C, it might look like:
int h(void)
    {
    if (something really bad happened)
        goto error_handler;
    // do h
    return 0;
    }

...

int main()
    {
    f();
    // do the rest of main
    return 0;
error_handler:
    // handle the error
    }
but you can't. It won't compile. However, you can do something similar using the facilities provided by the standard header <setjmp.h>. That header declares three components: a type named jmp_buf and two functions named setjmp and longjmp. (Actually, setjmp might be a function-like macro, but for the most part, you can think of it as a function.)

Calling setjmp(jb) stores a "snapshot" of the program's current calling environment into jmp_buf jb. That snapshot typically includes values such as the program counter, stack pointer, and possibly other CPU registers that characterize the current state of the calling environment.

Subsequently, calling longjmp(jb, v) (I'll explain v shortly) effectively performs a non-local goto--it restores the calling environment from snapshot jb and causes the program to resume execution as if it were returning from the call to setjmp that took the snapshot previously. It's like déjà vu all over again.

The function calling setjmp can use setjmp's return value to determine whether the return from setjmp is really that, or actually a return from longjmp. When a function directly calls setjmp(jb) to take a snapshot, setjmp returns 0. A later call to longjmp(jb, v), where v is non-zero, causes program execution to resume as if the corresponding call to setjmp returned v. In the special case where v is equal to 0, longjmp(jb, v) causes setjmp to return 1, so that setjmp only returns 0 when called directly.
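
The return-value rules can be sketched in a few lines. The function name setjmp_result_after is an assumption of mine, and the switch distinguishes only the handful of values this sketch cares about:

```c
#include <setjmp.h>

jmp_buf jb;

// Reports what setjmp appears to return after longjmp(jb, v).
// Returns -1 for any value other than 1 or 2.
int setjmp_result_after(int v)
    {
    switch (setjmp(jb))
        {
        case 0:             // direct call: the snapshot was just taken
            longjmp(jb, v); // never returns; control resumes at setjmp
        case 1:
            return 1;
        case 2:
            return 2;
        default:
            return -1;
        }
    }
```

Calling setjmp_result_after(2) yields 2, while both setjmp_result_after(1) and setjmp_result_after(0) yield 1. I've placed the setjmp call directly in the controlling expression of the switch because the C standard permits setjmp only in a few such contexts; assigning its result to a variable is not one of them.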

Listing 2 shows our hypothetical application with a longjmp from h to main. Since the longjmp bypasses g and f, these two functions no longer need to check for error return values, thus simplifying the source code and reducing the object code.
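
A minimal sketch of such a program follows. It's my reconstruction from the description rather than the original listing: the names error_env and something_really_bad_happened are assumptions, and app stands in for main so the fragment can be compiled alongside other code.

```c
#include <setjmp.h>

jmp_buf error_env;                      // snapshot taken in app
int something_really_bad_happened = 1;  // hypothetical error condition

void h(void)
    {
    if (something_really_bad_happened)
        longjmp(error_env, 1);          // jump straight back to app
    // do h
    }

void g(void)
    {
    h();
    // do the rest of g
    }

void f(void)
    {
    g();
    // do the rest of f
    }

int app(void)       // stands in for main
    {
    if (setjmp(error_env) != 0)
        {
        // handle the error
        return 1;
        }
    f();
    // do the rest of main
    return 0;
    }
```

With the flag set, app returns 1 by way of the handler. With the flag cleared, h returns normally and app returns 0. Neither g nor f contains any error-checking code.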



Using setjmp and longjmp eliminates most, if not all, of the clutter that accrues from checking and returning error codes. So what's not to like about them?

The problem is that you must be extremely cautious with them to avoid accessing invalid data or mismanaging resources. A jmp_buf need not contain any more information than necessary to enable the program to resume execution as if it were returning from a setjmp call. It need not and probably will not preserve the state of any local or global objects, files, or floating-point status flags.
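
One concrete consequence concerns local variables in the function that calls setjmp. The C standard says that a non-volatile automatic object modified between the setjmp call and the longjmp call has an indeterminate value after the jump. A minimal sketch of the usual precaution, declaring such an object volatile, might look like this (the names retry_env, fail, and attempts_until are assumptions for illustration):

```c
#include <setjmp.h>

jmp_buf retry_env;

void fail(void)
    {
    longjmp(retry_env, 1);
    }

// Keeps "failing" and jumping back until the attempt count reaches limit.
int attempts_until(int limit)
    {
    // volatile because it's modified between setjmp and longjmp
    volatile int attempts = 0;

    setjmp(retry_env);          // each longjmp resumes just after this call
    attempts = attempts + 1;
    if (attempts < limit)
        fail();
    return attempts;
    }
```

Without the volatile qualifier, a compiler is free to keep attempts in a register that longjmp restores to its old value, so the count might never advance.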

Using setjmp and longjmp can easily lead to resource leaks. For example, suppose functions g and f each allocate and deallocate a resource, as in:

void g(size_t n)
    {
    char *p;
    if ((p = malloc(n)) == NULL)
        {
        // deal with it
        }
    h();
    // do the rest of g
    free(p);
    }

void f(char const *n)
    {
    FILE *inf;
    if ((inf = fopen(n, "r")) == NULL)
        {
        // deal with it
        }
    g(42);
    // do the rest of f
    if (fclose(inf) == EOF)
        {
        // deal with it
        }
    }

A call to longjmp from h transfers control to main, completely bypassing the remaining portions of f and g. When this happens, f misses the opportunity to close its FILE, and g misses the opportunity to free its allocated memory.
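
The leak in g can be made visible with a small sketch. The counter live_blocks and the function run are assumptions I've added to track allocations, h always fails, and f is omitted for brevity:

```c
#include <setjmp.h>
#include <stdlib.h>

jmp_buf error_env;
int live_blocks = 0;    // hypothetical count of blocks allocated but not freed

void h(void)
    {
    longjmp(error_env, 1);      // always "fails" in this sketch
    }

void g(size_t n)
    {
    char *p;
    if ((p = malloc(n)) == NULL)
        return;
    ++live_blocks;
    h();
    // do the rest of g
    free(p);                    // skipped when h calls longjmp
    --live_blocks;
    }

// Returns the number of blocks still outstanding after calling g.
int run(void)
    {
    if (setjmp(error_env) != 0)
        return live_blocks;     // error handler: g never freed its block
    g(42);
    return live_blocks;
    }
```

Assuming malloc succeeds, run returns 1 rather than 0. The block allocated in g is still outstanding, and the only pointer to it was lost when the jump abandoned g's stack frame.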

C++ classes use destructors to provide automatic resource deallocation. A common practice in C++ is to wrap pointers inside classes, and provide destructors to ensure that the resources managed via these pointers are eventually released. Unfortunately, setjmp and longjmp are strictly C functions that know nothing about destructors. Calling longjmp in a C++ program can bypass destructor calls, resulting in resource leaks.

A forward pass
Using exception handling in C++ can avoid these resource leaks. It properly executes destructors as it transfers control from an error-detection point to an error handler, and it will be the subject of my next column.

Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback: e-mail him at dsaks@wittenberg.edu.
