Throw and catch

May 01, 2007

Dan Saks

Exception handling in C++ provides the simplicity of error handling with longjmp, but without the leaks.

In large, well-partitioned systems, error-detection code is often decoupled from error-handling code. That is, code capable of detecting an error often doesn't know how to handle it, and code capable of handling the error often doesn't know how to detect it. In fact, the error-detection code might be quite far removed from the error-handling code. Any general error-handling technique should maintain this separation.

As I explained in my previous column, one such error-handling technique is to write each function so that it reports any errors via its return value or arguments [1]. Although conceptually simple, returning error indicators can quickly become very cumbersome. Passing the error codes back up the function call chain can add a lot of clutter to the source code, which, in turn, increases object-code size and slows execution. Moreover, all that clutter makes it difficult to be sure that your code checks for all possible errors.

Another technique is to use Standard C's longjmp function to implement a non-local goto from the error-detection site to the error-handling site. Using longjmp eliminates most of the clutter that comes from checking and returning error codes and has relatively little impact on the size and speed of the program. Unfortunately, calling longjmp can easily cause resource leaks, as illustrated by Listing 1.

Listing 1:
A C/C++ program with a longjmp from h to main, and resource leaks to boot.


#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

jmp_buf error_handler;

// placeholder for whatever test detects the error
int something_really_bad_happened(void);

void h(void)
    {
    if (something_really_bad_happened())
        longjmp(error_handler, 1);
    // do the rest of h
    }

void g(size_t n)
    {
    char *pc = (char *)malloc(n);
    if (pc == NULL)
        {
        // deal with it
        return;
        }
    h();
    // do the rest of g
    free(pc);
    }

void f(char const *n)
    {
    FILE *inf = fopen(n, "r");
    if (inf == NULL)
        {
        // deal with it
        return;
        }
    g(42);
    // do the rest of f
    fclose(inf);
    }

int main(void)
    {
    if (setjmp(error_handler) != 0)
        {
        // handle the error
        return EXIT_FAILURE;
        }
    f("xyzzy");
    // do the rest of main
    return 0;
    }
The main function in Listing 1 calls setjmp to establish an error-recovery point, and then calls f, and then g, and then h. If h detects an error, it calls longjmp to transfer control back to main, completely bypassing the remaining portions of g and f. When this happens, g fails to free its allocated memory, and f fails to close its FILE. Not good.
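
To make the leak visible, here is a minimal sketch that instruments the same pattern with a counter. The names live_allocations, tracked_malloc, tracked_free, and run_once are hypothetical, introduced only for this illustration, and h takes a flag in place of a real error test. The sketch uses only plain C-style data, so the longjmp itself is well defined even when compiled as C++.

```cpp
#include <csetjmp>
#include <cstddef>
#include <cstdlib>

// Hypothetical bookkeeping for this sketch: counts blocks that have been
// allocated but not yet freed.
static int live_allocations = 0;

static std::jmp_buf error_handler;

void *tracked_malloc(std::size_t n)
    {
    void *p = std::malloc(n);
    if (p != NULL)
        ++live_allocations;
    return p;
    }

void tracked_free(void *p)
    {
    if (p != NULL)
        --live_allocations;
    std::free(p);
    }

void h(bool fail)
    {
    if (fail)
        std::longjmp(error_handler, 1);   // jumps straight back to run_once
    }

void g(std::size_t n, bool fail)
    {
    char *pc = static_cast<char *>(tracked_malloc(n));
    if (pc == NULL)
        return;
    h(fail);
    tracked_free(pc);                     // skipped when h calls longjmp
    }

// Returns the number of blocks still allocated after one trip through g.
int run_once(bool fail)
    {
    live_allocations = 0;
    if (setjmp(error_handler) != 0)
        return live_allocations;          // arrived here via longjmp
    g(42, fail);
    return live_allocations;              // normal return path
    }
```

Calling run_once(false) should report no outstanding blocks, while run_once(true) should report one: the block that g never got the chance to free.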

Destructors
C++ classes use destructors to provide automatic resource deallocation. A common practice in C++ is to wrap pointers or other resource handles inside classes and provide destructors to ensure that the resources managed via these handles are eventually released. For example, function f in Listing 1 uses local pointer object inf to manage a Standard C FILE object. You can provide automatic resource allocation and deallocation for that file by wrapping the pointer inside a class, defined as:

class file
    {
public:
    file(char const *name, char const *mode);
    ~file();
    bool is_open() const;
    ...
private:
    ...
    FILE *pf;
    };
The constructor attempts to open the file named by name in the mode specified by mode. The constructor definition might look something like:

file::file(char const *name, char const *mode)
    {
    pf = fopen(name, mode);
    }
A constructor can't have a return type, so the class must use some other way to report failure to open the file. The user of a file object can determine if it failed to open by calling the is_open member function, defined simply as:

inline
bool file::is_open() const
    {
    return pf != NULL;
    }
I described both constructors and const member functions in an earlier column [2]. The destructor closes the file. It's defined as:

file::~file()
    {
    if (pf != NULL)
        fclose(pf);
    }
Using this file class, function f from Listing 1 would look like:

void f(char const *n)
    {
    file inf (n, "r");
    if (!inf.is_open())
        {
        // deal with it
        return;
        }
    g(42);
    // do the rest of f
    }
The declaration for local object inf provides arguments to the constructor. The compiler generates a call to the file constructor at the site of the declaration of inf. The compiler also generates a call to the file destructor applied to inf as part of the implicit function return at the end of function f.
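
Putting the pieces together, the class might be sketched in compilable form as follows. The private, undefined copy operations are an addition of mine, not part of the class as shown above; they keep two file objects from ever trying to close the same FILE. The helper can_open_for_reading is likewise hypothetical.

```cpp
#include <cstdio>

// A compilable version of the file class sketched above. Copying is
// disabled (an addition not in the text) so that two objects can never
// try to close the same FILE.
class file
    {
public:
    file(char const *name, char const *mode);
    ~file();
    bool is_open() const;
private:
    file(file const &);              // declared but not defined
    file &operator=(file const &);   // declared but not defined
    std::FILE *pf;
    };

file::file(char const *name, char const *mode)
    {
    pf = std::fopen(name, mode);
    }

file::~file()
    {
    if (pf != NULL)
        std::fclose(pf);
    }

bool file::is_open() const
    {
    return pf != NULL;
    }

// Hypothetical helper: reports whether the named file can be opened for
// reading. The destructor closes the file on every path out of the function.
bool can_open_for_reading(char const *name)
    {
    file inf(name, "r");
    return inf.is_open();
    }
```

Notice that can_open_for_reading contains no explicit call to fclose. The compiler-generated destructor call at the end of the function does that work.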

Unfortunately, even if you rewrite function f from Listing 1 using the file class, calling longjmp from function h terminates f without calling the destructor for inf. Both setjmp and longjmp are C functions that know nothing about C++ destructors. Strictly speaking, if calling longjmp terminates a function that has local objects with destructors, the program has undefined behavior. In this particular case, the undefined behavior usually manifests itself as a failure to call destructors, resulting in resource leaks.

Fortunately, the exception-handling facilities of C++ provide the simplicity of setjmp/longjmp with the additional assurance of automatic resource deallocation.

Throwing and catching exceptions
C++ uses a throw-expression to transfer control and information from the point where an exceptional condition is detected to an exception handler. A throw-expression generally has the form:

throw expression;
You can use the value of the expression to convey information to the handler about the nature of the exception. The throw operand can have almost any type. For example:

throw -1;
throws an exception of type int whose value is -1. Presumably, the -1 is a code that identifies the error. A throw-expression can throw a string literal, as in:

throw "can't open file";
which actually throws an exception of type "pointer to const char." In this case, the text of the string describes the error.

Although you can throw objects of primitive types, it's generally preferable to throw objects of class type, possibly containing data describing the nature of the exception. (Steve Dewhurst wrote a thoughtful essay explaining why [3].) The standard C++ header <stdexcept> defines several such classes with names such as invalid_argument, out_of_range, and overflow_error, all as members of namespace std.

C++ programs use handlers to intercept and respond to exceptions. A handler is also known as a "catch clause" because it has the form:


catch (exception-declaration)
    {
    statement-sequence
    }
A handler can appear only as part of a try-block of the form:

try
    {
    statement-sequence
    }
handler-sequence
For example, the try-block in:

void f()
    {
    try
        {
        g();
        h();
        }
    catch (std::out_of_range &e)
        {
        ...
        }
    }
says informally "Try calling g() and then h(). If something somewhere in either call throws an exception of type std::out_of_range, then handle it here."

If a function containing no handlers throws an exception, the program executes the destructors for local objects in that function and transfers control back to the function's caller in search of a handler. If that caller contains no handlers, the program destroys its local objects and terminates that function as well. This process of "unwinding the stack" continues, searching for a handler.

A handler's exception-declaration is very much like a function parameter declaration. For example:

catch (int e)
    {
    ...
    }
catches exceptions of type int, and:

catch (std::out_of_range &e)
    {
    ...
    }
catches exceptions of type std::out_of_range, passed by reference. The scope of a name declared in an exception-declaration is the body of the handler.

The handler-sequence in a try-block is a sequence of one or more handlers. If any statement in the try-block, or any statement in a function called directly or indirectly from that try-block, throws an exception, the program will try matching the type of the exception against the type declared in each handler's exception-declaration. If it finds a match, the program will continue executing with the matching handler. Otherwise, the program terminates the try-block as if it were any other block and continues unwinding the stack.

If an exception goes uncaught (the program tries to unwind the stack from main), the program calls the standard function std::terminate, declared as:
void terminate();
This function's default behavior is to call abort, typically after displaying a message of some form.

Listing 2 contains a variation of the program from Listing 1 rewritten using C++ exception handling. Function f in that listing uses a local file object, rather than a FILE *, to avoid a resource leak (failing to close the file) during unwinding. The main function contains two handlers: one to catch exceptions thrown from h, and the other to catch exceptions thrown from g and f.

Listing 2:
The program from Listing 1, rewritten using C++ exception handling.

#include <stdexcept>
#include <stdlib.h>

// class file is defined as shown earlier

class catastrophic_failure
    {
    ...
    };

// placeholder for whatever test detects the error
bool something_really_bad_happened();

void h(void)
    {
    if (something_really_bad_happened())
        throw catastrophic_failure();
    // do the rest of h
    }

// This function could still leak...
void g(size_t n)
    {
    char *pc = (char *)malloc(n);
    if (pc == NULL)
        // std::runtime_error requires a message argument
        throw std::runtime_error("can't allocate memory");
    h();
    // do the rest of g
    free(pc);
    }

// ... but this one won't
void f(char const *n)
    {
    file inf (n, "r");
    if (!inf.is_open())
        throw std::runtime_error("can't open file");
    g(42);
    // do the rest of f
    }

int main(void)
    {
    try
        {
        f("xyzzy");
        }
    catch (catastrophic_failure &e)
        {
        // handle the error
        }
    catch (std::runtime_error &e)
        {
        // handle the error
        }
    // do the rest of main
    return 0;
    }
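
To watch unwinding happen, you can give each function a local object whose destructor leaves a mark. The following is a minimal sketch along those lines. The class tracer, the string trace_log, and the function run_trace are hypothetical names invented for this illustration, and h here throws unconditionally.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical log for this sketch: records the order of events.
static std::string trace_log;

// Hypothetical class whose destructor announces itself in the log.
class tracer
    {
public:
    explicit tracer(char const *n): name(n) { }
    ~tracer() { trace_log += std::string("~") + name + " "; }
private:
    char const *name;
    };

void h()
    {
    tracer t("h");
    throw std::runtime_error("something really bad happened");
    }

void g()
    {
    tracer t("g");
    h();
    trace_log += "rest-of-g ";      // never reached
    }

void f()
    {
    tracer t("f");
    g();
    trace_log += "rest-of-f ";      // never reached
    }

// Calls f inside a try-block and returns the recorded sequence of events.
std::string run_trace()
    {
    trace_log.clear();
    try
        {
        f();
        }
    catch (std::runtime_error &)
        {
        trace_log += "caught";
        }
    return trace_log;
    }
```

The string returned by run_trace should read "~h ~g ~f caught": the destructors run innermost first as the stack unwinds, and only then does the handler execute. The remaining portions of g and f are bypassed, just as they were with longjmp, but this time nothing is left behind.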

Function g in Listing 2 still uses a char * to manage dynamically allocated memory, so it could leak memory during unwinding. One way to avoid the leak is to wrap the call to h in its own try-block, as in:

void g(size_t n)
    {
    char *pc;
    ...
    try
        {
        h();
        }
    catch (...)
        {
        free(pc);
        throw;
        }
    // do the rest of g
    free(pc);
    }

A handler with an ellipsis as its exception-declaration catches anything. This handler doesn't really handle the exception. It just releases memory that would otherwise be lost during unwinding, and then throws again. A throw-expression without an operand makes sense only while an exception is being handled, typically inside a handler; executed at any other time, it calls std::terminate. It's called a "rethrow" because it resumes throwing whatever the handler just caught.
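
Here is a minimal, compilable sketch of that pattern, showing that the original exception survives the trip through the catch-all handler. The flag cleanup_ran and the function message_after_rethrow are hypothetical, as are the message strings, and h throws unconditionally to keep the example short.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical flag for this sketch: set when the cleanup code runs.
static bool cleanup_ran = false;

void h()
    {
    throw std::runtime_error("something really bad happened");
    }

void g(std::size_t n)
    {
    char *pc = static_cast<char *>(std::malloc(n));
    if (pc == NULL)
        throw std::runtime_error("can't allocate memory");
    try
        {
        h();
        }
    catch (...)
        {
        std::free(pc);       // release the memory...
        cleanup_ran = true;
        throw;               // ...then resume throwing the same exception
        }
    std::free(pc);
    }

// Returns the message seen by the outer handler, or "" if nothing arrived.
std::string message_after_rethrow()
    {
    cleanup_ran = false;
    try
        {
        g(42);
        }
    catch (std::runtime_error &e)
        {
        return e.what();
        }
    return "";
    }
```

The outer handler still sees a std::runtime_error carrying the message that h supplied, even though the handler in g never knew the exception's type.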

A more common way to avoid the leak is to wrap the pointer inside a class with a destructor that frees the memory. This approach is generally more reliable, because the compiler takes care of calling the destructor automatically, whether the function returns normally or is terminated by throwing an exception.
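
One possible shape for such a class appears below. This is a sketch under my own assumptions, not a definitive design: the class name buffer, the counter live_buffers, and the function buffers_after_g are all hypothetical, and the constructor reports allocation failure by throwing std::bad_alloc.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical counter for this sketch: buffers currently allocated.
static int live_buffers = 0;

// Hypothetical wrapper that owns a block of malloc'ed memory.
class buffer
    {
public:
    explicit buffer(std::size_t n)
    :   pc(static_cast<char *>(std::malloc(n)))
        {
        if (pc == NULL)
            throw std::bad_alloc();
        ++live_buffers;
        }
    ~buffer()
        {
        std::free(pc);
        --live_buffers;
        }
    char *get() const { return pc; }
private:
    buffer(buffer const &);             // copying disabled:
    buffer &operator=(buffer const &);  // declared but not defined
    char *pc;
    };

void h(bool fail)
    {
    if (fail)
        throw std::runtime_error("something really bad happened");
    }

void g(std::size_t n, bool fail)
    {
    buffer b(n);
    h(fail);
    // do the rest of g, using b.get()
    }                                   // b's destructor frees the memory

// Returns the number of buffers still allocated after one trip through g.
int buffers_after_g(bool fail)
    {
    try
        {
        g(42, fail);
        }
    catch (std::runtime_error &)
        {
        // the memory has already been freed by the time control gets here
        }
    return live_buffers;
    }
```

Whether or not h throws, buffers_after_g should return zero, and g needs neither a try-block nor an explicit call to free. In current C++, standard library types such as std::vector and std::unique_ptr fill the same role.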

The existence of exceptions in C++ should change the way you think about resource management. It encourages you to encapsulate resource management inside classes with destructors, rather than managing dynamic resources ad hoc, as in C.

Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback: e-mail him at dsaks@wittenberg.edu.

Acknowledgments: Thanks to Steve Dewhurst and Joel Saks for their valuable assistance with this article.

Endnotes:
1. Saks, Dan. "Bail, return, jump, or ... throw?" Embedded Systems Design, March 2007, p. 11.

2. Saks, Dan. "More About C++ Classes," Embedded Systems Programming, April 2004, p. 53.

3. Dewhurst, Stephen C., C++ Gotchas. Addison-Wesley, 2003. See Item #64: Throwing String Literals.