Dynamic allocation in C and C++

I recently presented arguments for and against using dynamic memory allocation in C and C++ programs.1 I do agree that truly safety-critical systems should avoid using dynamic allocation because the associated risks outweigh the advantages. However, I strongly suspect that many other embedded systems could be improved by judiciously using dynamic memory allocation. In places where the standard C or C++ allocation and deallocation functions are less than ideal, a customized memory manager might work much better.

Ideally, a customized memory manager should look and act like a standard one to the greatest extent possible. Although the customized allocation policy will be different from the policy of a standard one (that's the whole point of using a custom scheme), the parameters and return values of the customized allocation and deallocation functions should be as similar as possible to the standard ones. This similarity lets you build and test your code using standard memory management functions and then slip customized functions in or out as needed with minimal fuss.

Understanding the ins and outs of dynamic memory management is especially worthwhile for C++ programmers. C++ offers many facilities–most notably classes with constructors and destructors–that dramatically diminish the incidence of memory leaks. C++ also lets you define allocation and deallocation functions for each class, making it remarkably easy to insert customized memory managers into existing code. You can even use allocation functions to place objects representing device registers at memory-mapped locations.

This month, I'll contrast the standard allocation and deallocation facilities in C with those in C++. Understanding the differences between these facilities is instructive regardless of which language you use.

The standard C functions
Standard C provides two memory allocation functions, malloc and calloc, and one deallocation function, free. A fourth function, realloc, does both deallocation and allocation. All four functions are declared in the standard header <stdlib.h>.

The storage for an object allocated using an allocation function (malloc, calloc, or realloc) remains allocated until the program passes the object's address as an argument to a deallocation function (free or realloc). Attempting to deallocate storage that isn't currently allocated produces undefined behavior. (I explained the distinctions among undefined, unspecified, and implementation-defined behavior in earlier columns.)1, 2

The standard declares malloc as:

void *malloc(size_t size);   

Calling malloc(s) allocates storage for an object whose size is s bytes. If the allocation succeeds, the call returns a pointer to the allocated storage. If it fails, the call returns a null pointer.

Parameter size has type size_t. As I explained in a column last year, size_t is a typedef declared in <stddef.h> as well as several other headers. The type is an alias for some unsigned integer type, typically unsigned int or unsigned long, or possibly even unsigned long long.3 Each Standard C implementation is supposed to choose the unsigned integer type that's big enough (but no bigger than needed) to represent the size of the largest possible object on the target platform.

The argument in a typical call to malloc is a sizeof expression. (A sizeof expression yields a value of type size_t.) For example, if p is a pointer to a widget, then:

p = malloc(sizeof(widget));   

assigns p the address of a dynamically allocated widget. Or:

p = malloc(10 * sizeof(widget));   

assigns p the address of the first element in a dynamically allocated array of 10 widgets.
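
As a minimal sketch of that calling pattern, the following function allocates an array with malloc, checks for a null return, and writes to the storage before anything reads it. The widget members and the name make_widgets are hypothetical, chosen only for illustration. The sketch is written as C++, so it needs a cast on the result of malloc that the C form shown above does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the article's widget type.
struct widget {
    int id;
    double weight;
};

// Allocates an array of n widgets with malloc and gives each one an id.
// Returns a null pointer if the allocation fails.
widget *make_widgets(std::size_t n) {
    assert(n > 0);  // zero-sized requests are left out of this sketch
    // The cast is needed only because this sketch is compiled as C++.
    widget *p = static_cast<widget *>(std::malloc(n * sizeof(widget)));
    if (p == nullptr) {
        return nullptr;  // allocation failed; there is nothing to free
    }
    for (std::size_t i = 0; i < n; ++i) {
        // The storage holds indeterminate values until it is written.
        p[i].id = static_cast<int>(i);
        p[i].weight = 0.0;
    }
    return p;
}
```

The caller owns the returned array and eventually passes it to free.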

The calloc function is an alternative to malloc declared in the standard as:

void *calloc(size_t nmemb, size_t size);   

Calling calloc(n, s) allocates storage for an array of n objects, each of which is s bytes in size. As with malloc, a call to calloc returns a pointer to the allocated storage if the allocation succeeds and returns a null pointer otherwise. Thus, to allocate an array of 10 widgets, you can call either:

p = malloc(10 * sizeof(widget));   

or:

p = calloc(10, sizeof(widget));   

The only substantive difference between these expressions is the value of the allocated storage. After calling malloc, the value in the allocated storage is indeterminate, which is the standard's way of saying the storage contains whatever residual values were left there by the previous occupant. After calling calloc, all bits in the allocated storage are set to zero.
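
The zero-fill guarantee can be checked directly. The sketch below, written as C++ with hypothetical helper names, allocates an array of unsigned int with calloc and inspects every byte. No comparable check is offered for malloc, because reading storage with indeterminate values is not something a program should rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Returns true if every byte in the n-byte region at p is zero.
bool all_bytes_zero(const void *p, std::size_t n) {
    assert(p != nullptr);
    const unsigned char *bytes = static_cast<const unsigned char *>(p);
    for (std::size_t i = 0; i < n; ++i) {
        if (bytes[i] != 0) {
            return false;
        }
    }
    return true;
}

// Allocates n unsigned ints with calloc and reports whether the storage
// came back with all bits zero. Returns false if the allocation fails.
bool calloc_is_zeroed(std::size_t n) {
    unsigned *p = static_cast<unsigned *>(std::calloc(n, sizeof(unsigned)));
    if (p == nullptr) {
        return false;
    }
    bool zeroed = all_bytes_zero(p, n * sizeof(unsigned));
    std::free(p);
    return zeroed;
}
```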

The realloc function reallocates an object as an object of a possibly different size. The standard declares it as:

void *realloc(void *ptr, size_t size);   

If p is a null pointer, then calling realloc(p, s) yields the same result as calling malloc(s). Otherwise, the call deallocates the “old” object whose address is p and returns a pointer to a “new” object with size s. The contents of the new object will be the same as what was in the old object prior to the call, up to the lesser of the old object's size and the new object's size. Any bytes in the new object beyond the size of the old object will have indeterminate values.

As with the other allocation functions, a call to realloc returns a pointer to the allocated storage if the allocation succeeds; otherwise, it returns a null pointer without deallocating the old object.
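
Because a failed realloc leaves the old object allocated, assigning the result directly to the only pointer to that object would leak it on failure. One common way to avoid that, sketched here in C++ with a hypothetical helper named resize_ints, is to capture the result in a temporary first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Resizes the int array at *buf to new_count elements.
// On success, updates *buf and returns true. On failure, leaves *buf
// pointing at the old array, which is still allocated, and returns false.
bool resize_ints(int **buf, std::size_t new_count) {
    assert(buf != nullptr && new_count > 0);
    // Use a temporary: writing the result straight into *buf would lose
    // the only pointer to the old array if realloc returned null.
    void *tmp = std::realloc(*buf, new_count * sizeof(int));
    if (tmp == nullptr) {
        return false;
    }
    *buf = static_cast<int *>(tmp);
    return true;
}
```

Starting from a null pointer, the first call behaves like malloc; later calls preserve the existing elements up to the smaller of the old and new sizes.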

Inasmuch as size_t is always an unsigned type, malloc and calloc always interpret their size arguments as non-negative values. However, those values might be zero. For example, if the n in:

p = malloc(n * sizeof(widget));   

is zero, the call attempts to allocate a zero-sized array. Whether the call returns a null or non-null pointer is implementation-defined. In either event, dereferencing that pointer would yield undefined behavior.

The standard declares free as:

void free(void *ptr);   

If p is a null pointer, calling free(p) does nothing. Otherwise, the call deallocates the storage for the object addressed by p.
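
These two rules, that a zero-sized request has an implementation-defined result and that freeing a null pointer does nothing, suggest one possible shape for a small owning type. The buffer struct and its functions below are hypothetical, written as C++. The sketch sidesteps the zero-size question by not calling malloc at all for an empty request, and it nulls the pointer after free so that a second release is harmless instead of a double deallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical byte buffer that owns storage obtained from malloc.
struct buffer {
    unsigned char *data;
    std::size_t size;
};

// Gives b storage for size bytes. A zero-sized request skips malloc, so
// b->data is reliably null instead of implementation-defined.
// Returns false only if a non-zero request fails.
bool buffer_init(buffer *b, std::size_t size) {
    assert(b != nullptr);
    b->data = nullptr;
    b->size = 0;
    if (size == 0) {
        return true;
    }
    b->data = static_cast<unsigned char *>(std::malloc(size));
    if (b->data == nullptr) {
        return false;
    }
    b->size = size;
    return true;
}

// Releases b's storage. Safe on an empty or already-released buffer.
void buffer_release(buffer *b) {
    assert(b != nullptr);
    std::free(b->data);  // does nothing when b->data is null
    b->data = nullptr;   // keeps a later call from freeing twice
    b->size = 0;
}
```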

New-expressions and delete-expressions in C++
The Standard C allocation and deallocation functions are also available in C++, but C code using the allocation functions doesn't always compile in C++. For example, if p is a pointer to a widget, the conventional C usage:

p = malloc(sizeof(widget));   

provokes a compile-time diagnostic (either a warning or an error message) when compiled as C++. The complaint is that the expression attempts an invalid conversion from void * to widget *. As I explained in my most recent online-only column, the conversion is valid in C, but not in C++.4

If you want the allocation to compile in C++, you must use a cast, as in:

p = (widget *)malloc(sizeof(widget));   

Using a “new-style” cast is arguably better:5

p = static_cast<widget *>(malloc(sizeof(widget)));

Even better still, C++ avoids the need to use casts in memory allocation expressions by providing a memory allocation operator, called new, which handles the type conversion implicitly. A new-expression of the form new T allocates storage for a T object and returns a T * pointing to that object. For example, the new-expression in:

p = new widget;   

allocates storage for a widget object and returns a widget * pointing to that object. If p is a pointer to a widget, the assignment compiles without complaint.

As I described earlier, in C you can allocate an array of n objects of type T using either:

p = malloc(n * sizeof(T));   

or:

p = calloc(n, sizeof(T));   

In C++, you allocate an array using an array new-expression of the form new T [n]. This expression returns a T * pointing to the first element in the allocated array.

Note that both new T and new T [n] return a T *. In C++, as in C, a pointer to T can point to either a single T object or to the first element in an array of T objects.
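
The types involved can be confirmed at compile time. The sketch below assumes a C++11 compiler and a hypothetical widget struct. The operand of decltype is never evaluated, so nothing is actually allocated.

```cpp
#include <cassert>
#include <cstdlib>
#include <type_traits>

// Hypothetical stand-in for the article's widget type.
struct widget {
    int id;
};

// malloc always yields void *, whatever the caller intends to store.
constexpr bool malloc_yields_void_pointer =
    std::is_same<decltype(std::malloc(sizeof(widget))), void *>::value;

// Both forms of new-expression yield a pointer to the element type.
constexpr bool new_yields_typed_pointer =
    std::is_same<decltype(new widget), widget *>::value &&
    std::is_same<decltype(new widget[10]), widget *>::value;

static_assert(malloc_yields_void_pointer, "malloc returns void *");
static_assert(new_yields_typed_pointer, "new T and new T[n] return T *");
```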

In C, you use the standard free function to deallocate storage acquired by calling malloc, calloc, or realloc. In C++, you use the delete operator to deallocate storage acquired by new. For example, the delete-expression in:

p = new T;
...
delete p;

deallocates a single object of type T.

C++ provides an alternative form of delete-expression for deallocating arrays. An array delete-expression uses the keyword delete followed by a set of empty square brackets, as in:

p = new T [n];
...
delete [] p;

This array delete-expression deallocates the array to which p points.

As with free, deleting a null pointer is harmless. This is true for both delete-expressions and array delete-expressions.

If you allocate a single object using a (non-array) new-expression, you shouldn't try to deallocate it with an array delete-expression, as in:

p = new T;
...
delete [] p;    // error

Similarly, if you allocate an array using an array new-expression, you shouldn't try to deallocate it with a (non-array) delete-expression. For example:

p = new T [n];
...
delete p;       // error

A C++ compiler often can't detect these errors. A pointer to T can point to either a single T object or to the first element in an array of T objects. If the new- and delete-expressions are in different scopes, the compiler can't tell from the pointer operand in a delete-expression whether it points to a single object or an array element. It relies on the presence or absence of the [] to indicate what the pointer is pointing to. If you get it wrong, the delete-expression has undefined behavior. As with other undefined behaviors, the program may do what you were hoping it would do, but don't count on it.
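
For contrast with the two erroneous fragments, here is a small sketch in which each allocation is released with the matching form. The sample struct and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type used only to have something to allocate.
struct sample {
    int value;
};

// Allocates one sample and an array of n samples, sums 1..n into the
// single object, and releases each allocation with the matching form.
int sum_with_matched_deletes(std::size_t n) {
    sample *one = new sample;      // single object
    sample *many = new sample[n];  // array of n objects
    one->value = 0;
    for (std::size_t i = 0; i < n; ++i) {
        many[i].value = static_cast<int>(i) + 1;
        one->value += many[i].value;
    }
    int total = one->value;
    delete [] many;  // array form matches new sample[n]
    delete one;      // non-array form matches new sample
    return total;
}
```

Both pointers have type sample *, so nothing in their types records which form is required. Keeping each new-expression and its delete-expression close together, as here, is one way to make the pairing easy to verify by eye.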

In general, C++ applies stricter type checking than does C in an effort to weed out more errors at compile time, but here we have a situation where C++ invites errors that you just can't make in C:

• In C++, it's the programmer's responsibility to choose the correct form for each delete-expression (for a single object or array) by including or omitting [].

• In C, you simply call free(p) whether p points to a single object or to an array.

If C++ really did place a greater premium on type safety than C does, then wouldn't C++ simply avoid these errors by providing just one form for delete-expressions?

Although C++ places great emphasis on type safety, it's not to the exclusion of other concerns, such as performance. In this case, C++ trades a little type safety for potentially significant performance improvements in memory allocation. As I'll explain in an upcoming article, this design for new- and delete-expressions makes it easier to implement and use efficient customized memory managers.

Initializing allocated objects
In C++, a class constructor is a special class member function that initializes objects of its class type. A constructor's function name is always the same as its class name, as in:

class widget
    {
public:
    widget();       // a constructor
    ...
    };

Constructors provide guaranteed initialization for class objects. You don't write calls to constructors–the compiler generates them for you. Whenever you define an object with a class type, the compiler automatically plants a call to the object's constructor at the right place in the program.

For guaranteed initialization to really be guaranteed, the compiler must generate a call to the appropriate constructor wherever the source code creates an object, including in new-expressions. Thus, for a class type such as widget, a new-expression such as in:

p = new widget;   

doesn't just allocate storage for a widget; it applies a constructor to that storage to produce a properly constructed widget object.

C and C++ handle dynamic allocation in fundamentally different ways: whereas malloc just allocates storage of indeterminate value, a new-expression can create a properly initialized object.
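
One way to observe the difference is to count constructor calls. In the sketch below, the static counter and the two helper functions are hypothetical additions to a widget class. Allocating raw storage with malloc leaves the counter unchanged, while a new-expression increments it.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical widget whose constructor records that it ran.
class widget {
public:
    widget() : serial_(++constructed) {}
    int serial() const { return serial_; }
    static int constructed;  // number of widgets constructed so far
private:
    int serial_;
};

int widget::constructed = 0;

// Returns how many constructors ran while allocating storage with malloc.
int constructions_via_malloc() {
    int before = widget::constructed;
    void *raw = std::malloc(sizeof(widget));  // storage only; no constructor
    std::free(raw);
    return widget::constructed - before;
}

// Returns how many constructors ran while evaluating a new-expression.
int constructions_via_new() {
    int before = widget::constructed;
    widget *p = new widget;  // allocates, then calls widget::widget()
    delete p;
    return widget::constructed - before;
}
```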

What about using calloc instead of malloc? Doesn't calloc initialize objects to zero? Indeed, for some types, setting all bits to zero is a reasonable and useful initial value, but for many class types, it isn't. In fact, as the C standard notes, all bits zero might not even be the proper representation for a floating-point zero or a null pointer.

C++ classes use special member functions called destructors to provide automatic resource deallocation. A destructor's name is the same as the constructor's name, but with a ~ in front of the name, as in:

class widget
    {
public:
    widget();       // a constructor
    ~widget();      // a destructor
    ...
    };

Just as new-expressions invoke constructors, delete-expressions invoke destructors. I will elaborate the connection between new-expressions and constructors, and between delete-expressions and destructors, in a forthcoming column.
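
As a brief sketch of that pairing, the hypothetical class below keeps a count of live objects. An array new-expression raises the count once per element, and the matching array delete-expression brings it back down before the storage is released.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class that tracks how many of its objects currently exist.
class tracked {
public:
    tracked() { ++live; }
    ~tracked() { --live; }
    static int live;  // number of tracked objects alive right now
};

int tracked::live = 0;

// Creates and destroys an array of n tracked objects. Stores the live
// count observed while the array existed in *peak and returns the live
// count after the array has been deleted.
int live_after_array_round_trip(std::size_t n, int *peak) {
    assert(peak != nullptr);
    tracked *p = new tracked[n];  // runs n constructors
    *peak = tracked::live;
    delete [] p;                  // runs n destructors, then frees storage
    return tracked::live;
}
```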

Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback.

Endnotes:
1. Saks, Dan. “The yin and yang of dynamic allocation,” Embedded Systems Design, May 2008, p. 12. Available online at www.embedded.com/207402546.

2. Saks, Dan. “As Precise as Possible,” Embedded Systems Programming, April 2002, p. 43. Available online at www.embedded.com/9900563.

3. Saks, Dan. “Why size_t matters,” Embedded Systems Design, July 2007, p. 13. Available online at www.embedded.com/200900195.

4. Saks, Dan. “Into, but not out of, the void,” Embedded.com, June 2008. Available online at www.embedded.com/208403407.

5. Saks, Dan. “Cast with caution,” Embedded Systems Design, July 2006, p. 15. Available online at www.embedded.com/191600535.
