Allocating objects vs. allocating storage

In my July column, I explained that Standard C and C++ offer somewhat different facilities for allocating and deallocating dynamic memory.[1] C provides a small collection of memory management functions: malloc, calloc, free, and realloc. Although C++ also provides these functions (for compatibility with C), C++ offers the new and delete operators as an arguably better alternative.

In C, you typically allocate dynamic memory for an object of type T by using an expression of the form:

pt = malloc(sizeof(T));   

where pt is presumably declared as a “pointer to T”. In C++, you typically use a new-expression instead of calling malloc, as in:

pt = new T;   

The differences between these two notations are not just superficial. C and C++ handle dynamic allocation in a fundamentally different way: whereas malloc allocates raw storage of indeterminate value, new can create objects of abstract types with coherent initial values. Using new-expressions instead of malloc reduces the possibility of runtime errors arising from questionable pointer conversions and improper initializations.

This month I'll explain how new-expressions interact with constructors and allocation functions in C++. I'll also explain how C programmers can employ a style of memory allocation that derives much of the benefit of using new-expressions.

New-expressions and constructors

In C++, a constructor is a special class member function that initializes objects of its class type. A constructor's function name is always the same as its class name, as in:

class widget
    {
public:
    widget(); // a constructor
    ...
    };

Constructors provide guaranteed initialization for class objects. Although you declare constructors when you define a class, you don't write calls to those constructors; the compiler generates them for you. Whenever you define an object with a class type, the compiler automatically plants a call to the object's constructor at the right place in the program.

For guaranteed initialization to really be guaranteed, the compiler must generate a call to a constructor wherever the source code creates an object, including in new-expressions. Thus, for a class type such as widget, a new-expression such as:

pw = new widget;   

doesn't just allocate storage for a widget; it applies widget's constructor to that storage to produce a properly constructed object.
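As a minimal sketch of that guarantee, assume a hypothetical widget whose constructor stores a known value (the member value, the accessor get, and the number 42 are illustrative choices, not part of the column's widget). The object is usable as soon as the new-expression completes:

```cpp
#include <cassert>

// Hypothetical widget whose constructor establishes a known initial state.
class widget
    {
public:
    widget() : value(42) { }        // a constructor
    int get() const { return value; }
private:
    int value;
    };

// Usage sketch: no separate initialization step follows the allocation.
void demo_new_widget()
    {
    widget *pw = new widget;        // allocates storage, then runs widget()
    assert(pw->get() == 42);        // already initialized by the constructor
    delete pw;                      // runs the destructor, then frees storage
    }
```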

The primary reason to avoid calling malloc in C++ is that doing so voids the initialization guarantee. Although the conventional C style for calling malloc uses a sizeof expression applied to a type, as in:

pw = malloc(sizeof(widget));   

the call actually just passes an integer (the size of the type in bytes). Since malloc never knows the type of whatever it's allocating, it always returns the address of the allocated storage as a void *. Thus, the compiler loses the compile-time type information it needs to choose a constructor.

As I explained in my previous column on dynamic allocation, the conventional C style for calling malloc provokes a warning or an error message when compiled as C++.[1] If you want the assignment to compile in C++, you must use a cast, as in:

pw = static_cast<widget *>(malloc(sizeof(widget)));

Casting the pointer result from void * to widget * doesn't affect the contents of the allocated storage. It just forces the compiler to yield to your request to change the pointer's type. Executing this statement leaves pw pointing to an uninitialized widget.

You could say that it's the cast, not malloc, that voids the initialization guarantee. I wouldn't argue with you. However, it's nonetheless difficult to use malloc in C++ without a cast, so the conclusion is still the same: prefer new-expressions to malloc calls.
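One way to observe the difference without reading uninitialized storage is to count constructor calls. The sketch below assumes a hypothetical widget with a static counter named constructed; neither the counter nor the two helper functions come from the column:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical widget that counts how many times its constructor runs.
class widget
    {
public:
    widget() { ++constructed; }
    static int constructed;         // constructor calls so far
    };

int widget::constructed = 0;

// Returns the number of constructor calls caused by a malloc-based allocation.
int constructions_from_malloc()
    {
    int before = widget::constructed;
    widget *pw = static_cast<widget *>(std::malloc(sizeof(widget)));
    int after = widget::constructed;    // unchanged: malloc knows nothing of widget
    std::free(pw);                      // free also accepts a null pointer
    return after - before;
    }

// Returns the number of constructor calls caused by a new-expression.
int constructions_from_new()
    {
    int before = widget::constructed;
    widget *pw = new widget;            // storage plus one constructor call
    int after = widget::constructed;
    delete pw;
    return after - before;
    }

void demo_allocation_styles()
    {
    assert(constructions_from_malloc() == 0);
    assert(constructions_from_new() == 1);
    }
```

The malloc version never touches the storage through pw, because no widget object has been constructed there.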

As with almost any other function, a constructor can have parameters, possibly many. For example, this widget class has a constructor with a single parameter of type int:

class widget
    {
public:
    widget(int n);  // a constructor
    ...
    };

Using this class, a new-expression such as:

pw = new widget;   

won't compile because widget's constructor now requires an argument, which this new-expression doesn't provide. In this case, you must provide a parenthesized argument list after the type name in the new-expression, as in:

pw = new widget (v);   

This new-expression passes v as the argument to the constructor for the newly-allocated widget object.

C++ lets you overload constructors. That is, you can declare more than one constructor within a single class. Each constructor in a set of overloaded constructors must have a sufficiently distinct parameter list so the compiler can tell them apart. (I explained function overloading in one of my earliest columns.[2])

For example, this widget class has four constructors:

class widget
    {
public:
    widget();
    widget(int i);
    widget(double d);
    widget(char const *p, size_t n);
    ...
    };

For each new-expression, the compiler selects the constructor whose parameter list is the best match for the constructor argument list. For example:

pw = new widget (4);   

uses the second constructor (the one that has a single parameter of type int), and:

pw = new widget ("xyzzy", 8);   

uses the fourth constructor (the one that has two parameters).

If there's no match, the compiler rejects the new-expression. For example:

pw = new widget (1, 2, 3);   

won't compile because no widget constructor accepts three arguments.
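To see the selection at work, the following sketch gives each of the four constructors a body that records which one ran. The enumeration ctor_kind, the member kind, and the helper kind_of are hypothetical names added purely for illustration:

```cpp
#include <cassert>
#include <cstddef>

class widget
    {
public:
    enum ctor_kind { from_nothing, from_int, from_double, from_chars };
    widget() : kind(from_nothing) { }
    widget(int) : kind(from_int) { }
    widget(double) : kind(from_double) { }
    widget(char const *, std::size_t) : kind(from_chars) { }
    ctor_kind kind;     // records which constructor ran (illustration only)
    };

// Reports which constructor built *pw, then deletes the widget.
widget::ctor_kind kind_of(widget *pw)
    {
    widget::ctor_kind k = pw->kind;
    delete pw;
    return k;
    }

void demo_overloads()
    {
    assert(kind_of(new widget) == widget::from_nothing);
    assert(kind_of(new widget (4)) == widget::from_int);
    assert(kind_of(new widget (4.0)) == widget::from_double);
    assert(kind_of(new widget ("xyzzy", 8)) == widget::from_chars);
    }
```

Adding a line such as kind_of(new widget (1, 2, 3)) makes the sketch fail to compile, for the reason given above.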

Even though non-class types such as int and double don't really have constructors, you can supply a single “constructor” argument when allocating a non-class object. For example:

pi = new int (42);   

allocates an int object and initializes it with the value 42. This new-expression has the same effect as if it were written as:

pi = new int;
*pi = 42;

Thus, the parenthesized argument list after the type in a new-expression is an initializer list, not merely a constructor argument list.

The initializer list in a new-expression can be empty. For example:

pw = new widget ();   

is valid. For a class type such as widget, which declares its own default constructor, an empty initializer list means the same as no list at all, as in:

pw = new widget;   

Either way, the new-expression invokes widget's default constructor, that is, the constructor that can be called without any arguments.

For non-class types, there's a subtle difference between these two forms of new-expression. In particular, if T is a non-class type, then:

pt = new T;   

leaves the allocated object uninitialized, while:

pt = new T ();   

initializes the object with zero (as if the object were statically allocated).
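A short sketch of the two initialized forms for int follows; the function names are hypothetical. The uninitialized form, new int with no list, is deliberately not read here, because the value of such an object is indeterminate:

```cpp
#include <cassert>

// Returns the value of an int allocated with an empty initializer list.
int value_from_empty_list()
    {
    int *pi = new int ();   // empty list: the int is initialized with zero
    int v = *pi;
    delete pi;
    return v;
    }

// Returns the value of an int allocated with a one-element initializer list.
int value_from_list(int n)
    {
    int *pi = new int (n);  // same effect as: pi = new int; *pi = n;
    int v = *pi;
    delete pi;
    return v;
    }

void demo_int_initializers()
    {
    assert(value_from_empty_list() == 0);
    assert(value_from_list(42) == 42);
    }
```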

New-expressions and operator new

A new-expression allocates memory by calling a function named operator new, rather than by calling malloc. Each C++ environment provides a default implementation for a global operator new, declared as:

void *operator new(std::size_t n)
    throw (std::bad_alloc);

As with malloc , the argument to operator new is the size (in bytes) of the storage request, and the return value is the address of the allocated storage. However, operator new reports failure differently from malloc .

Whereas malloc returns a null pointer if it can't allocate the requested storage, operator new throws an exception.[3] The exception specification:

throw (std::bad_alloc)   

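at the end of the function heading indicates that operator new will allow only exceptions of the standard type std::bad_alloc to propagate. That is, operator new may throw exceptions of various types, but it will catch those other exceptions before they can escape to the calling environment. Only exceptions of type std::bad_alloc may propagate from operator new to its caller. (Exception specifications of this form were deprecated in C++11 and removed in C++17. Later standards declare operator new without one, but the function still reports failure by throwing std::bad_alloc.)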
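The contrast with malloc's null return can be sketched as a small probe that calls operator new directly and turns the exception into a result. The function name can_allocate is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Tries to allocate n bytes. Returns true on success, or false if
// operator new reported failure by throwing std::bad_alloc.
bool can_allocate(std::size_t n)
    {
    try
        {
        void *p = operator new(n);  // throws rather than returning null
        operator delete(p);
        return true;
        }
    catch (std::bad_alloc const &)
        {
        return false;
        }
    }

void demo_small_request()
    {
    assert(can_allocate(64));       // a small request normally succeeds
    }
```

On typical implementations, an impossibly large request such as can_allocate(static_cast<std::size_t>(-1)) would be expected to return false, although the exact behavior depends on the platform's allocator.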

Thus, for a class type T , a new-expression such as:

pt = new T (v);   

translates more-or-less into something like:

pt = static_cast<T *>(operator new(sizeof(T)));
pt->T(v);

The first statement acquires storage for a T object by calling operator new and converts the address of that storage from type void * to type T *. The second initializes the storage by applying a constructor that will accept argument v. (That second statement, an explicit constructor call, is not something you can actually write in C++.)
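The closest thing you can legally write uses a placement new-expression, a form this column doesn't otherwise cover, to run the constructor on storage obtained separately. The following is only a sketch of the two steps under that assumption, with a hypothetical widget and hypothetical helpers make_widget and destroy_widget. It also releases the storage if the constructor throws, as a real new-expression does:

```cpp
#include <cassert>
#include <new>

class widget
    {
public:
    widget(int n) : value(n) { }
    int value;
    };

// Roughly what `new widget (v)` does, spelled out as two steps.
widget *make_widget(int v)
    {
    void *raw = operator new(sizeof(widget));   // step 1: acquire raw storage
    try
        {
        return new (raw) widget (v);            // step 2: construct in place
        }
    catch (...)
        {
        operator delete(raw);       // constructor failed: give storage back
        throw;
        }
    }

// Roughly what `delete pw` does: destroy the object, then free its storage.
void destroy_widget(widget *pw)
    {
    pw->~widget();
    operator delete(pw);
    }

void demo_two_step_allocation()
    {
    widget *pw = make_widget(7);
    assert(pw->value == 7);
    destroy_widget(pw);
    }
```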

Allocating objects in C

Although C doesn't have classes with constructors, you can approximate them, as many C programmers do, by using structs and functions.[4, 5] For example, you can implement a C++ widget class as a C struct:

typedef struct widget widget;
struct widget
    {
    // widget data members go here
    };

(The typedef immediately before the struct definition elevates the name widget from a mere tag to a full-fledged type name.[6])

You can implement each widget class member function in C++ as a non-member function in C whose first parameter is a pointer to the widget to be manipulated, possibly along with other parameters. For example, you might declare the C implementation of the widget default constructor as:

void widget_construct(widget *w);

Then you can mimic the behavior of a C++ new-expression such as:

pw = new widget;   

as the C statements:

pw = malloc(sizeof(widget));
if (pw != NULL)
    widget_construct(pw);

Even better, you can fold these statements into a single inline function:

inline
widget *new_widget()
    {
    widget *pw = malloc(sizeof(widget));
    if (pw != NULL)
        widget_construct(pw);
    return pw;
    }

Then you can construct a dynamically allocated widget with a default initial value using just:

pw = new_widget();   

which is a pretty good approximation for the C++ new-expression:

pw = new widget;   

Providing additional constructors is less convenient in C than it would be in C++, but still doable. For example, if you want to initialize a widget with an integer value, you need a constructor such as:

void widget_construct_i(widget *w, int i);

along with a widget allocator such as:

widget *new_widget_i(int i);

Then you can create a dynamically allocated widget and initialize it with an int by writing:

pw = new_widget_i(v);   

This is a pretty good approximation for the C++ new-expression:

pw = new widget (v);   

Bear in mind that, while new-expressions throw exceptions on allocation failures, the new_widget functions return NULL. Your C code should diligently check for possible NULL return values.
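Putting the pieces together, here is one possible shape for the int-initialized constructor and allocator, plus a caller that checks for NULL. It sticks to the subset common to C and C++ but is written so that it also compiles as C++, which is why malloc's result is cast; in a C source file the cast would be unnecessary. The data member value and the helper value_of_new_widget are hypothetical:

```cpp
#include <assert.h>
#include <stdlib.h>

typedef struct widget widget;
struct widget
    {
    int value;      /* hypothetical data member, for illustration only */
    };

/* "Constructor": initializes the widget that w points to. */
void widget_construct_i(widget *w, int i)
    {
    assert(w != NULL);
    w->value = i;
    }

/* "New-expression": allocates storage, then constructs if that worked. */
widget *new_widget_i(int i)
    {
    /* The cast is needed only because this sketch is compiled as C++. */
    widget *pw = (widget *)malloc(sizeof(widget));
    if (pw != NULL)
        widget_construct_i(pw, i);
    return pw;
    }

/* Caller-side pattern: returns the new widget's value, or fallback
   if the allocation failed. */
int value_of_new_widget(int i, int fallback)
    {
    widget *pw = new_widget_i(i);
    int v;
    if (pw == NULL)
        return fallback;    /* no widget was created */
    v = pw->value;
    free(pw);
    return v;
    }
```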

I'll have more to say about memory allocation and deallocation in future columns.

Dan Saks is president of Saks & Associates, a C/C++ training and consulting company.

1. Saks, Dan, “Dynamic allocation in C and C++,” Embedded Systems Design, July 2008, p. 11.
2. Saks, Dan, “Function Name Overloading,” Embedded Systems Programming, May 1999, p. 17.
3. Saks, Dan, “Throw and catch,” Embedded Systems Design, May 2007, p. 11.
4. Saks, Dan, “Abstract Types Using C,” Embedded Systems Programming, November 2003, p. 39.
5. Saks, Dan, “Incomplete Types as Abstractions,” Embedded Systems Programming, December 2003, p. 43.
6. Saks, Dan, “Tag vs. Type Names,” Embedded Systems Programming, October 2002, p. 7.
