Allocating arrays - Embedded.com

In my September column, I discussed the distinction between allocating objects and allocating storage.1 A C or C++ expression such as:

pt = (T *)malloc(sizeof(T));   

allocates storage that's big enough and suitably aligned to hold an object of type T. However, it leaves the allocated storage uninitialized, so it doesn't actually create a T object in that storage. In contrast, a C++ expression of the form:

pt = new T ();   

creates a bona fide T object with a coherent initial value.

The distinction between objects and raw storage carries over when allocating array objects. C++ has native facilities to support this distinction. With a little extra effort, C programmers can preserve the distinction as well.

Allocating array storage in C
Calls to malloc commonly use a sizeof expression to specify the size in bytes of the requested storage. To allocate storage for an array, just multiply the size of each array element by the array dimension. For example:

pw = malloc(10 * sizeof(widget));   

assigns pw the address of the first widget in storage allocated for an array of 10 widgets.

The Standard C library provides calloc as an alternative way to allocate arrays. Calling calloc(n, s) allocates storage for an array of n objects, each of which is s bytes in size. As with malloc, a call to calloc returns a pointer to the allocated storage if the allocation succeeds, and returns a null pointer otherwise.

For example, to allocate an array of 10 widgets, you can call:

pw = calloc(10, sizeof(widget));   

instead of calling:

pw = malloc(10 * sizeof(widget));   

The difference between these expressions is in the resulting value of the allocated storage. As mentioned earlier, calling malloc leaves the allocated storage uninitialized. Calling calloc sets all the bits in the allocated storage to zero.
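The zero-fill behavior of calloc can be observed with a small check. This is only a sketch, written in C++ so that it can sit alongside the later examples (in C, the cast on calloc's result is unnecessary). The helper name calloc_yields_zero_bytes is hypothetical, and the check inspects the raw bytes of the storage rather than the values of any objects in it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocate storage for n ints with calloc and
// report whether every byte of that storage is zero.
bool calloc_yields_zero_bytes(std::size_t n)
{
    assert(n != 0);
    void *storage = std::calloc(n, sizeof(int));
    if (storage == nullptr)
        return false;               // allocation failed; nothing to inspect
    const unsigned char *bytes = static_cast<const unsigned char *>(storage);
    bool all_zero = true;
    for (std::size_t i = 0; i != n * sizeof(int); ++i)
        if (bytes[i] != 0)
            all_zero = false;
    std::free(storage);
    return all_zero;
}
```

No comparable check is shown for malloc, because reading storage that malloc has just returned yields indeterminate values.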

In my previous column on allocating objects,1 I explained that a C++ new-expression can have an empty initializer list, as in:

p = new T ();   

When T is a class type, this new-expression invokes T's default constructor (the constructor that can be called without any arguments). When T is a non-class type, such as an integer or pointer type, the new-expression initializes the object with zero (just as if the object were statically allocated).

You can use calloc to initialize objects somewhat akin to the way a C++ new-expression does, but only to a very limited extent. For example, an expression such as:

pi = calloc(1, sizeof(int));   

allocates storage for a single int object and initializes that object to zero. The net effect is essentially the same as the effect of the new-expression:

pi = new int ();   

For objects of integer types (including boolean and character types), setting all bits to zero in the storage for an object has the same effect as assigning zero to that object. However, while this is almost always the case for pointer and floating-point types, the standards for C and C++ make no such guarantee.2,3 In fact, the C Standard specifically cautions against assuming that the representations of null pointer constants and floating-point zeros have all bits zero.

Thus, given this C code:

double d0 = 0.0;
double d1;
memset(&d1, 0, sizeof(d1));

the comparison d0 == d1 might not yield a true result. Similarly, given this snippet of C++:

double *p0, *p1;
p0 = new double ();
p1 = (double *)calloc(1, sizeof(double));

the comparison *p0 == *p1 might not yield true. In C, you shouldn't rely on calloc to properly initialize dynamically allocated objects for anything other than integer types. In C++, you probably shouldn't use calloc at all because new-expressions do the job much better.
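One portable alternative to relying on all-bits-zero storage is to assign the zero value to each element explicitly. The following is a minimal sketch, shown in C++ for consistency with the other examples; the same loop works in C with <stdlib.h>. The name alloc_zero_doubles and the overflow guard are assumptions added for illustration, not something taken from either standard's text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: allocate storage for n doubles and assign 0.0 to
// each one, rather than depending on calloc's all-bits-zero storage.
// Returns a null pointer if the size computation would overflow or the
// allocation fails.
double *alloc_zero_doubles(std::size_t n)
{
    if (n > SIZE_MAX / sizeof(double))
        return nullptr;             // n * sizeof(double) would overflow
    double *pd = static_cast<double *>(std::malloc(n * sizeof(double)));
    if (pd != nullptr)
        for (std::size_t i = 0; i != n; ++i)
            pd[i] = 0.0;            // assignment, not memset
    return pd;
}
```

Each element then compares equal to 0.0 whatever bit pattern the platform uses for a floating-point zero. The caller releases the storage with free.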

Allocating array objects in C++
In C++, you allocate arrays using array new-expressions of the form new T [n]. Such expressions return a T * pointing to the first element in the allocated array. For example:

widget *pw;
...
pw = new widget [42];

allocates an array of 42 widgets. The array dimension need not be a constant expression; it can be a computed value, as in:

size_t n;
// compute a value for n
pw = new widget [n];

As I explained in my previous column on allocating objects,1 an ordinary (non-array) new-expression allocates memory by calling a function named operator new. An array new-expression uses a different function with the similar name operator new [], pronounced as “operator new square bracket” or just “operator new bracket”. The C++ standard refers to any function named either operator new or operator new [] as an allocation function.

Each C++ environment provides a default implementation for a global operator new [], declared as:

void *operator new[](std::size_t n)
    throw (std::bad_alloc);

The parameter list and return type are the same as for operator new: the parameter is the size (in bytes) of the storage request, and the return value is the address of the allocated storage. Like operator new, operator new [] reports an allocation failure by throwing an exception of the standard type std::bad_alloc.

For each new-expression, the compiler calculates the size of the storage request based on the allocated type. For a non-array new-expression of the form:

new T;   

the allocated type is T and the size of the request is simply sizeof(T). For an array new-expression of the form:

new T [n];   

the allocated type is T [n] (array with n elements of type T), and the size of the request is at least n * sizeof(T). That is, the compiler may ask for additional bytes, so the actual size of the request might be n * sizeof(T) + k for some small positive k. The runtime system uses those additional k bytes to store information, such as the array dimension, that it needs to delete the array.

I believe that many C++ compilers use the same allocation algorithm for the default implementations of both operator new and operator new []. However, you can define your own allocation functions to replace the ones provided by the compiler. If you wish, you can employ one allocation scheme for individual objects and a different allocation scheme for arrays. You don't have to rewrite or even recompile your code to take advantage of a new allocation scheme. Just relink your code with a different compiled definition for operator new and/or operator new [], and the existing new-expressions will employ the new allocation function(s) automatically.
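As a rough illustration of replacing the array allocation function, the sketch below defines a global operator new [] that forwards to malloc and records each request. It is written in current C++ (no dynamic exception specification), it omits the new-handler loop that a complete replacement would include, and the names array_new_calls, last_array_request, and tracked are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Illustrative bookkeeping (hypothetical names): how many times the
// replacement ran and how many bytes the latest request asked for.
static std::size_t array_new_calls = 0;
static std::size_t last_array_request = 0;

// Replacement for the global array allocation function.
void *operator new[](std::size_t n)
{
    ++array_new_calls;
    last_array_request = n;
    void *p = std::malloc(n != 0 ? n : 1);   // never request 0 bytes
    if (p == nullptr)
        throw std::bad_alloc();
    return p;
}

// Matching replacements for the array deallocation functions.
void operator delete[](void *p) noexcept { std::free(p); }
void operator delete[](void *p, std::size_t) noexcept { std::free(p); }

// Hypothetical element type with a non-trivial destructor, so that the
// implementation may need to remember the array dimension.
struct tracked
{
    int value;
    ~tracked() { value = 0; }
};
```

With these definitions linked in, an existing expression such as new tracked [4] calls the replacement without any change to the calling code, and last_array_request shows whether the compiler asked for more than 4 * sizeof(tracked) bytes. Whether it does, and by how much, depends on the implementation.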

Each new-expression is conceptually, if not actually, a two-step process: (1) allocate storage for an object, and (2) initialize that object. For objects of class types, initializing the objects involves calling a constructor. For objects of non-class type, initialization may involve copying an initial value into the storage, or it may involve nothing at all. For example:

pi = new int (13);   

stores a 13 into the allocated integer, whereas:

pi = new int;   

leaves the allocated integer with an indeterminate value.

An array new-expression is also a two-step process, but the second step is a bit more elaborate than in a non-array new-expression. Whereas a (non-array) new-expression initializes a single object, an array new-expression initializes every array element.

For example, if widget is a class type with a default constructor, then an array new-expression, such as either:

pw = new widget [n];   

or:

pw = new widget [n] ();   

translates more-or-less into something like:

pw = static_cast<widget *>
    (operator new [](n * sizeof(widget)));
for (widget *p = pw; p != pw + n; ++p)
    p->widget();

The first statement acquires storage for an array of n widgets by calling operator new [], and assigns the address of that storage to pw. The subsequent loop applies the default constructor to each widget in that array of widgets. (The statement in the body of the loop, an explicit constructor call, is not something you can actually write in C++.)
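Although an explicit constructor call can't be written directly, a placement new-expression expresses the same idea in legal C++. The following is a sketch under assumptions, not the code a compiler actually generates: the widget class here is a hypothetical stand-in that counts live objects, and make_widgets and destroy_widgets are invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the widget class; it counts live objects so
// that the effect of the construction loop is observable.
struct widget
{
    static int live;
    widget() { ++live; }
    ~widget() { --live; }
};
int widget::live = 0;

// Step 1: acquire raw storage. Step 2: construct each element in place.
// (No overflow check on n * sizeof(widget), as in the translation above.)
widget *make_widgets(std::size_t n)
{
    void *raw = operator new[](n * sizeof(widget));
    widget *pw = static_cast<widget *>(raw);
    std::size_t built = 0;
    try
    {
        for (; built != n; ++built)
            ::new (static_cast<void *>(pw + built)) widget;
    }
    catch (...)
    {
        // A constructor threw: destroy what was built, then release.
        while (built != 0)
            pw[--built].~widget();
        operator delete[](raw);
        throw;
    }
    return pw;
}

// Reverse of make_widgets: destroy the elements in reverse order, then
// release the storage.
void destroy_widgets(widget *pw, std::size_t n)
{
    while (n != 0)
        pw[--n].~widget();
    operator delete[](pw);
}
```

An array built this way must be released with destroy_widgets rather than with delete [], because the sketch records no array dimension for the runtime system to consult.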

When the element type is not a class type, an array new-expression with an empty initializer list initializes each array element with zero (converted to the element type). Thus, an array new-expression such as:

pi = new int [n] ();   

translates more-or-less into:

pi = static_cast<int *>
    (operator new [](n * sizeof(int)));
for (int *p = pi; p != pi + n; ++p)
    *p = 0;
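The zero-initialization performed by an empty initializer list can be checked for several non-class element types at once. The sketch below is illustrative only; the function template name value_initialized_elements_are_zero is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: allocate n elements of a non-class type T using
// an empty initializer list, and report whether each element compares
// equal to zero converted to T.
template <typename T>
bool value_initialized_elements_are_zero(std::size_t n)
{
    T *p = new T [n] ();
    bool ok = true;
    for (std::size_t i = 0; i != n; ++i)
        if (p[i] != static_cast<T>(0))
            ok = false;
    delete [] p;
    return ok;
}
```

Because the comparison is against the value zero rather than against a bit pattern, the check holds for integer, floating-point, and pointer element types alike.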

When the element type is not a class type, an array new-expression with no initializer list leaves each array element uninitialized. Thus, an array new-expression such as:

pi = new int [n];   

translates more-or-less into just:

pi = static_cast<int *>
    (operator new [](n * sizeof(int)));

The current (2003) C++ standard doesn't allow an array new-expression to have a non-empty initializer list. For example, an expression such as:

pi = new int [n] (7);   // compile error   

won't compile. This means you can't allocate an array of ints and initialize all the elements to the same non-zero value. The new draft of the C++ standard includes other initialization options, but not this one.
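One way to get the same effect is to allocate the array first and then fill it as a separate step. This is a minimal sketch using std::fill_n from the standard library; the function name new_filled_ints is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: allocate n ints, then set every element to value.
int *new_filled_ints(std::size_t n, int value)
{
    int *pi = new int [n];          // elements start out uninitialized
    std::fill_n(pi, n, value);      // second step: assign each element
    return pi;
}
```

A call such as new_filled_ints(n, 7) then stands in for the rejected expression, and the caller releases the array with delete [] as usual.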

Allocating array objects in C
In my previous column on allocating objects,1 I showed that you can approximate classes with constructors by using structs and functions. For example, you can implement a C++ widget class as a C struct:

typedef struct widget widget;
struct widget
{
    // widget data members go here
};

with a “default constructor” implemented as a C function:

void widget_construct(widget *w);   

You can mimic the behavior of a C++ new-expression by using a single inline function:

inline
widget *new_widget()
{
    widget *pw = malloc(sizeof(widget));
    if (pw != NULL)
        widget_construct(pw);
    return pw;
}

Then you can construct a dynamically allocated widget with a default initial value using just:

pw = new_widget();   

which is a pretty good approximation for the C++ new-expression:

pw = new widget;   

You can extend this approach to include a function that allocates properly-initialized arrays:

widget *new_widget_array(size_t n)
{
    widget *pw = malloc(n * sizeof(widget));
    if (pw != NULL)
    {
        widget *p;
        for (p = pw; p != pw + n; ++p)
            widget_construct(p);
    }
    return pw;
}

Then you can create a dynamically-allocated array of default-initialized widgets by writing:

pw = new_widget_array(n);   

This is a pretty good approximation for the C++ array new-expression:

pw = new widget [n];   

Bear in mind that this new_widget_array function indicates allocation failures by returning NULL. Your C code should diligently check for possible NULL return values.
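A caller that performs this check might look like the sketch below. It is compiled as C++ here to match the other added examples, which is why malloc's result is cast; in C the cast is unnecessary. The id member, the body of widget_construct, the overflow guard, and the sum_widget_ids function are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct widget
{
    int id;                         // hypothetical data member
};

// Hypothetical "default constructor" body.
void widget_construct(widget *w)
{
    w->id = 0;
}

// The array allocator, plus a guard (an addition for this sketch)
// against n * sizeof(widget) overflowing.
widget *new_widget_array(std::size_t n)
{
    if (n > SIZE_MAX / sizeof(widget))
        return NULL;
    widget *pw = static_cast<widget *>(std::malloc(n * sizeof(widget)));
    if (pw != NULL)
        for (widget *p = pw; p != pw + n; ++p)
            widget_construct(p);
    return pw;
}

// Hypothetical caller: checks for NULL before touching the array and
// frees the storage when done. Returns false if the allocation failed.
bool sum_widget_ids(std::size_t n, long *sum)
{
    widget *pw = new_widget_array(n);
    if (pw == NULL)
        return false;
    *sum = 0;
    for (std::size_t i = 0; i != n; ++i)
        *sum += pw[i].id;
    std::free(pw);
    return true;
}
```

Reporting failure through the return value keeps the NULL check in one place, so code further up the call chain tests a bool instead of a pointer.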

Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback.

Endnotes:
1. Saks, Dan, “Allocating objects vs. allocating storage,” Embedded Systems Design, September, 2008, p. 11.

2. ISO/IEC Standard 9899:1999, Programming languages–C.

3. ISO/IEC Standard 14882:2003, Programming languages–C++.
