Variations on a flexible array theme - Embedded.com

Variations on a flexible array theme

Some months ago, I described how C++ run-time systems dynamically allocate and deallocate arrays whose elements are class objects with constructors and destructors.1 I also showed how you can implement similar behavior in C by writing your own array allocation and deallocation functions.

More recently, I presented a technique you can use to ensure that the elements in dynamically allocated arrays are properly aligned.2 Unfortunately, under some conditions, that technique wastes a little storage. This month, I'll show you a different, arguably better, approach to solving such alignment problems, which avoids that waste.

What I'm about to show you is not just about eliminating a few bytes of unnecessary padding. It's been my observation that too many embedded systems developers lay out varying length structures as arrays of (unsigned) characters and compute offsets using ad hoc pointer arithmetic and casts. The resulting code is usually very brittle–hard to get right and easy to break. I think you're much better off using structures that lay out your data as accurately as possible, and let the compiler do the alignment and padding. The resulting code will be easier to read and maintain.

Where we were
As before, I'll illustrate the problem by using arrays of widgets. In C++, you use an array new-expression such as:

pw = new widget [n];   

to allocate an array with n properly initialized widget s. In C, you can implement an allocation function named new_widget_ array and call it in an expression such as:

pw = new_widget_array(n);   

In C++, you use an array delete-expression such as:

delete [] pw;   

to deallocate an array previously allocated by an array new-expression. In C, you can implement a deallocation function named delete_widget_array and call it in an expression such as:

delete_widget_array(pw);   

Each call to delete_widget_ array passes only one argument: the base address of the allocated array. However, the function needs to know the array dimension as well. One way to solve this problem is for new_widget_array to stash the array dimension in a location just before each array that it allocates. Then delete_widget_array can use a little pointer arithmetic to locate and extract the array dimension residing just before the array to be deallocated.

The new_widget_array function uses the Standard C malloc function to allocate the dynamic storage. Each call to malloc must return a pointer that's aligned to the most strictly aligned object that could be placed in the allocated storage. If you allocate storage only for an array of widgets, the storage for the array elements will be properly aligned. However, as I explained in an earlier column, if you allocate storage for an array plus an additional integer to hold the array dimension, and you place that integer before rather than after the array, the array elements could be misaligned.

I suggested solving this problem by using a union defined as:

typedef union widget_array_header widget_array_header;union widget_array_header    {    size_t size;    max_align_t unused;    };   

where max_align_t is a type that's at least as strictly aligned as every other type.2 Instead of allocating storage for a size_t plus an array of widgets, the new_widget_array function should allocate storage for a widget_array_header plus the array, as shown in Figure 1 .


Click on image to enlarge.

The size of the padding after the size_t member of the widget_array_header is the difference, if any, between sizeof(size_t) and sizeof(max_align_t) .

Using the widget_array_header , the new_widget_ array function looks like the code in Listing 1 .


Click on image to enlarge.

This approach avoids alignment errors. However, as I mentioned earlier, it might waste a little storage. In particular, if the alignment of widget is less than the alignment of max_align_t , the padding between the size_t object and the first widget array element in Figure 1 will be larger than necessary.

For example, suppose this code is compiled for a platform in which a size_t is a four-byte object with an alignment of four and a max_align_t is an eight-byte object with an alignment of eight. Then the size and alignment of the widget_array_header will be eight as well, and the first widget array element will have an offset at least eight bytes from the beginning of the allocated storage. However, if widget has an alignment of only four, forcing the first widget to an eight-byte boundary is overkill. It wastes four bytes.

Structures with a “variable length array” member
The simplest, most direct way to lay out properly aligned storage for a widget array preceded by a single size_t object is to dispense with the widget_array_header and use instead a structure that maps the entire storage allocation. The GNU C compiler offers a very nice way to define this structure, as:

typedef struct widget_array widget_array;struct widget_array    {    size_t size;    widget array[n];    };   

Unfortunately, as I'll explain in detail shortly, it's nonstandard. Nonetheless, it lays out the storage very nicely, so it's worth exploring for a moment.

Using a structure such as this, you can reasonably expect that the compiler will insert padding between the size and array members only as needed to align the first array member. If widget has the same (or less strict) alignment than size_t , the compiler need not insert any padding at all.

The dimension, n , of the structure's array member corresponds to the number of elements in the array allocation request. That is, the n appearing in the structure definition should have the same value as the n passed to a call of the form:

pw = new_widget_array(n);   

The number of elements requested can vary with each array allocation request. The best way I know to guarantee that the array dimension of the structure member matches the number of elements in each different request is to declare the structure local to the array allocation function, as in Listing 2 .


Click on image to enlarge.

The beauty of this approach is that it simplifies the computation of the size of the allocated storage from:

sizeof(widget_array_header) + n * sizeof(widget)   

to just:

sizeof(widget_array)   

It also eliminates the cast used to compute the base address of the widget array.

As I mentioned earlier, the problem with this implementation is that it's neither Standard C nor Standard C++. Of the half-dozen or so compilers I have installed on my machine, only the GNU C compiler can compile and execute the code above. (I tested the code using the gcc 3.4.5 and 4.4.0 as distributed for MinGW.)

Both C89 and Standard C++ insist that the dimension in an array declaration must be a constant expression, which the n in the member declaration:

        widget array[n];   

is not.3, 4 C99 lets you declare variable length arrays (VLAs) with non-constant dimensions, as in:

widget *new_widget_array(size_t n)    {    widget array[n];    ~~~    };   

but only if the declaration names an ordinary identifier (not a structure or union member) at function prototype scope or block scope.5

Structures with a flexible array member
In C99, you can use a structure whose last member is an array with an unspecified dimension, as in:

typedef struct widget_array_header widget_array_header;struct widget_array_header    {    size_t size;    widget array[];    };   

Such a structure member is called a flexible array member .

The size of a structure with a flexible array member is the size of everything in the structure, including any padding, up to but not including the flexible array member. For example, if the size of a size_t is four and the alignment of a widget is also four, then the compiler will likely insert no padding after the size member of widget_ array_header , and sizeof(widget_array_header) will be just four. This eliminates the excess padding you might get using a widget_array_header that's a union containing a max_align_t .

Using a structure with a C99 flexible array member, the new_widget_array function looks much as it did before, as shown in Listing 3 .


Click on image to enlarge.

The computation of the size of the allocated storage is back to its more complicated form:

sizeof(widget_array_header) + n * sizeof(widget)   

but the function doesn't need a cast to compute the base address of the widget array.

The disadvantage of this approach is that only C99 supports it. You can't use it with C++ or with older dialects of C.

Structures with a single element array member
In C++ and C89, you can use a structure whose last member is an array with a single element, as in:

typedef struct widget_array_header widget_array_header;struct widget_array_header    {    size_t size;    widget array[1];    };   

You can use this structure almost as if it had a flexible array member. You just need to compensate for the size of the extra widget in widget_array_header when computing the size of the allocated storage. When using a flexible array member, the formula is:

sizeof(widget_array_header) + n * sizeof(widget)   

When using a single element array member, the formula is:

sizeof(widget_array_header) + (n – 1) * sizeof(widget)   

Aside from this, the implementation of the new_widget_array function is the same as it was when using a flexible array member.

A single object of type widget has the same size and alignment as an array with one widget element. Thus, declaring the trailing member of the structure as a single widget object rather than as a single-element widget array, as in:

struct widget_array_header    {    size_t size;    widget array; // not array[1]    };   

doesn't alter the layout of the structure. Nonetheless, a single-element array is a better surrogate for an n-element array than is a single object. For example, in the body of the new_widget_array function, the statement:

pw = pwah->array;   

is the same whether array is a C99 flexible array member or a single-element array. However, if you declare the array member as a single object, then you must rewrite that statement using an explicit & (address-of) operator, as in:

pw = &pwah->array;   

The bottom line
I generally recommend writing declarations that model the data in your program as accurately as possible. I find this leads to code that's easier to get right in the first place, and easier to read and maintain down the road. I usually find code with accurate declarations easier to port as well. Unfortunately, on some occasions such as the example just discussed, the most accurate declarations you can write use non-standard extensions, which can hamper portability.

Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback: e-mail him at . For more information about Dan .

Endnotes:
1. Saks, Dan. “Allocating and deallocating arrays, in greater detail,” Embedded Systems Design , September, 2009, p. 11. www.embedded.com/219500398

2. Saks, Dan. “Computing properly aligned pointer values,” Embedded Systems Design , November, 2009, p. 10. www.embedded.com/220900317

3. ISO/IEC Standard 9899:1990, Programming languages–C.

4. ISO/IEC Standard 14882:2003(E), Programming languages–C++.

5. ISO/IEC Standard 9899:1999(E), Programming languages–C

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.