Insights into member initialization

Often when it seems that C++ is generating bigger and slower code than C, it may be that C++ is actually just distributing generated code differently.

Among the most common reasons that C programmers offer to explain why they're disinclined to use C++ is that C++ does too much behind the scenes. A closely related complaint is that C++ compilers generate too much code for seemingly simple expressions. If you look online at the reader comments on my columns over the last few years [1], you'll see remarks to that effect now and then.

Most of these complaints don't hold up well under scrutiny. Often, the alleged excess code simply isn't there. For example, function overloading and friendship are strictly translation-time facilities. They don't incur any run-time costs.

At other times, excess code appears only when targeting some processors and not others. For example, some processors are better than others at calling virtual functions. Even then, the code for calling a virtual function in C++ is usually about the same as calling a function through a pointer in C.
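Here's a minimal sketch of that comparison. The names (shape, triangle, c_shape, and so on) are hypothetical and chosen only for illustration, and the C-style half is written in the subset of C that also compiles as C++. On typical implementations, both calls compile to an indirect call through a pointer; the exact code depends on the compiler and target.

```cpp
// C++ version: the compiler typically finds sides() through a hidden
// per-class table of function pointers.
class shape {
public:
    virtual ~shape() {}
    virtual int sides() const = 0;
};

class triangle : public shape {
public:
    virtual int sides() const { return 3; }
};

int count_sides(shape const &s) {
    return s.sides();        // indirect call chosen at run time
}

// C-style version: the structure carries the function pointer explicitly.
struct c_shape {
    int (*sides)(struct c_shape const *);
};

int c_triangle_sides(struct c_shape const *) {
    return 3;
}

int c_count_sides(struct c_shape const *s) {
    return s->sides(s);      // indirect call through a pointer member
}
```

Either way, the caller loads a function address from memory and jumps to it. The C++ version usually adds one more load to reach the table.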

Even when the complaints have merit, C++ isn't necessarily generating bigger and slower programs than C. It may just be distributing the generated code differently, generating more code in some places and less in others. I believe that once you understand why C++ does what it does, the resulting code not only ceases to be surprising, but even becomes predictable. Such is the case with constructors.

A constructor is a special class member function that provides guaranteed initialization for objects of its class type. Since the beginning of the year, I've been explaining what constructors are in C++ and what kind of code they generate [2, 3]. This month, I'll continue by explaining the interesting behavior of constructors for classes with members that have constructors of their own. As I often do, I'll illustrate the behavior using equivalent C code.

Class objects as members

Just as a C structure can have members that are themselves structure objects, a C++ class can have members that are themselves class objects. For example, let's look at a class for entries in some kind of symbol table, where each entry stores a name and some associated information.

To keep this simple, let's just say an entry has a name, an id, and a value. The name is the textual spelling of the entry's name. The id is an unsigned integer value that uniquely identifies each entry. The value is a sequence of one or more signed integer values associated with the name. The entry class definition looks in part like:

    class entry
        {
        ~~~
    private:
        string name;
        unsigned id;
        sequence value;
        };

Here, string is a class representing a variable-length string of characters. It might be the string class from the Standard C++ Library, or it might be a class custom built for this application. The sequence class represents a sequence of signed integer values. It might be a typedef name that's an alias for a Standard Library class template instantiation, such as:

    typedef vector<int> sequence;

Then again, it might be a custom built class.

Now let's examine the behavior of various constructors for this entry class.

Generated default constructors
As I explained in my first article on constructors, a definition for a class object can specify a constructor argument list, as in:

entry e (n, v);   

This defines e as an entry object. In this case, the compiler generates code that initializes e by calling a constructor that accepts n and v as arguments. If the entry class declares no such constructor, the compiler will blurt out nasty things.

In limited cases, the compiler may generate a constructor. For example, a definition for an object with no argument list, as in:

entry e;   

invokes a particular constructor called the default constructor. The default constructor is special in that the compiler may generate it, but only if the class has no explicitly declared constructors at all.

If the compiler generates a default constructor for class entry, that default constructor calls the default constructor for each member of class type. In this case, the default string constructor would be called for member name, and the default sequence constructor would be called for member value. A C function that performs the same initialization as the generated default entry constructor might look like:

    void construct_entry(entry *_this)
        {
        string_construct(&_this->name);
        sequence_construct(&_this->value);
        }

This function doesn't initialize the entry's id member, which has a non-class type and thus can't have a constructor. Generated default constructors leave such members uninitialized.
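One way to watch the generated default constructor at work is to give the members constructors that leave a trace. The sketch below assumes two hypothetical stand-in classes, string_like and sequence_like, in place of the real string and sequence. Their default constructors do nothing except record that they ran. The log and the helper function are also illustrative only.

```cpp
#include <string>
#include <vector>

// Records the order in which member constructors run.
std::vector<std::string> construction_log;

// Hypothetical stand-ins for the string and sequence classes.
class string_like {
public:
    string_like() { construction_log.push_back("string"); }
};

class sequence_like {
public:
    sequence_like() { construction_log.push_back("sequence"); }
};

// entry declares no constructors, so the compiler generates the default one.
class entry {
private:
    string_like name;
    unsigned id;          // left uninitialized by the generated constructor
    sequence_like value;
};

// Defines an entry with no argument list and returns what was logged.
std::vector<std::string> log_after_default_entry() {
    construction_log.clear();
    entry e;              // invokes the generated default constructor
    (void)e;              // e is never read; id has an indeterminate value
    return construction_log;
}
```

Calling log_after_default_entry() yields two log entries, "string" followed by "sequence", matching the order in which the members are declared. Nothing is logged for id because nothing initializes it.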

Most compilers don't generate code for a default constructor unless the program actually uses that constructor. Calls to a generated default constructor may be expanded inline.

User-defined default constructors
The generated default constructor doesn't construct entry objects properly because it doesn't initialize the id member. Uninitialized objects have indeterminate values.

Each entry should have a unique id. An easy way to implement unique ids is to obtain them from a counter that increments at each constructor call. In C++, that counter can and probably should be a private static data member, declared as:

    class entry
        {
        ~~~
    private:
        static unsigned counter;
        string name;
        unsigned id;
        sequence value;
        };

In C, the counter might be a global object or a local static object.

In C++, a default constructor that provides an appropriate id value might look like:

    entry::entry()
        {
        id = ++counter;
        }

On the surface, it looks like this constructor doesn't initialize the name and value members, but it actually does. It applies a default constructor to each member, just as a generated default constructor would. That's why they're called “default” constructors—they're the ones the program calls by default. A C function that performs the same initialization as the default entry constructor defined just above might look like:

    void construct_entry(entry *_this)
        {
        string_construct(&_this->name);
        sequence_construct(&_this->value);
        _this->id = ++counter;
        }

This user-defined default constructor still might not be very useful. The default constructors for string and sequence probably create empty objects. If so, the default constructor for entry produces an object with no name and no value. You might not want such objects floating around in the application.
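Here's a compilable sketch of that class, assuming string is std::string and sequence is a typedef for std::vector<int>. The three accessor functions are hypothetical additions that exist only so the resulting state can be inspected.

```cpp
#include <string>
#include <vector>

typedef std::vector<int> sequence;   // assumed definition of sequence

class entry {
public:
    entry();
    // Hypothetical accessors, added only for inspection.
    std::string const &get_name() const { return name; }
    unsigned get_id() const { return id; }
    sequence const &get_value() const { return value; }
private:
    static unsigned counter;
    std::string name;
    unsigned id;
    sequence value;
};

unsigned entry::counter = 0;

// name and value are default-constructed before the body runs,
// so only id needs explicit attention here.
entry::entry() {
    id = ++counter;
}
```

Two entry objects defined one after the other get consecutive ids, but both have an empty name and an empty value.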

Non-default constructors

If you want to ensure that every entry has a non-empty name and value, then you can define an entry constructor that requires arguments for the name and value. You might declare that constructor as:

    class entry
        {
    public:
        entry(string const &n, int v);
        ~~~
        };

The corresponding constructor definition might look like:

    entry::entry(string const &n, int v)
        {
        name = n;
        value.push_back(v);
        id = ++counter;
        }

The first statement in the constructor body assigns parameter n to entry member name using an assignment operator defined in the string class. (It actually uses a particular assignment operator known as the copy assignment. It's in my queue of things to discuss eventually. I'm also aware that the argument passed for parameter n could be an empty string, so this constructor doesn't ensure that the name will be non-empty. That's curable, but I don't want to get sidetracked on that now.)

The second statement appends the value of parameter v to the end of the sequence stored in entry member value. The Standard C++ Library containers use the name push_back for this operation, so I do, too.

Strictly speaking, the sequence's push_back is not an initialization. It modifies the value of a previously constructed sequence object. That is, push_back operates on the assumption that sequence already has an initial value. Calling push_back appends one more value to whatever's already there.

Similarly, the string's assignment operator is not an initialization. It replaces the value of a previously constructed string. It will likely fail if the string isn't already initialized.

Remember, entry's members name and value have class types. Those classes have constructors. Constructors provide guaranteed initialization, meaning that each object that has a type with a constructor must be initialized by calling one of those constructors before any operations may be performed. This is true for objects even when they're members of other objects.

C++ preserves the guarantee by inserting default constructor calls for entry's members into the entry constructor itself. Specifically, the compiler generates a call that applies the default string constructor to entry's member name, and another call that applies the default sequence constructor to member value. A C function that performs the same work as the entry constructor might look like the code in Listing 1:

Listing 1: A C function that performs the same work as the entry constructor.

    void construct_entry_nv(entry *_this, const string *n, int v)
        {
        string_construct(&_this->name);
        sequence_construct(&_this->value);
        string_copy(&_this->name, n);
        sequence_push_back(&_this->value, v);
        _this->id = ++counter;
        }

In effect, this constructor initializes the entry's name member to be empty, only to immediately replace that value with something else. Wouldn't the code be shorter and faster if it simply initialized the name member with a copy of n? Similarly, the entry constructor initializes the value member to be an empty sequence, only to immediately append one value. Wouldn't it be better to just initialize the sequence member to hold a copy of that single value?
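The construct-then-assign sequence is easy to observe in C++ itself. The sketch below assumes a hypothetical string_like class standing in for string. Its default constructor and copy assignment operator each append a line to a trace. The trace and the helper function trace_entry_construction are illustrative only, and sequence is assumed to be std::vector<int>.

```cpp
#include <string>
#include <vector>

// Records which string_like operations run, in order.
std::vector<std::string> trace;

// Hypothetical stand-in for the string class.
class string_like {
public:
    string_like() { trace.push_back("default constructor"); }
    string_like &operator=(string_like const &other) {
        text = other.text;
        trace.push_back("copy assignment");
        return *this;
    }
private:
    std::string text;
};

class entry {
public:
    entry(string_like const &n, int v);
private:
    static unsigned counter;
    string_like name;
    unsigned id;
    std::vector<int> value;
};

unsigned entry::counter = 0;

// Same shape as the constructor in the text: members are assigned in the body.
entry::entry(string_like const &n, int v) {
    name = n;
    value.push_back(v);
    id = ++counter;
}

// Builds one entry and returns the operations applied to its name member.
std::vector<std::string> trace_entry_construction() {
    string_like n;
    trace.clear();        // ignore the construction of n itself
    entry e(n, 42);
    (void)e;
    return trace;
}
```

The returned trace holds "default constructor" followed by "copy assignment": two operations on name where one would do.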

Member initializers

Some C programmers are disinclined to use C++ because they think it does too much behind the scenes. When they see C++ compilers generating code like that in Listing 1, they might feel their complaints are justified. If this were the end of the story, I'd agree. But it's not.

C++ extends constructors with an additional facility called member initializers. Member initializers avoid the inefficiency of unnecessary calls to default constructors by initializing members directly. Member initializers will be the subject of my next column.
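As a brief preview only, here's a rough sketch of what the entry constructor might look like with member initializers. It assumes string is std::string and sequence is a typedef for std::vector<int>. The accessors are hypothetical additions for inspection, and the details are left for the full discussion.

```cpp
#include <string>
#include <vector>

typedef std::vector<int> sequence;   // assumed definition of sequence

class entry {
public:
    entry(std::string const &n, int v);
    // Hypothetical accessors, added only for inspection.
    std::string const &get_name() const { return name; }
    sequence const &get_value() const { return value; }
private:
    static unsigned counter;
    std::string name;
    unsigned id;
    sequence value;
};

unsigned entry::counter = 0;

// Each member is initialized directly, in declaration order (name, id,
// value). name is copy-constructed from n, and value starts out as a
// sequence holding one element equal to v.
entry::entry(std::string const &n, int v)
    : name(n), id(++counter), value(1, v) {
}
```

With this form, no default constructor runs for name or value, so there's no empty object to overwrite.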

Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback.

Endnotes:

  1. Programming Pointers columns are available at www.eetimes.com/electronics-blogs/27/Programming-Pointers
  2. Saks, Dan. “Demystifying constructors,” Embedded Systems Design, January/February 2011, p. 9. www.eetimes.com/4212701
  3. Saks, Dan. “Constructors and object definitions,” Embedded.com, March 2011. www.eetimes.com/4213712
