Symbolic Constants

Dan Saks, November 01, 2001

There's more than one way to define symbolic constants in C and C++. It helps to know what all of your choices are.

One of the first style guidelines that most programmers learn is to use symbols, rather than raw numbers, to represent arbitrary constant values. For example, rather than write:

char buffer[256];
...
fgets(buffer, 256, stdin);

you should define a symbol, say buffer_size, representing the number of characters in the buffer, and use the symbol instead of the literal, as in:

char buffer[buffer_size];
...
fgets(buffer, buffer_size, stdin);

C and C++ offer a number of different ways to define such symbols. This month, I'll show you what your choices are.

Macros

C programmers typically define symbolic constants as macros. For example, the code:

#define buffer_size 256

defines buffer_size as a macro whose value is 256. The macro preprocessor is a distinct compilation phase. The preprocessor substitutes macros before the compiler does any other symbol processing. For example, given the macro definition just above, the preprocessor transforms:

char buffer[buffer_size];
...
fgets(buffer, buffer_size, stdin);

into:

char buffer[256];
...
fgets(buffer, 256, stdin);

Later compilation phases never see macro symbols such as buffer_size; they see only the source text after macro substitution. Therein lies the source of a minor irritation that comes with using macros: many compilers don't preserve macro names among the symbols they pass on to their debuggers.

Macros have an even more serious problem: macro names don't observe the scope rules that apply to other names. For example, you can't restrict a macro to a local scope:

void foo()
{
#define max 16 // non-local
int a[max];
...
}

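Here, max is not local to function foo. It's effectively global: it stays defined from the #define to the end of the translation unit, or until an #undef removes it. Nor can you declare a macro as a member of a C++ class or namespace.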
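As a minimal sketch of this leak, the following C++ fragment defines max inside foo and then uses it in a second function. The function elements_in_bar is a hypothetical name added for illustration; it is not part of the original example.

```cpp
#include <cstddef>

void foo()
{
#define max 16              // looks local, but the preprocessor ignores braces
    int a[max] = { 0 };
    (void)a;                // silence "unused variable" warnings
}

// foo's body has ended, yet max is still defined here.
std::size_t elements_in_bar()
{
    int b[max];             // expands to: int b[16];
    return sizeof(b) / sizeof(b[0]);
}

#undef max                  // only #undef ends a macro's visibility
```

The second function compiles only because the macro outlives the braces that appear to enclose it.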

In a sense, macro names are more pervasive (read "worse") than global names. Global names can be hidden by names in inner scopes. Macros don't even respect inner scopes. Consequently, macros might substitute in places you don't want them to. For example, after macro substitution:

#define max 16
...
void sort(int a[], size_t max);

becomes:

void sort(int a[], size_t 16);

which is a syntax error. Unfortunately, such inadvertent macro substitution doesn't always produce a compiler diagnostic, and even when it does, the message may be puzzling.
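For contrast, here is a sketch of the hiding that ordinary names permit. It assumes max is declared as a global const object in C++ rather than as a macro; the functions largest and global_max are hypothetical names chosen for illustration.

```cpp
#include <cstddef>

int const max = 16;     // an ordinary global name, not a macro

// The parameter max hides the global max inside this function, so the
// declaration compiles. With "#define max 16" in effect instead, the
// preprocessor would rewrite the parameter name as 16 and break it.
int largest(int const a[], std::size_t max)
{
    int result = a[0];
    for (std::size_t i = 1; i < max; ++i)
        if (a[i] > result)
            result = a[i];
    return result;
}

// Outside largest, max refers to the global constant again.
std::size_t global_max()
{
    return max;
}
```

Inside largest, max means the parameter; everywhere else it means the global constant. A macro offers no such distinction.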

Since macro names don't behave like other names, most C and C++ programmers adopt a naming convention for macros to distinguish them from all other kinds of names. The most common convention is to spell macro names entirely in uppercase, as in:

#define BUFFER_SIZE 256
char buffer[BUFFER_SIZE];
...
fgets(buffer, BUFFER_SIZE, stdin);

Enumeration constants

Both C and C++ offer alternatives that avoid the ill effects of macros. One of these alternatives is the use of enumeration constants.

An enumerated type definition can define a type along with associated constant values of that type. For example:

enum color { red, green, blue };

defines an enumeration type color and constants red, green, and blue of type color. By default, red has the value 0, green has the value 1, and blue has the value 2. However, you can define the constants with values other than their defaults, as in:

enum color
{ red = 1, green = 2, blue = 4 };

Most parts of an enumeration definition are optional, including the type name. For example:

enum { blue = 4 };

omits the type name and all but one enumeration constant. It simply defines a constant named blue whose value is 4.

You can use this simplified form of enumeration definition to define any integer-valued constant, such as:

enum { buffer_size = 256 };

This defines buffer_size as the integer constant 256. An enumeration constant is a compile-time constant, so you can use it as an array dimension, as in:

char buffer[buffer_size];

Unlike macros, enumeration constants do obey the usual scope rules. This means that you can declare enumeration constants local to functions, or, in C++, as members of classes or namespaces.
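The following C++ sketch shows three constants, all named buffer_size, coexisting because each lives in its own scope. The namespace io, the class line_reader, and the function local_buffer_bytes are hypothetical names used only for illustration.

```cpp
#include <cstddef>

namespace io
{
    enum { buffer_size = 256 };     // member of a namespace
}

class line_reader
{
public:
    enum { buffer_size = 128 };     // member of a class
    char buffer[buffer_size];       // uses line_reader::buffer_size
};

std::size_t local_buffer_bytes()
{
    enum { buffer_size = 64 };      // local to this function
    char buffer[buffer_size];
    return sizeof(buffer);
}
```

Each use of buffer_size resolves to the nearest enclosing declaration, which is exactly what a macro of the same name would prevent.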

Unfortunately, enumeration constants must have integer values, so you can't use them for floating constants, as in:

// not an integer constant expression
enum { pi = 3.14159 };

A standard-conforming compiler rejects this definition. A compiler that accepts it anyway truncates the value to 3, typically with a warning.

const objects

Both C and C++ offer yet another way to define a symbolic constant, namely as a const object, such as:

int const buffer_size = 256;

The order in which you write int and const doesn't matter to the compiler. You can just as well write the declaration as:

const int buffer_size = 256;

For reasons I've explained in the past, I prefer writing const to the right of the type, as in int const. (See "const T vs. T const," February 1999, p. 13.)

Unfortunately, the above definition (whether you write it one way or the other) has a different meaning in C than it does in C++. In C++, the name of a const object is a compile-time constant expression. In C, it is not. Thus, a C++ program can use buffer_size as an array dimension, while a C program cannot. For instance, the following definition:

int const buffer_size = 256;

compiles equally well in either C or C++, but the subsequent definition:

char buffer[buffer_size]; // ??

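compiles only in C++. In C, it produces a compile error at file scope, where an array dimension must be a constant expression. (Inside a function, C99 accepts it only by treating buffer as a variable-length array.)

It's my impression that most C++ programmers prefer defining symbolic constants as const objects rather than as enumeration constants. If nothing else, a const object definition looks more like what it is. For example, when you write: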

int const buffer_size = 256;

the definition says fairly explicitly that buffer_size is an "integer constant" whose value is 256. It's not nearly so clear that:

enum { buffer_size = 256 };

is essentially the same thing.
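To make the equivalence concrete in C++, the sketch below declares one array dimensioned by a const object and another dimensioned by an enumeration constant. The namespaces with_const and with_enum and the function same_size are hypothetical, introduced only so that both definitions of buffer_size can appear in one translation unit.

```cpp
#include <cstddef>

namespace with_const
{
    int const buffer_size = 256;    // a constant expression in C++, not in C
    char buffer[buffer_size];
}

namespace with_enum
{
    enum { buffer_size = 256 };     // a constant expression in both C and C++
    char buffer[buffer_size];
}

// Both arrays end up with the same number of elements.
bool same_size()
{
    return sizeof(with_const::buffer) == sizeof(with_enum::buffer);
}
```

Only the enumeration form would also serve as a file-scope array dimension if the same declarations were compiled as C.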

Another, more substantive, advantage is that a constant object definition lets you specify the exact type of the constant. For example:

unsigned int const buffer_size = 256;

defines buffer_size as a constant whose type is unsigned int rather than plain int (which is signed by default). In contrast:

enum { buffer_size = 256 };

defines buffer_size as a plain int. It's a plain int even if you specify the constant's value using an unsigned literal, as in:

enum { buffer_size = 256u };

As I explained last year, a numeric literal with the suffix u or U has an unsigned integer type. (See "Numeric Literals," September 2000, p. 113.) In this case, 256u has type unsigned int. However, regardless of the exact type used to specify the enumeration constant's value, the enumeration constant has type int if the value can be represented as an int.

Most of the time, the exact type of a symbolic integer constant doesn't matter. For example, whether you define buffer_size as:

unsigned int const buffer_size = 256;

or as:

enum { buffer_size = 256 };

an array declared in C++ as:

char buffer[buffer_size];

has 256 elements. The only time it matters is on those rare occasions when you pass the constant as an argument to one member of a family of overloaded functions. For example, given:

int f(int i);
unsigned int f(unsigned int ui);

the way in which you define buffer_size affects which function f(buffer_size) calls. If buffer_size is an enumeration constant, f(buffer_size) calls f(int). If buffer_size is an unsigned int constant, it calls f(unsigned int).
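A small C++ sketch can confirm which overload is chosen. The return values 1 and 2 are arbitrary markers, and the names enum_buffer_size, enum_unsigned_literal, const_buffer_size, and the chosen_for_* functions are hypothetical; they exist only so that all three constants can be defined side by side.

```cpp
// The overloaded pair from the text; each returns a marker value
// identifying which overload ran.
int f(int) { return 1; }
unsigned int f(unsigned int) { return 2u; }

enum { enum_buffer_size = 256 };            // enumeration constant
enum { enum_unsigned_literal = 256u };      // unsigned literal, still promotes to int
unsigned int const const_buffer_size = 256; // const object of type unsigned int

// An enumeration constant whose value fits in an int selects f(int).
int chosen_for_enum() { return f(enum_buffer_size); }
int chosen_for_enum_unsigned_literal() { return f(enum_unsigned_literal); }

// The unsigned int const object is an exact match for f(unsigned int).
unsigned int chosen_for_const() { return f(const_buffer_size); }
```

The two enumeration constants select f(int) regardless of the literal's suffix, while the const object selects f(unsigned int).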

But I like enumeration constants

Despite the disadvantages that I just mentioned, I generally prefer defining symbolic constants as enumeration constants rather than as const objects. The problem with const objects is that they may incur a performance penalty, which enumeration constants avoid. That should keep you on the edge of your seat until next time. See you then.

Dan Saks is the president of Saks & Associates, a C/C++ training and consulting company. He is also a consulting editor for the C/C++ Users Journal. You can write to him at dsaks@wittenberg.edu.
