Symbolic Constant Expressions

February 01, 2002

By Dan Saks

While symbolic constants will help your code, you can overuse them. Symbolic constant expressions can be just as useful, but without the clutter.

For the past three months, I've been writing about symbolic constants in C and C++. I'd like to wrap up the discussion by making a few observations and recommendations about programming style regarding symbolic constants and constant expressions.

I began my series on symbolic constants by observing that one of the first style guidelines that most programmers learn is to use symbols, rather than raw numbers, to represent arbitrary constant values ("Symbolic Constants," November 2001, p. 55). Rather than write:

char buffer[256];
...
fgets(buffer, 256, stdin);

the prevailing wisdom is that you should define a symbol, say buffer_size, representing the number of characters in the buffer, and use the symbol instead of the literal, as in:

char buffer[buffer_size];
...
fgets(buffer, buffer_size, stdin);

Most C programmers would probably define the symbol as a macro:

#define buffer_size 256

and they'd probably write the name entirely in uppercase. On the other hand, most C++ programmers would define it as a constant object:

const int buffer_size = 256;

Whether I'm writing in C or C++, my preference would be to define it as an enumeration constant:

enum { buffer_size = 256 };

I think many programmers would readily agree that which form you choose matters less than that you choose one. I share that sentiment, but not entirely. I believe you can write code that's every bit as good, and maybe even better, without defining buffer_size at all. Rather, you can use a symbolic constant expression composed of previously defined or built-in symbols.

In the case of my buffered input example, rather than write:

enum { buffer_size = 256 };
char buffer[buffer_size];
...
fgets(buffer, buffer_size, stdin);

I would write:

char buffer[256];
...
fgets(buffer, sizeof(buffer), stdin);

The constant expression sizeof(buffer) is every bit as clear as the symbol buffer_size. Granted, the expression sizeof(buffer) is not quite as concise as the name buffer_size, but using the sizeof expression reduces the total number of names declared in the program. If you can get away with fewer declarations in the program without losing any information, this is almost always a good thing to do.
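As a minimal sketch of this style, the function below reads one line into a local buffer and passes sizeof(buffer) to fgets, so the literal 256 appears exactly once. The function name line_length and its return convention are assumptions made for illustration; they are not part of the original example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: reads one line from `in` and returns the number
 * of characters read (including the newline, if any), or 0 at end of
 * file or on a read error.
 */
size_t line_length(FILE *in)
{
    char buffer[256];

    /* sizeof(buffer) tracks the declaration above automatically. */
    if (fgets(buffer, sizeof(buffer), in) == NULL)
        return 0;
    return strlen(buffer);
}
```

If the buffer later needs to grow, only the declaration changes; the call to fgets follows along without a separate symbol to keep in sync.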

Unfortunately, too many programmers forget, or never realize, that defining symbolic constants is not an end in and of itself. It's just one way to improve the code's readability and maintainability. As with many other style issues, it's helpful to step back once in a while and ask yourself why it is that you program the way you do. In this case, let's reflect on why using symbolic constants is often a good idea.

Why use symbolic constants?

Obviously, a well-named symbol (emphasis on well-named) adds clarity to the program's source code by conveying the meaning of a constant. For example, whereas:

for (i = 0; i < 256; ++i)
  // ...

doesn't tell you much about what 256 has to do with the algorithm:

for (i = 0; i < buffer_size; ++i)
  // ...

tells you that the algorithm iterates once for each element in a buffer.

Using symbolic constants also simplifies maintenance and promotes portability by providing a single change point. Although some constants represent hardware characteristics that never change, many constants represent arbitrary implementation decisions, such as buffer sizes, that are subject to change. If you use a symbol to represent a constant value, you can change the value throughout the program by changing the symbol definition (the single change point) and rebuilding the program.

Like many tasks worth doing well, making up meaningful names can be hard work. Sometimes, there just isn't a better name for a constant than a literal. For example, defining a name such as:

#define ONE 1

probably does more harm than good. It suggests that ONE might someday have a value other than 1.

Often, you can obtain the desired clarity and maintainability by using a symbolic expression composed of existing symbols rather than by defining a new symbol. For example, using an expression such as sizeof(buffer) is just as readable and maintainable as using a symbolic constant such as buffer_size, if not more so. However, I've known programmers to shy away from using the sizeof operator for fear that it compiles as a function call which executes at run time. It doesn't. Both C and C++ evaluate sizeof expressions at compile time. (The one exception is a C99 variable-length array, whose size is computed at run time.)
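One way to convince yourself that sizeof is evaluated at compile time is to use it where the language demands a constant expression. The sketch below does so in an enumeration constant and in an array dimension; the names buffer_bytes and copy are assumptions introduced only for this illustration.

```c
#include <stddef.h>

static char buffer[256];

/* An enumerator's value must be an integer constant expression. */
enum { buffer_bytes = sizeof(buffer) };

/* So must the dimension of an array declared at file scope. */
static char copy[sizeof(buffer)];
```

If sizeof(buffer) were computed at run time, neither declaration would compile.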

Useful sizeof expressions

For a character array, ca, the expression sizeof(ca) is the array dimension (the number of elements in the array) as well as the size of the array (in bytes). For arrays whose elements are not characters, sizeof still yields the size, but not the dimension. You can obtain the array dimension by dividing the size of the array by the size of one of its elements.
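The difference between size and dimension can be sketched with two arrays of the same dimension but different element types. The array da and the enumerator names are assumptions chosen for this illustration.

```c
#include <stddef.h>

static char ca[16];
static double da[16];

enum {
    ca_bytes     = sizeof(ca),                 /* size and dimension agree */
    da_bytes     = sizeof(da),                 /* size in bytes only       */
    da_dimension = sizeof(da) / sizeof(da[0])  /* number of elements       */
};
```

Here ca_bytes and da_dimension are both 16, while da_bytes is 16 times sizeof(double), whatever that happens to be on the target.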

For example, given an array of pointers declared as:

handler *device[16];

where handler is some previously declared type, the first for loop in Listing 1 iterates once for each element in the device array. Unfortunately, this formula is rather cumbersome and, obviously, less readable than a single symbol.

Listing 1

for (i = 0; i < sizeof(device)/sizeof(device[0]); ++i)  
  // ...

for (i = 0; i < dimension_of(device); ++i)
  // ...

You can clean things up by capturing the general formula for array dimension in a macro, such as:

#define dimension_of(a) \
  (sizeof(a)/sizeof((a)[0]))

Using this macro, the loop looks like the second one in Listing 1, which is quite clear. Using the dimension_of macro often eliminates the need to define symbols representing arbitrary array bounds.
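Here is a self-contained sketch that puts the macro to work. The article leaves handler unspecified, so this example assumes it is a function type taking no arguments; null_handler and install_default_handlers are hypothetical names invented for the illustration.

```c
#include <stddef.h>

#define dimension_of(a) (sizeof(a)/sizeof((a)[0]))

/* Assumption: handler is a function type with no arguments. */
typedef void handler(void);

/* Hypothetical do-nothing handler used to fill empty slots. */
static void null_handler(void)
{
}

handler *device[16];

/* Fills every empty slot and returns how many slots it filled. */
size_t install_default_handlers(void)
{
    size_t i;
    size_t installed = 0;

    for (i = 0; i < dimension_of(device); ++i) {
        if (device[i] == NULL) {
            device[i] = null_handler;
            ++installed;
        }
    }
    return installed;
}
```

One caution: the macro works only on an actual array. Applied to a pointer, including a function parameter declared with array syntax, sizeof(a) yields the size of the pointer and the result is meaningless.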

You might prefer to write the loop above using a pointer as the loop control variable. In that case, the loop might look like the first snippet in Listing 2.

Listing 2

for (p = device; p < device + dimension_of(device); ++p)
  // ...

for (p = device; p < beyond(device); ++p)
  // ...

The expression device + dimension_of(device) represents the address of the first byte beyond the end of the device array. The following macro expresses this value more succinctly:

#define beyond(a) \
  ((a) + dimension_of(a))

Then you can write the loop as shown in the second snippet of Listing 2. Again, the compiler can evaluate beyond(device) at compile time to a single pointer value, so the macro call costs nothing.
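The pointer-based loop can be sketched in full as follows. The array sample and the function sum_samples are assumptions made up for this illustration; only the two macros come from the text above.

```c
#include <stddef.h>

#define dimension_of(a) (sizeof(a)/sizeof((a)[0]))
#define beyond(a) ((a) + dimension_of(a))

/* Hypothetical data; the dimension is deduced from the initializer. */
static int sample[] = { 1, 2, 3, 4 };

/* Adds up every element, walking the array with a pointer. */
int sum_samples(void)
{
    int total = 0;
    const int *p;

    for (p = sample; p < beyond(sample); ++p)
        total += *p;
    return total;
}
```

Because the array's dimension comes from its initializer, adding a fifth value to sample requires no other change: both the loop bound and the dimension follow automatically.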

The preprocessor is indeed prone to abuse, and I certainly don't think you should go hog-wild writing macros. But a few well-designed macros, such as dimension_of and beyond, can make your code more readable while reducing the need for many other global symbols as well.

Defining symbols should not be a religion. Your objective should be to clarify and simplify your programs. Unnecessary or inconsistently named symbols can hurt that goal. Using symbolic expressions can help.

Dan Saks is the president of Saks & Associates, a C/C++ training and consulting company. You can write to him at dsaks@wittenberg.edu.
