Enumerations Q & A

April 10, 2003

By Dan Saks

It's mailbag time. In response to reader feedback, Dan takes a moment to clarify some nuances of using enumerations.

A few months ago I wrote about using objects of enumeration types as loop counters.[1] I followed that up with advice on how to define your enumeration types so that they behave well at their boundary values.[2] I had planned to continue exploring variations on this theme, and I still do. However, I received an unusually large volume of feedback about the first column. The responses included some interesting questions and comments that I'd like to share with you.

Enumeration basics
An enumeration definition specifies a type and a corresponding set of named constants. For example:


enum day
  {
  Sunday, Monday, Tuesday,
  Wednesday, Thursday, Friday,
  Saturday
  };
typedef enum day day;

defines a type day with seven constants. The typedef immediately after the enumeration definition elevates the name day from a mere tag to a full-fledged type name. None of these constants has an explicitly specified value, so the first enumeration constant, Sunday, has the value of 0, and each subsequent constant has a value one more than the preceding constant.
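
As a quick check of the numbering rule, here is a minimal sketch, assuming a C++ compiler. The helper value_of is a hypothetical name introduced only for illustration; it is not part of the original column.

```cpp
#include <cassert>

enum day
  {
  Sunday, Monday, Tuesday,
  Wednesday, Thursday, Friday,
  Saturday
  };

// Hypothetical helper: expose the integer value behind a day.
// The conversion from an enumeration to int is implicit.
inline int value_of(day d)
  {
  return d;
  }
```

With this definition, value_of(Sunday) is 0 and each later constant is one more than the constant before it.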

In C, each enumeration type is compatible with char or some signed or unsigned integer type. You can perform arithmetic on enumerations, including ++ and --, and thus write loops that step through the days as:


day d;
for (d = Sunday; d <= Saturday; ++d)
  ...

In C++, each enumeration is a distinct type. The built-in ++ and -- operators do not apply to enumerations, so the loop above won't compile unless you define a prefix ++ operator for type day, as in:


inline
day &operator++(day &d)
  {
  return d = day(d + 1);
  }
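
To see the operator at work, the following sketch puts the enumeration, the operator, and the loop together in C++. The function count_days is a hypothetical name added for illustration; it simply counts how many times the loop body runs.

```cpp
#include <cassert>

enum day
  {
  Sunday, Monday, Tuesday,
  Wednesday, Thursday, Friday,
  Saturday
  };

// User-defined prefix ++ for day, as in the text.
inline
day &operator++(day &d)
  {
  return d = day(d + 1);
  }

// Hypothetical helper: count the iterations of the loop over all days.
int count_days()
  {
  int n = 0;
  for (day d = Sunday; d <= Saturday; ++d)
    ++n;
  return n;
  }
```

The loop ends only when d reaches the value one more than Saturday, which is the boundary behavior discussed later in this column.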

Cast expressions
When I presented the definition for operator++, I glossed over the meaning of the cast expression day(d + 1). One of my readers wrote to ask if I had meant to write (day)(d + 1) instead. In fact, in my earlier columns where I explained how to write an operator++, I wrote the cast as (day)(d + 1) specifically to avoid this question.[3,4] Now that I've slipped up, I should explain why what I wrote is valid.

C has only one cast notation. In C, an expression of the form (T)E yields the value of expression E converted to type T. C++ offers three styles of casts:

(T)E — traditional "C style"
T(E) — "function style"
static_cast<T>(E) — "new style"

T(E) is equivalent to (T)E. static_cast is actually one of four "new style" casts; the others are const_cast, dynamic_cast, and reinterpret_cast. In some future column, I'll explain why C++ has these alternative casts, but for now I'll just say that all of these:

(day)(d + 1)
day(d + 1)
static_cast<day>(d + 1)

are equivalent. They are not necessarily equivalent when casting to something other than an arithmetic or enumeration type.
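
A small sketch, assuming C++, can confirm that the three notations agree for this enumeration. The function casts_agree is a hypothetical name used only for this illustration.

```cpp
#include <cassert>

enum day
  {
  Sunday, Monday, Tuesday,
  Wednesday, Thursday, Friday,
  Saturday
  };

// Hypothetical helper: apply all three cast notations to d + 1
// and report whether they produce the same day.
inline bool casts_agree(day d)
  {
  day a = (day)(d + 1);              // traditional "C style"
  day b = day(d + 1);                // "function style"
  day c = static_cast<day>(d + 1);   // "new style"
  return a == b && b == c;
  }
```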

Leaving room at the ends
Whether you use a built-in or user-defined ++, the loop:


for (d = Sunday; d <= Saturday; ++d)
  ...
  

increments d until its value is one more than Saturday. In general, incrementing an enumeration value beyond its specified range of values produces undefined behavior. If you expect to write such loops, your best bet is to include an enumeration constant that represents "one beyond the end." In my "Enumerations as Counters" column, I suggested defining not_a_day as the value after Saturday.[1] Last month, I suggested defining a value before the beginning as well so that you can use the enumeration to count down, too.[2] I also advised using a matching pair of names for these out-of-range values, as in:


enum day
  {
  day_before = -1,
  Sunday, Monday, Tuesday, 
  Wednesday, Thursday, Friday, 
  Saturday,
  day_after
  };
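
With both out-of-range constants in place, a loop can count in either direction without stepping outside the declared values. The sketch below assumes C++ and adds a prefix -- operator modeled on the ++ operator shown earlier; count_up and count_down are hypothetical names for illustration.

```cpp
#include <cassert>

enum day
  {
  day_before = -1,
  Sunday, Monday, Tuesday,
  Wednesday, Thursday, Friday,
  Saturday,
  day_after
  };

inline day &operator++(day &d)
  {
  return d = day(d + 1);
  }

// Assumed counterpart to ++ for counting down.
inline day &operator--(day &d)
  {
  return d = day(d - 1);
  }

// Counts Sunday up to Saturday; the loop stops at day_after.
int count_up()
  {
  int n = 0;
  for (day d = Sunday; d < day_after; ++d)
    ++n;
  return n;
  }

// Counts Saturday down to Sunday; the loop stops at day_before.
int count_down()
  {
  int n = 0;
  for (day d = Saturday; d > day_before; --d)
    ++n;
  return n;
  }
```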

One reader observed that introducing additional enumeration constants for out-of-range values creates a practical coding problem, illustrated by the following example:


int opening_time(day d)
  {
  switch (d)
    {
    case Monday:
      return closed_on_Mondays();
    case Tuesday:
    case Wednesday:
    case Thursday:
    case Friday:
      return 10;
    case Saturday:
    case Sunday:
      return 12;
    }
  }

When you compile this function, some compilers will issue warnings that the switch statement doesn't handle all possible enumeration values. These warnings are useful and worth heeding. In this case, ignoring the warning leaves you with a program that could have undefined behavior.

The warnings would not be there were it not for the additional constants day_before and day_after. You could eliminate the warnings simply by not defining these additional constants. If you don't think you'll ever use day as the type of a loop counter, then you don't really need these constants. But I think day is a type that's quite natural as a loop counter, so I would choose to keep the constants. You can keep the constants and eliminate the warnings by using a default clause in the switch statement, as in:


switch (d)
  {
  case Monday:
    return closed_on_Mondays();
  case Tuesday:
  case Wednesday:
  case Thursday:
  case Friday:
    return 10;
  case Saturday:
  case Sunday:
    return 12;
  default:
    /* report an error */
    break;
  }

I believe you should use a default clause to catch out-of-range values even if you don't explicitly define out-of-range constants such as day_before and day_after. Remember, the compiler implements each enumeration type as some underlying integer type, so a day object can, through a mishap elsewhere in the program, take on an underlying value of, say, 42. A default clause like the one above is a good way to intercept such errors before they do too much damage.
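
Here is one way the complete function might look, as a sketch in C++. The sentinel values closed and bad_day are hypothetical stand-ins: closed replaces the call to closed_on_Mondays(), whose definition the example above doesn't show, and bad_day is just one possible way to "report an error."

```cpp
#include <cassert>

enum day
  {
  day_before = -1,
  Sunday, Monday, Tuesday,
  Wednesday, Thursday, Friday,
  Saturday,
  day_after
  };

// Hypothetical sentinel return values, chosen only for this sketch.
const int closed = -1;
const int bad_day = -2;

int opening_time(day d)
  {
  switch (d)
    {
    case Monday:
      return closed;
    case Tuesday:
    case Wednesday:
    case Thursday:
    case Friday:
      return 10;
    case Saturday:
    case Sunday:
      return 12;
    default:
      // Catches day_before, day_after, and any other stray value.
      return bad_day;
    }
  }
```

Because every path through the switch now returns a value, the function no longer falls off its end for an out-of-range day.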

Wrapping around
A couple of readers suggested implementing the ++ operator for day so that when day d is Saturday, ++d yields Sunday. You might implement this as:


inline
day &operator++(day &d)
  {
  return d = day((d + 1) % (Saturday + 1));
  }

One benefit of this approach is that it implements the concept that Sunday follows Saturday. On the other hand, it turns:


for (d = Sunday; d <= Saturday; ++d)
  ...

into an infinite loop. I think most programmers would consider this an unwelcome surprise. My preference is to stick with my original implementation of operator++.
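
The sketch below, assuming C++, demonstrates the problem without actually hanging. The helpers next and loop_hits_cap are hypothetical names; loop_hits_cap runs the same loop but gives up after a fixed number of iterations.

```cpp
#include <cassert>

enum day
  {
  Sunday, Monday, Tuesday,
  Wednesday, Thursday, Friday,
  Saturday
  };

// Wrapping ++: the day after Saturday is Sunday.
inline day &operator++(day &d)
  {
  return d = day((d + 1) % (Saturday + 1));
  }

// Hypothetical helper: return the day that follows d.
inline day next(day d)
  {
  return ++d;
  }

// Hypothetical helper: run the loop with a safety cap.
// Returns true if the cap was reached, meaning that the
// condition d <= Saturday never became false.
bool loop_hits_cap(int cap)
  {
  int steps = 0;
  for (day d = Sunday; d <= Saturday; ++d)
    if (++steps >= cap)
      return true;
  return false;
  }
```

With the wrapping operator, d never exceeds Saturday, so only the cap stops the loop.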

Maximum and minimum values
When you deal with types representing everyday phenomena, such as days or months, it's easy to remember which is the first (lowest) value and which is the last (highest) value. I don't think anyone familiar with the English language and Western culture would have trouble recognizing that:


for (m = January; m <= December; ++m)
  ...

counts through the entire range of valid months.

Still, many enumerations don't have an obvious minimum and maximum. For example, how do you remember which are the first and last values of:


enum currency
  {
  CAD, CHR, EUR, GBP, JPY, SFR, USD
  };

and how can you write loops that continue to work even if you add, remove, or reorder the enumeration constants? In fact, some people might even differ over whether Sunday or Monday is the first day of the week.

To solve this problem, I recommend declaring additional enumeration constants representing the minimum and maximum values for the type, as in:


enum currency
  {
  currency_before = -1,
  currency_min,
  CAD = currency_min,
  CHR, EUR, GBP, JPY, SFR, USD,
  currency_max = USD,
  currency_after
  };

Then you can write loops such as:

c = currency_min;
for (; c <= currency_max; ++c)
  ...

which counts through the currencies without regard to which specific currency is first and last.
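
Putting the pieces together, a sketch in C++ might look like this. It assumes a ++ operator for currency written the same way as the one for day; count_currencies is a hypothetical name for illustration.

```cpp
#include <cassert>

enum currency
  {
  currency_before = -1,
  currency_min,
  CAD = currency_min,
  CHR, EUR, GBP, JPY, SFR, USD,
  currency_max = USD,
  currency_after
  };

// Assumed ++ for currency, modeled on the operator for day.
inline currency &operator++(currency &c)
  {
  return c = currency(c + 1);
  }

// Hypothetical helper: count the currencies without naming
// the first or last one.
int count_currencies()
  {
  int n = 0;
  for (currency c = currency_min; c <= currency_max; ++c)
    ++n;
  return n;
  }
```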

One reader expressed surprise that you can define an enumeration in which two or more enumeration constants have the same value. I checked, and the C standard sanctions this explicitly:

The use of enumerators with = may produce enumeration constants with values that duplicate other values in the same enumeration.[5]

Another reader suggested that this style invites maintenance problems. For example, a programmer might add another currency after USD, yet fail to update the value for currency_max. This is a valid concern. Adding those extra constants also makes the enumeration definitions look cluttered.

I've been playing with ways to clean things up and reduce opportunities to introduce errors. I'm leaning toward defining the _min and _max values in terms of the _before and _after values, respectively, as in:


enum currency
  {
  currency_before = -1,
  CAD, CHR, EUR, GBP, JPY, SFR, USD,
  currency_after,
  currency_min = currency_before+1,
  currency_max = currency_after-1
  };

If you've got a better idea, I'd love to hear from you.
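
To illustrate how this arrangement holds up under maintenance, the sketch below (in C++) adds one more currency after USD. The constant ZAR and the function is_valid are hypothetical additions made only for this example.

```cpp
#include <cassert>

enum currency
  {
  currency_before = -1,
  CAD, CHR, EUR, GBP, JPY, SFR, USD,
  ZAR,   // hypothetical new currency added after USD
  currency_after,
  currency_min = currency_before + 1,
  currency_max = currency_after - 1
  };

// Hypothetical helper: test whether c names a real currency.
inline bool is_valid(currency c)
  {
  return currency_min <= c && c <= currency_max;
  }
```

Here currency_max follows the new last constant without any further edits, as long as new currencies go between currency_before and currency_after.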

Mixed metaphors?
More than one reader suggested defining the enumerations using a simpler style than what I suggested. In their style, the definition for type day looks like:


enum day
  {
  Sunday, ..., Saturday,
  number_of_days
  };

Then a loop that counts through the days would look like:

d = Sunday;
for (; d < number_of_days; ++d)
  ...

A symbolic constant named number_of_days is certainly useful, but defining it this way is inappropriate. A constant such as Sunday represents a day. In contrast, number_of_days is not a day; it's an integer representing the number of days. A constant representing a day is conceptually different from a constant representing the number of days, so they shouldn't be declared with the same type.

If you really want a constant called number_of_days, I would advise you to define it as:


enum day
	{
	...
	};
enum
	{
	number_of_days 
	= day_max - day_min + 1
	};

Now it's easier to see that number_of_days is an integer, not a day. (Enumeration constants in unnamed enumeration definitions behave like integers.)
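
As a sketch of how this might be used in C++, the example below assumes that day defines day_min and day_max in terms of day_before and day_after, as suggested earlier for currency. The function index_of is a hypothetical helper for mapping a day to an array index.

```cpp
#include <cassert>

enum day
  {
  day_before = -1,
  Sunday, Monday, Tuesday,
  Wednesday, Thursday, Friday,
  Saturday,
  day_after,
  day_min = day_before + 1,
  day_max = day_after - 1
  };

// An integer count, kept separate from type day.
enum
  {
  number_of_days
  = day_max - day_min + 1
  };

// Hypothetical helper: zero-based index of d, suitable for
// an array declared with number_of_days elements.
inline int index_of(day d)
  {
  return d - day_min;
  }
```

An array declared as int hours[number_of_days] then has exactly one element per day, and index_of(d) selects the element for day d.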

Dan Saks is the president of Saks & Associates, a C/C++ training and consulting company. You can write to him at dsaks@wittenberg.edu.

References
1. Saks, Dan, "Enumerations as Counters," Embedded Systems Programming. December 2002, p. 36.

2. Saks, Dan, "Well-Behaved Enumerations," Embedded Systems Programming. April 2003, p. 41.

3. Saks, Dan, "An Introduction to References," Embedded Systems Programming. January 2001, p. 81.

4. Saks, Dan, "References vs. Pointers," Embedded Systems Programming. April 2001, p. 103.

5. International Organization for Standardization, ISO/IEC 9899:1999: Programming languages — C, Geneva, Switzerland, 1999.