Sequence points

August 26, 2012

By Jack Ganssle

Sequence points are one of C's dark corners that trip people up.

While it's easy to brush C off as a super-assembly language, it's actually quite rich and expressive, and it has some dark holes that elude too many developers. Indeed, the C99 standard is over 500 pages long, is rather cryptic, and reading it is a sure cure for the worst case of insomnia.

Consider the following:

a = ++b  +  ++c;

What does it do? If a, b and c are three distinct variables, the answer is well defined: each ++ yields the incremented value, so a gets the sum of the incremented b and c. What the C standard leaves up to the compiler writer is the order in which the two operands are evaluated and the moment at which each increment is actually stored. The standard does define the notion of “sequence points,” saying: "At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place." The standard lists where sequence points will occur (Figure 1 below). Note that there are no sequence points within the assignment statement listed above, only at its end. So if b and c turn out to name the same object (through a macro argument or a pair of dereferenced pointers, say), that object is modified twice between sequence points and the result is entirely undefined.

Figure 1. Where sequence points occur in C code (from Annex C of the C99 standard)
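One way to take the compiler out of the picture is to give each side effect a statement of its own, since the end of every full expression is a sequence point. The sketch below does that for the statement above. The function name and the pointer parameters are illustrative assumptions, not anything taken from the standard.

```c
/* Hypothetical helper: spells out each step of  a = ++b + ++c;  as its own
   statement. Every semicolon here ends a full expression, which is a
   sequence point, so each increment is complete before the next line runs. */
int sum_after_increment(int *b, int *c)
{
    ++*b;            /* side effect on *b is complete at this semicolon */
    ++*c;            /* likewise for *c */
    return *b + *c;  /* plain reads, no side effects still pending */
}
```

Called as a = sum_after_increment(&b, &c); it stays predictable even if both pointers refer to the same variable, which is then simply incremented twice.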

Common software standards emphasize this ambiguity. MISRA Rule 12.13 states: "The increment (++) and decrement (--) operators should not be mixed with other operators in an expression." But the explanatory text says: "The use of increment and decrement operators in combination with other arithmetic operators is not recommended…". The assignment operator is not an arithmetic operator (it's defined, logically, as an "assignment operator" in the C standard). So MISRA apparently permits:

a[i++] = i;

… even though the result is undefined, as cryptically noted in the C99 standard:
"Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored."

As a footnote in the standard notes, this means:

 i=++i;

results in undefined behavior, though of course the logically equivalent operation:

++i;

is just fine.
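The a[i++] = i; statement shown earlier can be untangled the same way: modify i in a statement of its own, so that the other statement only reads it. The sketch below assumes the intent was to store the current index and then advance it, which is only one plausible reading. The function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: stores the current index into a[i], then advances i.
   This is one possible reading of  a[i++] = i;  written so that i is
   modified exactly once, in a statement by itself. */
void store_index_then_advance(int a[], size_t *i)
{
    a[*i] = (int)*i;  /* i is only read here; the one write goes to a[i] */
    ++*i;             /* the single modification of i */
}
```

If the intent was instead to store the incremented index, swapping the two statements and adjusting the subscript says so explicitly, which the original one-liner never could.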

CERT rule EXP30-C is much stronger than the MISRA rule: “Do not depend on order of evaluation between sequence points.” In fact, the standard lets compilers evaluate parts of an expression with all of the discipline of sailors at a bar on shore leave. It says “The order in which subexpressions are evaluated and the order in which side effects take place, except as specified for the function-call (), &&, ||, ?:, and comma operators” is unspecified. Many developers think that precedence rules determine which part of an expression gets evaluated first, but that’s not necessarily the case. In the expression

fa() + fb() * fc()

it’s perfectly acceptable for the compiler to evaluate fa() first.
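When the call order matters, because the functions log, touch hardware or share state, separate statements are the dependable way to pin it down. In the sketch below fa(), fb() and fc() are hypothetical stand-ins that record their letter in a small log. The log, the helper and both wrapper functions are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical call log: each stand-in function appends its letter here. */
char call_log[8];

static int record(char tag, int value)
{
    size_t n = strlen(call_log);
    if (n + 1 < sizeof call_log) {   /* leave room for the terminator */
        call_log[n] = tag;
        call_log[n + 1] = '\0';
    }
    return value;
}

static int fa(void) { return record('a', 1); }
static int fb(void) { return record('b', 2); }
static int fc(void) { return record('c', 3); }

/* The compiler may make these three calls in any order it likes. */
int sum_unordered(void)
{
    call_log[0] = '\0';
    return fa() + fb() * fc();
}

/* Each call is its own full expression, so the order is b, then c, then a. */
int sum_ordered(void)
{
    int a, b, c;
    call_log[0] = '\0';
    b = fb();
    c = fc();
    a = fa();
    return a + b * c;
}
```

Both functions return the same value. After sum_ordered() the log always reads "bca", while after sum_unordered() it holds the same three letters in whatever order the compiler picked.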

And adding even a Lisp-like swarm of parentheses doesn’t change anything. We have to rely on the rules of sequence points if evaluation order is important. There’s a fantastic description of this issue at www.eskimo.com.

CERT’s software standard explicitly addresses sequence point issues in macros. Rule PRE12-C says: “Do not define unsafe macros.” An example given is:

#define ABS(x) (((x) < 0) ? -(x) : (x))

e.g.:
m = ABS(++n); 

Which will expand to:

m = (((++n) < 0) ? -(++n) : (++n));

The conditional operator puts a sequence point after its first operand, so that expansion is well defined, but n is incremented twice, which is surely not what the programmer intended.

The logical AND (&&) operator specifies a sequence point too, but it, and the logical OR, have another twist. The standard reads: “Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation; there is a sequence point after the evaluation of the first operand. If the first operand compares equal to 0, the second operand is not evaluated.” (There’s a similar statement for logical OR.) That means code like:

if ((i == num) && ((j++) > 0)) {code}

... will increment variable j only when i equals num.
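Both of the last two examples hide a side effect inside a larger expression, and both can be defused by moving the side effect out. The sketch below keeps the ABS() macro for comparison, adds a function that evaluates its argument exactly once, and rewrites the if test on the assumption that j was meant to be incremented every time. The names abs_int and test_and_count are hypothetical.

```c
/* The macro from the text. ABS(++n) expands to an expression that contains
   ++n twice; the sequence point after the first operand of ?: means the
   increment really does run twice. */
#define ABS(x) (((x) < 0) ? -(x) : (x))

/* Hypothetical replacement: a function evaluates its argument once, so
   abs_int(++n) increments n a single time. Like the macro, it makes no
   attempt to handle the most negative int. */
int abs_int(int x)
{
    return (x < 0) ? -x : x;
}

/* Hypothetical rewrite of the if test, assuming j should be incremented on
   every call. The increment is pulled out of the && so short-circuiting
   cannot skip it. Returns 1 where the original condition would be true. */
int test_and_count(int i, int num, int *j)
{
    int old_j = *j;   /* the value that j++ would have yielded */
    ++*j;             /* unconditional side effect */
    return (i == num) && (old_j > 0);
}
```

If the increment really is meant to happen only when i equals num, nesting a second if inside the first states that intent more plainly than leaning on the short circuit.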

CERT warns developers about this sort of code via rule EXP02-C: “Be aware of the short-circuit behavior of the logical AND and OR operators.”

Obviously developers have to be very aware of these sorts of issues to develop reliable code. But there’s another issue: in the last month I’ve visited two different companies that ask sequence point questions on tests for prospective employees! Few applicants get the answers right.

Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges. Contact him at jack@ganssle.com. His website is www.ganssle.com.
