Further insights into size_t
Using size_t may be awkward for some programmers, but using it still solves more problems than it creates.
In my previous column, I explained why both the C and C++ standard libraries define a typedef named size_t and how you should use that type in your programs.1 That article generated quite a few interesting questions and comments, some of which I'd like to share with you.
Is size_t really unsigned?
One diligent reader noticed that his compiler didn't implement size_t as I said it should:
Either your article contains an error or gcc (at least the versions I've used) contains an error.
gcc actually defines size_t as a signed integer type. This means that using size_t rather than an explicit integer type actually *creates* portability annoyance when code is used both with gcc and with a compiler that defines size_t as an unsigned integer type. Most of these annoyances come from sloppy casts of constants or variables to (unsigned) or (unsigned long) rather than size_t to silence warnings about comparisons between signed and unsigned values. Such casts silence one compiler, but ensure the same signed vs. unsigned mismatch when using other compilers. . . .
I've not checked the standard, or the latest versions of gcc, but this has created difficulties for me and my colleagues in actual practice.
According to the 1999 C Standard, size_t is clearly supposed to be unsigned.2 In clause 7.17, Common definitions <stddef.h>, it says:
The following types and macros are defined in the standard header <stddef.h>. Some are also defined in other headers, as noted in their respective subclauses.
The types are . . .
size_t, which is the unsigned integer type of the result of the sizeof operator;
size_t is unsigned in every compiler I tested, including gcc. I'm using a build based on gcc 3.2.3.
I poked around on the web and found some old GNU C Library maintenance documentation at www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_30.html, which states:
There is a potential problem with the size_t type and versions of GCC prior to release 2.4. ANSI C requires that size_t always be an unsigned type . . .
That documentation provides additional insights into gcc's handling of size_t and how you can tweak it.
I disagree with the reader's claim that " . . . using size_t rather than an explicit integer type actually *creates* portability annoyance when code is used both with gcc and with a compiler that defines size_t as an unsigned integer type." I presume that "explicit integer type" means an integer type specified by keywords such as int or unsigned, as opposed to a typedef such as size_t.
Using size_t properly may create some annoyances, but it eliminates many more than it creates. For example, the standard strlen function returns a size_t. Code such as:
char *s;
size_t len;
...
len = strlen(s);
will compile without complaint--and work--whether the library defines size_t as signed or as unsigned.
In contrast, declaring len explicitly as either int or unsigned is much more likely to cause portability problems. Specifically, if you declare len as:
int len;
then the assignment:
len = strlen(s);
may provoke type mismatch warnings (an unsigned-to-signed conversion) when compiled with a library that defines size_t (properly) as unsigned. Similarly, if you declare len as:
unsigned len;
then the same assignment will likely generate warnings when compiled with a library that defines size_t (improperly) as signed. Using size_t actually insulates your code against such complaints even when using a compiler and library that define size_t incorrectly. All the more reason to use size_t.
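To make the comparison concrete, here is a minimal sketch (my addition) of all three declarations side by side. With conversion warnings enabled, only the size_t version is quiet regardless of how the library defines size_t:

```c
#include <stddef.h>
#include <string.h>

void example(const char *s)
{
    size_t len_ok = strlen(s);   /* matches strlen's return type: no warning */
    int len_int = strlen(s);     /* unsigned to signed: may draw a warning */
    unsigned len_u = strlen(s);  /* warns only if size_t is (wrongly) signed */

    (void)len_ok;
    (void)len_int;
    (void)len_u;
}
```

All three assignments compile and "work" for short strings; the point is which ones a conforming compiler can legitimately complain about.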
"Sloppy" vs. "clean" casts
The reader observed that "Most of these annoyances come from sloppy casts of constants or variables to (unsigned) or (unsigned long) rather than (size_t) to silence warnings about comparisons between signed and unsigned values." This is not so much an argument against using size_t appropriately as it is an acknowledgment of the consequences of using size_t inappropriately.
For example, consider:
int n;
...
if (strlen(s) > n)
...
Some compilers will issue a warning that the expression in the if-statement is comparing an unsigned value (the size_t returned by strlen) to a signed value (the int in n). These warnings are often helpful in catching potential logic errors, and I advise you to leave them turned on. The best way to silence the warning is to fix the problem by changing the declaration of n so that it has type size_t. If that's infeasible (probably for political rather than technical reasons), then your only recourse is to use a cast. Writing:
if (strlen(s) > (unsigned)n)
will quell the compiler, but this cast is arguably "sloppy". Neither n nor the return type of strlen is declared as unsigned, so why cast to that? A cleaner approach is to cast one operand to the declared type of the other.
Type size_t might be an alias for a type wider than int, such as unsigned long on many platforms. In that case, casting a size_t to int, as in:
if ((int)strlen(s) > n)
could truncate the size_t value, which would be bad.3 Writing:
if (strlen(s) > (size_t)n)
would be better. This works correctly even if the library incorrectly defines size_t as a signed integer.