A case study in portability
Writing highly portable code in C or C++ is possible, but not always as easy as we'd like it to be.
I recently wrote a couple of columns in which I explained why the Standard C and C++ libraries define a typedef named size_t [1, 2]. I concluded the second of those columns with an explanation of how to print size_t objects. I followed that with a column about another typedef named ptrdiff_t, in which I gave advice on printing ptrdiff_t objects similar to the advice I gave for size_t objects [3].
Unfortunately, that advice wasn't entirely correct. Reader Brandon Taylor pointed out the flaw in my advice in a letter that appears online along with my column about ptrdiff_t [3]. My reply to his letter briefly explains how to correct the flaw. This month, I'll elaborate on that explanation. This example shows how easy it is to introduce assumptions into your code that restrict portability and how you can exploit features of Standard C to avoid such restrictions.
Here, in part, is what I wrote about size_t [2]:
According to the 1999 C Standard, you should use the z length modifier with the u conversion specifier to display a size_t object, as in:
size_t n;
...
printf("%zu", n);
If your compiler doesn't support %zu, then you should try %lu (unsigned long), as in:
size_t n;
...
printf("%lu", (unsigned long)n);
Again, size_t is an alias for either unsigned or unsigned long, so converting a size_t to an unsigned long produces an unsigned that's either the same size as a size_t, or wider. It won't lose significance.
Actually, it might. What I forgot was that size_t might be an alias for unsigned long long, in which case casting a size_t to unsigned long might cause a truncation.
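The truncation is easy to see with fixed-width types standing in for the real ones. The sketch below is only an illustration under stated assumptions: uint64_t plays the role of a 64-bit size_t, uint32_t plays the role of a 32-bit unsigned long (the arrangement found on 64-bit Windows, for example), and narrow_to_32 is a hypothetical helper name.

```c
#include <stdint.h>

/* Stand-ins chosen for illustration only (assumptions):
   uint64_t models a 64-bit size_t,
   uint32_t models a 32-bit unsigned long. */
uint32_t narrow_to_32(uint64_t n)
{
    /* Conversion to a narrower unsigned type is well defined:
       the value is reduced modulo 2^32, so the high-order bits vanish. */
    return (uint32_t)n;
}
```

A small value such as 7 survives the conversion, but 4294967301 (which is 2^32 + 5) comes back as 5. A printf that receives the narrowed value would print the wrong number without any diagnostic.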
Using the %llu format and a cast to unsigned long long, as in:
printf("%llu", (unsigned long long)n);
might be less likely to cause a truncation, but still offers no guarantee. A Standard C implementation may define size_t as some implementation-defined integer type larger than unsigned long long. For example, a compiler for a platform in which an unsigned long long occupies 64 bits might also support a platform-specific type uint128_t, which occupies 128 bits, and size_t might be an alias for this larger type. I'm not aware of any compiler that does this, but it's possible.
A similar problem arises when using ptrdiff_t. It might be an alias for a signed type larger than long long.
The Standard C header <stdint.h> defines intmax_t as an alias for the "largest" signed integer type, that is, a signed integer type that can represent the value of any other signed integer type. It defines uintmax_t as the corresponding largest unsigned integer type. The header also defines other useful types, such as uintptr_t, an alias for an unsigned integer type that can hold the value of a pointer to void. (uintptr_t is optional, so not every implementation provides it.) The header is worth a few moments of your time to look over.
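As a minimal sketch of why these two types are useful here, the functions below widen a size_t to uintmax_t and a ptrdiff_t to intmax_t. The names widen_size and widen_diff are hypothetical, and the sketch assumes a C99 (or later) implementation that provides <stdint.h>.

```c
#include <stddef.h>  /* size_t, ptrdiff_t */
#include <stdint.h>  /* intmax_t, uintmax_t, SIZE_MAX, PTRDIFF_MIN/MAX */

/* Hypothetical helper: uintmax_t can represent every size_t value,
   so this conversion never loses significance. */
uintmax_t widen_size(size_t n)
{
    return (uintmax_t)n;
}

/* Hypothetical helper: the signed counterpart for ptrdiff_t. */
intmax_t widen_diff(ptrdiff_t d)
{
    return (intmax_t)d;
}
```

Even the extreme values SIZE_MAX, PTRDIFF_MIN, and PTRDIFF_MAX come through unchanged, which is exactly the guarantee that a cast to unsigned long or unsigned long long cannot give.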
Once again, the Standard C way to print a size_t object n is to use the %zu format specifier. If your compiler doesn't support %zu, your next best bet is to convert n to a type whose format is supported. It's always safe to convert a size_t to uintmax_t, but then what format do you use?
The Standard C header <inttypes.h> (not to be confused with <stdint.h>) defines macros that you can use to format types such as intmax_t and uintmax_t. For example, the header defines PRIuMAX as a character string literal containing a partial format specifier for printing a uintmax_t object. The character string doesn't include the leading %. For a compiler in which uintmax_t is an alias for unsigned long long, PRIuMAX would likely expand to "llu".
To use PRIuMAX as a complete format specifier, you must provide a leading %, as in:
printf("%" PRIuMAX, (uintmax_t)n);
During compilation, the preprocessor replaces PRIuMAX with a string literal, yielding something like:
printf("%" "llu", (uintmax_t)n);
When the compiler sees two adjacent string literals, it concatenates them at compile time into a single literal. In this example, the compiler splices "%" followed by "llu" into the single literal "%llu".
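Putting the pieces together, here is a sketch of two small formatting helpers that avoid %zu entirely. The function names format_size and format_ptrdiff are hypothetical. The sketch also assumes the library provides snprintf, which, like <inttypes.h>, arrived with C99. The second helper applies the same idea to ptrdiff_t, using PRIdMAX and intmax_t, the signed counterparts of PRIuMAX and uintmax_t.

```c
#include <inttypes.h>  /* PRIuMAX, PRIdMAX; also pulls in <stdint.h> */
#include <stddef.h>    /* size_t, ptrdiff_t */
#include <stdio.h>     /* snprintf */

/* Hypothetical helper: writes n into buf as decimal text.
   "%" PRIuMAX becomes a single literal such as "%llu" or "%ju". */
int format_size(char *buf, size_t buflen, size_t n)
{
    return snprintf(buf, buflen, "%" PRIuMAX, (uintmax_t)n);
}

/* Hypothetical helper: the same technique for a signed ptrdiff_t. */
int format_ptrdiff(char *buf, size_t buflen, ptrdiff_t d)
{
    return snprintf(buf, buflen, "%" PRIdMAX, (intmax_t)d);
}
```

Given a buffer of 32 characters, format_size(buf, sizeof buf, 12345) leaves "12345" in buf, and format_ptrdiff(buf, sizeof buf, -42) leaves "-42". The cast matters in both helpers: the argument type must match what the macro's format specifier expects.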
But wait, there's yet another way to do this. You can define a macro with the format specifier for size_t, and use that in place of PRIuMAX. Then you can eliminate the cast to uintmax_t. Eliminating casts is almost always a good thing. It is in this case.
For example, if your compiler defines size_t as an alias for unsigned long, then you can define a macro ZU as:
#define ZU "lu"
Use it in place of the zu specifier. Then you can print a size_t object n simply as:
printf("%" ZU, n);
You will have to define the ZU macro appropriately for each target platform. For those that actually support the %zu format, you can define it as:
#define ZU "zu"
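One way to keep the per-platform choice in a single place is a conditional definition, sketched below. Treat the conditions as assumptions rather than a recipe: __STDC_VERSION__ describes the compiler, not the library, so a C99 compiler paired with an older library might still lack %zu; and the "lu" fallback is correct only on targets where size_t really is unsigned long. The function format_count is a hypothetical helper added so that the macro can be exercised.

```c
#include <stddef.h>  /* size_t */
#include <stdio.h>   /* sprintf */

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define ZU "zu"  /* assumption: a C99 compiler comes with a C99 library */
#else
#define ZU "lu"  /* assumption: size_t is unsigned long on this target */
#endif

/* Hypothetical helper: writes n into buf as decimal text, with no cast.
   buf must be large enough for the digits of the largest size_t value
   plus the terminating null character. */
int format_count(char *buf, size_t n)
{
    return sprintf(buf, "%" ZU, n);
}
```

With a buffer of 32 characters, format_count(buf, 1000) leaves "1000" in buf. The cost of dropping the cast is that a wrong definition of ZU produces undefined behavior rather than a merely truncated value, so each new target needs its definition checked.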
Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback: e-mail him at firstname.lastname@example.org.
1. Saks, Dan. "Why size_t matters," Embedded Systems Design, July 2007, p. 11.
2. Saks, Dan. "Further insights into size_t," Embedded Systems Design, September 2007, p. 9.
3. Saks, Dan. "Standard C's pointer difference type," Embedded.com, October 2007.