Easy to fix, you say? Just change the type of memcpy's third parameter:
void *memcpy(void *s1, void const *s2, unsigned long n);
You can use this declaration to write a memcpy for an I16LP32 target, and it will be able to copy large objects. It will also work on IP16 and IP32 platforms, so it does provide a portable declaration for memcpy. Unfortunately, on an IP16 platform, the machine code you get from using unsigned long here is almost certainly a little less efficient (the code is both bigger and slower) than what you get from using an unsigned int.
In Standard C, a long (whether signed or unsigned) must occupy at least 32 bits. Thus, an IP16 platform that supports Standard C really must be an IP16L32 platform. Such platforms typically implement each 32-bit long as a pair of 16-bit words. In that case, moving a 32-bit long usually requires two machine instructions, one to move each 16-bit chunk. In fact, almost all 32-bit operations on these platforms require at least two instructions, if not more.
Thus, declaring memcpy's third parameter as an unsigned long in the name of portability exacts a performance toll on some platforms, something we'd like to avoid. Using size_t avoids that toll.
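As a minimal sketch of what that looks like, here is a byte-at-a-time copy whose third parameter is a size_t. The name my_memcpy is hypothetical, chosen only to avoid clashing with the library's memcpy, and a real library version would likely be more heavily optimized:

```c
#include <stddef.h>  /* size_t */

/* my_memcpy is a hypothetical stand-in for memcpy. It copies n bytes
   from the object at s2 to the object at s1 and returns s1. */
void *my_memcpy(void *s1, void const *s2, size_t n)
{
    /* The casts are not required in C; they let the code compile as C++ too. */
    unsigned char *dst = (unsigned char *)s1;
    unsigned char const *src = (unsigned char const *)s2;

    /* n is unsigned, so test it before decrementing it. */
    while (n-- > 0)
        *dst++ = *src++;
    return s1;
}
```

A call such as my_memcpy(dst, src, sizeof(src)) passes the result of sizeof directly, with no conversion on any platform.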
Type size_t is a typedef that's an alias for some unsigned integer type, typically unsigned int or unsigned long, but possibly even unsigned long long. Each Standard C implementation is supposed to choose the unsigned integer type that's big enough, but no bigger than needed, to represent the size of the largest possible object on the target platform.
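If you want to confirm these properties on a particular compiler, a small check along the following lines may help. The function names are hypothetical, and the sketch assumes a C99 or later implementation, since SIZE_MAX comes from <stdint.h>:

```c
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* SIZE_MAX, available since C99 */

/* Hypothetical helper: returns 1 if size_t is an unsigned type.
   Converting -1 to an unsigned type yields that type's maximum value,
   which is greater than 0; in a signed type, -1 stays negative. */
int size_t_is_unsigned(void)
{
    return (size_t)-1 > 0;
}

/* Hypothetical helper: returns the largest value a size_t can hold
   on this implementation. */
size_t size_t_max(void)
{
    return SIZE_MAX;
}
```

On a conforming implementation, size_t_is_unsigned() returns 1 and size_t_max() is at least 65535.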
Using size_t
The definition for size_t appears in several Standard C headers, namely, <stddef.h>, <stdio.h>, <stdlib.h>, <string.h>, <time.h>, and <wchar.h>. It also appears in the corresponding C++ headers, <cstddef>, <cstdio>, and so on. You should include at least one of these headers in your code before referring to size_t.
Including any of the C headers (in a program compiled as either C or C++) declares size_t as a global name. Including any of the C++ headers (something you can do only in C++) defines size_t as a member of namespace std.
By definition, size_t is the result type of the sizeof operator. Thus, the appropriate way to declare n to make the assignment:
n = sizeof(thing);
both portable and efficient is to declare n with type size_t. Similarly, the appropriate way to declare a function foo to make the call:
foo(sizeof(thing));
both portable and efficient is to declare foo's parameter with type size_t. Functions with parameters of type size_t often have local variables that count up to or down from that size and index into arrays, and size_t is often a good type for those variables.
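The following sketch illustrates both kinds of local variable: one that counts up to the size and one that counts down from it. The functions count_byte and find_last are hypothetical examples, not library functions:

```c
#include <stddef.h>  /* size_t */

/* Hypothetical helper: returns how many of the n bytes at p equal c. */
size_t count_byte(unsigned char const *p, size_t n, unsigned char c)
{
    size_t count = 0;
    size_t i;

    /* Counting up: i indexes the array and never exceeds n. */
    for (i = 0; i < n; ++i)
        if (p[i] == c)
            ++count;
    return count;
}

/* Hypothetical helper: returns the index of the last byte equal to c
   among the n bytes at p, or n if there is no such byte. */
size_t find_last(unsigned char const *p, size_t n, unsigned char c)
{
    size_t i;

    /* Counting down: test i before decrementing it. Because size_t is
       unsigned, a condition such as i >= 0 would always be true. */
    for (i = n; i-- > 0; )
        if (p[i] == c)
            return i;
    return n;
}
```

Given an array buf, calls such as count_byte(buf, sizeof(buf), 2) and find_last(buf, sizeof(buf), 2) pass the size through without any conversion.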
Using size_t appropriately makes your source code a little more self-documenting. When you see an object declared as a size_t, you immediately know it represents a size in bytes or an index, rather than an error code or a general arithmetic value.
Expect to see me using size_t in other examples in upcoming columns.
Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback: e-mail him at dsaks@wittenberg.edu.
Reader Response
Either your article contains an error or gcc (at least the versions I've used) contains an error.
gcc actually defines size_t as a signed integer type. This means that using size_t rather than an explicit integer type actually *creates* portability annoyance when code is used both with gcc and with a compiler that defines size_t as an unsigned integer type. Most of these annoyances come from sloppy casts of constants or variables to (unsigned) or (unsigned long) rather than (size_t) to silence warnings about comparisons between signed and unsigned values. Such casts silence one compiler, but ensure the same signed vs. unsigned mismatch when using other compilers. Those are relatively easy to fix, but this still leaves the question of proper bounds checking for size_t values. If size_t is signed then bounds checking is a doubly ended issue. If size_t is unsigned (and the lower bound in question is 0) then only one bound must be checked.
I've not checked the standard, or the latest versions of gcc, but this has created difficulties for me and my colleagues in actual practice.
-Virgil Smith
Senior Engineer
ICx Nomadics
Stillwater, OK