Any kind of software needs memory to store data. How much data is stored, how it is accessed, and the processing performed on that data can vary greatly from one application to another.
In a desktop system, the approach to programming assumes that the system has infinite resources (memory) and that human intervention is a possibility. But most embedded applications have limited resources and may be deeply embedded, which means they need to be entirely self-sufficient.
Broadly speaking, in C/C++ data can be stored in three ways:
- In static memory, where the storage is allocated at build time. Variables defined outside of functions are intrinsically static. Inside a function, variables may be directed to static storage using an explicit static declaration.
- On the stack or in a CPU register. Variables defined inside functions are stored in this way by default.
- Dynamically allocated and de-allocated, as required. This allocation and de-allocation is performed using the library functions malloc() and free() in C or the new and delete operators in C++.
The heap is an area of memory set aside to provide dynamically allocated storage. It is managed by the library functions malloc() and free() in C or the new and delete operators in C++. The size and location of the heap are application dependent and the means by which it is allocated is specific to a particular toolchain.
Dynamic = bad
When designing embedded software, if there is a choice between doing something statically or dynamically, the former should always be the first choice. This is because there are two distinct issues with most dynamic activities, such as resource allocation:
- Failure mode
Many, if not most, embedded applications are required to exhibit real time behavior. A key factor in achieving this goal is determinism – the predictable timing of a process. Dynamic memory allocation is typically non-deterministic. Although, as we shall see, this can be overcome.
As mentioned earlier, most embedded systems do not have a human operator who can make a decision if an exceptional event occurs. On a desktop computer a message of the form: “Blah blah not available. Please try later” may be inconvenient; on an embedded system it is probably not an option. Think heart pace-maker or automobile engine management unit.
Limitations of malloc()
The standard C library functions (and C++ operators) exhibit all the features that support the “dynamic = bad” rule. A normal malloc() implementation is not deterministic, which means that its use should be avoided in real time code. If a memory allocation is attempted and insufficient heap space is available, a NULL pointer is returned. This can be detected and evasive action taken, but this may be complex to implement.
Running out of heap space is an unsurprising failure mode. In theory, at least, a facility could be provided to show how much space is available. This would enable a failure to be predicted and possibly avoided. However, this does not take fragmentation into account.
If a series of calls are made to malloc() to grab some memory and the corresponding calls to free() are made in the reverse order, memory is gradually consumed from the top of the heap and then the allocated area is gradually reduced from the bottom. The code would look like this:
p1 = malloc(1024);
p2 = malloc(2048);
p3 = malloc(1024);
As shown in Figure 1 , If the heap had a capacity for 8K of memory allocation, the malloc() calls would consume 4K, and 4K would still be available. At the end of this sequence, the state of the heap at the end of the code sequence would be the same as at the start, with all 8K available. However, real code is not like that. Consider this sequence:
p1 = malloc(1024);
p2 = malloc(2048);
p3 = malloc(1024);
At the end of this sequence, 6K of memory is available. However, as illustrated in Figure 1, the largest contiguous area is only 4K. So, an allocation request for, say, 5K would fail, even though there is 6K free. This is memory fragmentation and can lead to unpredictable failures.
Block memory allocation
There is an alternative to malloc() that is commonly offered as a service with a real time operating system (RTOS): the allocation of memory in fixed size blocks. The idea is quite simple – one or more pools of memory blocks are defined; these have a specific, fixed block size and a defined number of blocks. Multiple pools facilitate a choice of block size.
It is quite straightforward to implement block memory allocation in a deterministic fashion. Also, this approach immediately eliminates fragmentation. Of course, there is still the possibility that an allocation will fail because there are no further blocks. However, this is entirely predictable and hence, can be mitigated. Additionally, RTOSes typically offer a “blocking” allocation call, which means the task requiring memory may be put to sleep if no blocks are available, and wakened when there is availability.
Another creative way to implement dynamic memory allocation is to use block memory services to implement malloc() . The simple approach is to implement a series of block memory pools with sizes that are powers of two – maybe 16, 32, 64, 128 bytes, for example. It is then a simple matter to code malloc() so that a block which is large enough for the requested memory is allocated. This does lead to possible waste – what if an application needs a block of 33 bytes, for example? – but all the problems of determinism and failure modes are addressed.
Putting aside the problems of timing and resource exhaustion, is there a way to solve the problem of memory fragmentation while allowing malloc() to use a conventional heap?
Some languages – and Java is a notable example – address fragmentation using a garbage collector. From time to time, when a memory allocation is attempted, the garbage collection routine is run, which moves all the allocated blocks of memory around so that the free space is in a single contiguous block. There are actually more efficient ways to achieve a good result, but this gives the idea.
However, garbage collection has two major downsides:
- Memory allocation becomes very non-deterministic, as it is not possible to predict when the garbage collector will be run or how long it might take. This may be addressed to some extent by executing the code at regular, defined intervals.
- In C/C++, memory allocation results in a pointer being returned. This is the address of the allocated memory. If that memory is moved, the pointer is no longer valid. This could be solved by returning a pointer to a pointer, but this is non-standard and would lead to more complex application code.
I have wondered whether a memory management unit (MMU) might be used in a creative way. It would seem logical that the blocks of free space could be re-mapped so that they appear to be a single contiguous block. I would be interested to know whether this has ever been attempted.
The word “dynamic” should – and normally does – ring alarm bells with embedded software developers. The dynamic allocation/creation of anything is fraught with problems in real time and deeply embedded systems, so, wherever possible, static alternatives should be considered. Dynamic memory allocation is a good example of where this caution is wise, but a clear understanding of the issues can lead to implementations that are fit for purpose.
Colin Walls has over thirty years of experience in the electronics industry, largely dedicated to embedded software. A frequent presenter at conferences and seminars and author of numerous technical articles and two books on embedded software, Colin is an embedded software technologist with Mentor Embedded (the Mentor Graphics Embedded Software Division), and is based in the UK. His regular blog is located at: blogs.mentor.com/colinwalls . He may be reached by email at .