Memory leak & invalid memory deallocation detection in the Linux kernel for modules using exclusively Operating System Abstraction Layer
By Joseph Gasparakis, Rajelekha Kilar Rajaram, and Mario Nicolas
Embedded.com
(05/31/09, 03:20:00 PM EDT)
Memory leaks and memory corruptions are problems that can be easily introduced in code written in C or C++ and generally in any programming language that does not have a garbage collector built in, causing system crashes and sometimes, even worse unexpected behavior, creating bugs that are difficult to be detected.

In user space there are many open source projects that one can use to identify memory leaks and corruptions such as electric fence [6], its newer fork DUMA [7], and valgrind [8]. For code running in kernel context these tools are not available.

When the problems mentioned above end up as a "kernel panic" as they often do, there is no further ability for the engineer to inspect the offending code other than to observe the stack trace. Also in kernel context, a memory leak is persistent: it will remain existent and most probably keep growing until a reboot, something that is unacceptable.

Kernel development allows direct access to the basic components and resources of the operating system, but debugging is not easy as the use of a debugger is not handy and one has to either use additional logging or use advanced techniques such as using kernel debuggers like kdb [2][3] and kgdb [4] or use kernel patches such as the one provided by Catalin Marinas, described in [1].

In this article we will describe our methodology for identifying memory management problems for kernel modules that use Operating System Abstraction Layer (OSAL) without the need of extra patches or kernel debuggers.

OSAL Fundamentals
The Operating System Abstraction Layer is a software layer that allows engineers to develop applications that are OS agnostic [2]. It reduces the porting complexity by providing an API for OS components such as threads, synchronization and memory management allowing portability to different OSes provided there is implementation of the same OSAL and its API.

Figure 1 below shows the concept of a typical software stack using OSAL inside the kernel.

Figure 1 " The concept of using OSAL

Our  Methodology
This methodology has a a perqusite the presence of an OSAL. As shown in Figure 2 below, the header file needs to be compiled together with the source code of the kernel module so it provides macros that replace the actual memory allocating/deallocating functions with a call to the function itself and a log with useful information about the memory address and size involved. Let us call this header file OsalMemLeak.h.

The post processing script later on will parse all the logs generated by the macros and will analyze the data. It will match the allocations to the deallocations and determine if there is a memory leak or memory corruptions created by code freeing memory buffers that were not allocated properly.


Figure 2 " Proposed Method


The header file
OSAL APIs need to be analyzed and a list of those that allocate or free memory need to be identified. The list needs to be further decomposed in the type of memory that they allocate/deallocate (memory buffer, thread, mutex etc).

Let us assume for the sake of an example the following APIs are the memory manipulating ones. This is would be just a subset of the whole OSAL memory allocating/deallocating API, since except for memory allocations and frees, also occurs for other OS components such as threads, semaphores etc.:

void *osal_mem_alloc(int size)
void osal_mem_free(void *memBuffer)
int osal_mutex_init(*Osal_Mutex)
int osal_mutex_destroy(*Osal_Mutex)

After analysis a header file needs to be created. All the above identified and classified APIs should be redefined in this header file as macros.

Those macros, with the help of an inline function, as described before, will call the original function to allocate/deallocate memory and provide any functionality it was required to offer (for example allocate the buffer itself), and then log this to using the kernel function printk().

In order to get the minimum performance impact, the system should be configured so printk() does not print on the screen, only logs and does not ouput in STDOUT or STDERR). For example assuming a function:

void *osal_mem_alloc(int size)

This would correspond to the following macro:

#define osal_mem_alloc(memsize)
__osal_mem_alloc(memsize,__FILE__,__LINE__)

And there would be a definition of the inline function __osal_memAlloc() like this:

static inline void * __osal_mem_alloc(
                    UINT32 size,
                    char * filename,
                    int linenb)
{
    void * addr = (void*) osal_mem_alloc(size);

DPRINT_MEMLEAK("MEM_DETECT_ALLOC:%p:%d:%d:%s:%d:%d\n",addr,size,OSAL_MEM_GUARD_A
LLOC_TYPE_MEMORY, filename, linenb ,current->pid);
        return addr;
}

For the sake of portability please note that the following macro needs to be included:

#define DPRINT_MEMLEAK printk

This is done in order to cater for another operating system where printk() might not exist or not be a suitable logging solution. In the case of Linux using printk() is safe for logging as it can be called from any context and anytime. (For more details how printk() works the reader can refer to [5]).

Now let us assume the following snippet of code in an imaginary snippet:

/* Declarations of the mutex and array of buffers */
Osal_Mutex myMutex;
Osal_Mem_Buffer *myBuffer[10];

/* Initialize the mutex i.e. allocate space for it */
if (osal_mutex_init(myMutex) !=
    OSAL_MUTEX_INITED_OK)
{
        return ERR_MUTEX_INIT;
}
/* Lock the mutex */
osal_lock_mutex(myMutex);

/* LOOP 1: Allocate memory for the 5 first buffers in the array */
for (int i=0 ; i<5; i++)
{
        myBuffer[i] = (Osal_Mem_Buffer)
                                 osal_mem_alloc(1024);
}
/* Allocations complete, so unlock the mutex
*/
osal_unlock_mutex(myMutex);
.
.
.
/* Lock again the mutex */
osal_lock_mutex(myMutex);

/* LOOP 2: Deallocate the 5 first buffers */
for (int i=0 ; i<5; i++)
{
        osal_mem_free(myBuffer[i]);
}

    /* Unlock the mutex */
osal_unlock_mutex(myMutex);
/* Destroy (free) the mutex */
osal_destroy_mutex(myMutex);

If the above code is compiled with the OsalMemLeak.h header file, the following example logs will be generated when the binary runs:

MEM_ALLOCATION:e2000000:4:2:/home/myuser/allocations.c:4:8423
MEM_ALLOCATION:e2abfe00:1024:3:/home/myuser/allocations.c:13:8423
MEM_ALLOCATION:e2ac0200:1024:3:/home/myuser/allocations.c:13:8423
MEM_ALLOCATION:e2ac0600:1024:3:/home/myuser/allocations.c:13:8423
MEM_ALLOCATION:e2ac0a00:1024:3:/home/myuser/allocations.c:13:8423
MEM_ALLOCATION:e2ac0e00:1024:3:/home/myuser/allocations.c:13:8423
MEM_DEALLOCATION:e2abfe00::3:/home/myuser/allocations.c:28:8423
MEM_DEALLOCATION:e2ac0200::3:/home/myuser/allocations.c:28:8423
MEM_DEALLOCATION:e2ac0600::3:/home/myuser/allocations.c:28:8423
MEM_DEALLOCATION:e2ac0a00::3:/home/myuser/allocations.c:28:8423
MEM_DEALLOCATION:e2ac0e00::3:/home/myuser/allocations.c:28:8423
MEM_DEALLOCATION:e2000000::2/home/myuser/allocations.c:31:8423

The post processing script
Complimenting the header file, the post processing script needs to be developed to parse the log file (/var/log/messages) and match the allocations and deallocations based on the address, size and type.

If any address is allocated it must match a following deallocation of the same size and type. Similarly for any deallocation that happens, the script will need to match it with an allocation of the same size and type.

If any mismatches are found, the script will need to generate the appropriate message that there is a possible incorrect allocation/deallocation. For the above set of logs the script will match the memory allocations with the frees.

Assuming the condition that the first for loop (commented as LOOP 1) changes and from i<5 becomes i<=5 instead, and keeping the second loop (commented LOOP 2) the same, a memory leak is induced and one extra log will be created:

MEM_ALLOCATION:osal_mem_alloc():e2ac1200:1024:3:/home/myuser/allocations.c:8423

The script will not match it with a free, causing an error message to be produced.

Multithread warning
This methodology can have a pitfall in multithreading applications.

Assuming the above macro implementation of osal_mem_alloc() and a similar for deallocation, a race condition can occur where the allocation in one thread can be preempted in this stage where the actual allocation occurred but before the logging occurs.

If at that stage the deallocation thread is called that completes before its own preemption. That would result to a proper allocation followed by a deallocation, but the logs of the deallocation will be appearing first, and followed by the allocation logs. This scenario is displayed in Figure 3 below:


Figure 3 " Logs in wrong order due to multithreading


A way to accommodate for this is to have the parsing script to check the PID in case of a deallocation preceding an allocation for the same memory address and type.

If the PID is the same it is definitely a malpractice and if the illegal free hasn't caused a kernel panic, the script should report it as an error. If the PID is different, the script can give out a warning for a possible invalid free rather than an error.

Another possible solution is to statically define a locking primitive (such as a mutex) in the OsalMemLeak.h, and surround all the inline functions with that locking primitive.

Care in this case should be taken if there is a possibility of sleeping or running in the context of an interrupt[9]. Also using this primitive will result to non-portable code. Although one obvious way to recover this is to log as soon as entering the inline function and then have extra logic in the script to determine what happened first, this is a wrong approach.

Being preempted right after the first logging in the allocation and before doing the actual allocation cause the logs to appear in reverse order than the events occurred in case the deallocation gets scheduled right after and completes before the actual memory allocation.

SEC4

Using the method as part of dynamic analysis
Our proposed method of identifying memory leaks and corruptions can easily be used for dynamic analysis in a nightly build and/or test system. The production code can be built with the header file, and then the unit test binaries can be run to produce logs that can be analyzed from the post processing script.

The output of the script can be kept as a log and the script will return zero if no memory allocation warnings and/or errors were found, whereas  on the opposite case, it will exit with an error code.

Evaluating the results
Using the methodology described in this article, a header file needs to be produced. It must replace functions that allocate and deallocate memory with macros that call the actual functions and log data such as memory address and size.

Also, a Perl script that parses the logs generated by the binary when it runs, and analyzes the data producing alerts when possible leaks and memory corruptions occur needs to be created. This methodology has a number of effects on the system being developed, both pro and con:

Pros:
* Does not require any kernel patches or any special setup.
* Works efficiently in a virtualized environment which is one of the most common methods for debugging in kernel space.
* Suitable for RT Linux.
* Not hardware specific.
* Easily portable to any OS.
* Can easily integrate with unit tests and nightly build.

Cons:
* As described earlier in the earlier multithreaded implementation, a thread that allocates memory, for example, can be pre-emptied before logging the allocation with the result that the de-allocation might be logged before the initial allocation.
* In case the logging needs to be disabled (for example when syslogd needs to be disabled) there's no way to get the logs and further use the script to debug.
* Negative (but very small) performance impact with the printk(). Might be crucial in some systems.

Future work
As a result of the work we have done so far, there are a number of directions we could go to extend its functionality, including:

* Add all the kernel functions in order to cover not only OSAL but also software that uses kernel API as well.
* Extend the work for other operating systems.

Joseph Gasparakis is a senior software engineer working for Intel Corp. in Shannon, Ireland for the Performance Product Division of the Embedded Computing Group. He has a 9 year long work experience in telecoms industry working on several proprietary and open source software stacks. He has also worked as system administrator, and he is a Certified Ethical Hacker (CEH). Joseph is currently specializing in optimizations of multicore and network related technologies.

Rajalekha, Kilar Rajaram is working as network software engineer for Intel in Shannon, Ireland for the Performance Product Division of the Embedded Computing Group. She has extensive experience in 3G and 4G wireless protocol stack development on a variety of real time operating systems. She is currently focusing her energy in the development of Media Access Control layer for Wimax Base stations.

Mario Nicolas is a software engineer working for Intel, in Shannon, Ireland for the Performance Product Division of the Embedded Computing Group. Having a mathematical background, his domain of expertise is the implementation and optimization of algorithms, including search, cryptographic and compression/decompression.

References
[1] http://lwn.net/Articles/187979/
[2] http://en.wikipedia.org/wiki/OSAL
[3] http://oss.sgi.com/projects/kdb
[4] http://kgdb.linsyssoft.com/
[5] Pichai Raghavan, Amol Lad, Sriram Neelakandan " Embedded Linux System Design and Development " paragraph 3.7.1, p. 81-82
[6] http://freshmeat.net/projects/efence/
[7] http://duma.sourceforge.net/
[8] http://duma.sourceforge.net/
[9] Jonathan Corbet, Alessandro Rubini, Greg Kroah-Hartman " Linux Device Drivers " p. 121-123.