Reentrancy in Protocol Stacks

T. Sridhar

November 01, 2001

The individual layers of protocol stacks are often implemented as tasks or library routines. Because this code may be invoked by more than one application task or upper layer protocol at a time, it must be designed for reentrancy.

When I was taking my first course in assembly language programming, all of the assignments required reentrant code. The code was not to have any global variables and was to use the stack for all variable storage; this was the guideline. For a fledgling programmer, it was not easy to write such code, especially in assembly language. Years later, with the benefits of reentrant code now clear, I can see the wisdom in my instructor's insistence on it.

Simply put, reentrant software is code that can be invoked by multiple entities at the same time without side effects.[1] C is well suited to reentrancy because local variables are allocated on the stack for each function call; since each invocation gets its own stack frame, its local variables are safe. Reentrancy problems are typically related to globals.

Consider the code in Listing 1, which is a simple example of a function that can operate on any list of element type LISTELEMENT, irrespective of when it is invoked. The function does not change any global variables and only operates on a list whose pStart pointer is a passed parameter. The function has a local variable that is allocated on the stack and whose value is not retained across invocations (unlike a local variable defined static). If another caller invokes the function with another list's start pointer, the two invocations can proceed as separate threads.

Listing 1: A reentrant function

struct LISTELEMENT
{
    int value;
    struct LISTELEMENT *pNext;
};

struct LISTELEMENT *
GetLastElementofList(struct LISTELEMENT *pStart)
{
    struct LISTELEMENT *pElement = pStart;  /* local: one copy per invocation */

    if (pElement == NULL)
        return NULL;
    while (pElement->pNext != NULL)
        pElement = pElement->pNext;
    return (pElement);
}

An interesting note here is that if a function invokes another function, the invoked function must also be reentrant. This is one of the reasons that compiler-provided libraries need to be reentrant, since they may be called from code designed for reentrancy. Another area where reentrancy is useful is in developing code that can be called from both a task and an ISR.

Reentrant code is more flexible and lends itself to reuse. That, at least, is the theory. In the real world, a routine typically modifies variables referenced by other routines; the system design may even require it. Modification of global variables makes code non-reentrant. In a large system with more complex interactions, reentrancy is a little more difficult to achieve, which is one of the reasons that a substantial amount of system code is not designed for reentrancy.
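
For contrast, here is a hypothetical non-reentrant variant of the list code in Listing 1; the function and its behavior are illustrative, not taken from any particular system. Because the static counter is shared by every caller, two overlapping invocations can corrupt each other's results:

/* Non-reentrant: the static counter persists across calls and is shared
 * by all callers, so simultaneous invocations interfere with each other. */
int CountListElements(struct LISTELEMENT *pStart)
{
    static int count;            /* shared state -- the source of the trouble */

    count = 0;
    while (pStart != NULL)
    {
        count++;
        pStart = pStart->pNext;
    }
    return count;
}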

Protocol stacks

A communication protocol is usually defined by a protocol specification, which specifies a sequence/relationship in exchanges between entities. The software implementing a suite of related protocols is known as a protocol stack. Examples of protocol stacks include TCP/IP, X.25, and ATM.

Protocol stacks typically act as a service provider for a higher layer application or protocol. The higher layer is known as a service user. Multiple applications may need to use the services of the lower layer protocol at the same time. For this, the protocol should be able to support multiple simultaneous applications without overlapping or corrupting data that belongs to another application. This is an area where reentrant code is useful.


Consider the diagram in Figure 1. Here the IP layer interfaces at the higher layer with the UDP, TCP, and OSPF protocols. Each higher layer protocol should be able to call the IP layer at the interface to pass on data to be encapsulated in an IP packet. The interface function ip_send() should be able to operate independently with every invocation. It should not, for example, write to global data structures (more on that later), which could cause problems.

Is the reentrancy required only at the interface between the higher and the lower layers? How about the other functions in the IP protocol stack implementation? Recall that a function will be reentrant only when the other functions it calls are themselves reentrant. Extending this, it should be clear that all the code in the path of ip_send() needs to be reentrant.

One might ask why the ip_send() code should be reentrant if there is only one thread of execution and it has to run to completion. For example, if the ip_send() code is called by the TCP layer, and the entire thread has to complete execution before the UDP layer can call it, there is no need to worry about reentrancy. This is true, up to a point; we still have to consider context. For example, if TCP were running at a higher priority than UDP and IP were not really a task but a library, UDP could be interrupted in the middle of ip_send(). In this case, if ip_send() were not reentrant, the UDP context and TCP context could corrupt each other, which is definitely not a desirable situation.

What if IP were a task and TCP and UDP were queueing their requests in a message queue to the IP task? In such a case, lack of reentrancy in the code is not really an issue, since the requests would be processed in sequential order. However, it is best not to design protocol stack code with a specific task/library architecture in mind. It's better to write reentrant code, so it can be used in both task-based and library-based environments.

Achieving reentrancy

To make the code reentrant, we should avoid writing to global data structures that might be referenced by other modules. A routine should have no "residual" effect that would affect other invocations of that routine.

Consider the following logic for a simple ip_send() routine (a number of simplifications have been made for the sake of brevity):

1. Determine if there is a route to the destination and, if there is, select the next hop.

2. Determine the encapsulation for the network interface and encapsulate the packet.

3. Send the packet to the network interface driver via the driver's send() routine.

4. Increment the outgoing packet counter for that network interface.

5. Return a value based on the successful or failed transmission result indicated by the driver.

Here, the routing table that is examined is most likely a global data structure. Does this conflict with the reentrancy requirements? Not really, because this global data structure is only read, never written. See Figure 2. The counter increment is a potential problem, though.
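
A minimal sketch of this logic appears below. The types and helper routines (FindRoute(), BuildEncapsulation(), the NETIF structure, and so on) are assumptions made for illustration, not part of any particular stack:

#define SEND_NO_ROUTE (-1)

struct NETIF                                  /* network interface (assumed) */
{
    int (*send)(struct NETIF *pIf, void *pPacket);  /* driver transmit routine */
    unsigned long outPackets;                        /* outgoing packet counter */
};

struct ROUTE                                  /* routing table entry (assumed) */
{
    struct NETIF *pInterface;                 /* outgoing interface */
    unsigned long nextHop;                    /* next-hop address */
};

struct ROUTE *FindRoute(unsigned long destAddr);      /* read-only table lookup */
void BuildEncapsulation(void *pPacket, struct ROUTE *pRoute);

int ip_send(void *pPacket, unsigned long destAddr)
{
    struct ROUTE *pRoute;
    int result;

    pRoute = FindRoute(destAddr);             /* 1. route lookup: global table, read only */
    if (pRoute == NULL)
        return SEND_NO_ROUTE;

    BuildEncapsulation(pPacket, pRoute);      /* 2. encapsulate for the chosen interface */
    result = pRoute->pInterface->send(pRoute->pInterface, pPacket);  /* 3. hand to driver */
    pRoute->pInterface->outPackets++;         /* 4. shared write: the potential problem */
    return result;                            /* 5. driver's success/failure indication */
}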


Note that the send() routine for the driver should also be reentrant. In many communication systems, the driver simply enqueues the frame for transmission by the hardware, and updates its own and the controller's data structures. In case this code cannot be made reentrant, atomicity should be ensured via a mutual exclusion primitive like a semaphore.
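
As a hedged illustration, the counter update from step 4 could be bracketed as follows, using the NETIF structure from the earlier sketch; semTake() and semGive() stand in for whatever semaphore primitives the RTOS actually provides:

/* Hypothetical semaphore primitives -- substitute the RTOS's own API. */
extern void semTake(void *semId);
extern void semGive(void *semId);
extern void *ifStatsSem;                  /* protects the interface statistics */

void IncrementOutCounter(struct NETIF *pIf)
{
    semTake(ifStatsSem);                  /* enter critical section */
    pIf->outPackets++;                    /* the shared write is now atomic */
    semGive(ifStatsSem);                  /* leave critical section */
}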

Virtual routers and incarnations

In this discussion, I have maintained that using reentrant code in protocol stacks is generally desirable. However, reentrancy addresses a specific need in some network equipment. These are the "virtual routers" that are part of the software implemented in devices like IP service switches.


Consider an IP services router located in an ISP Point of Presence (PoP), like the one shown in Figure 3. This network equipment is built to provide the equivalent functionality of multiple routers in a single box. This reduces the number of boxes and corresponding management and interconnection overhead. Multiple enterprises would have to connect over individual WAN links to a shared IP services router. Each enterprise would have the illusion that it was connecting to its own individual router. The IP services router would have multiple "virtual routers" (VRs) in the same box and logically associate a VR with the appropriate enterprise connection(s). VRs have independent routing (and forwarding) tables, which they use for individually forwarding packets to the "next hop" in their domain.[2] They build these routing tables independently via individual instances of the appropriate routing protocol (such as OSPF) communicating with routing protocols on peer routers.

The software in the IP services router would typically need to run multiple incarnations to provide the VR functionality. For example, there would be multiple incarnations of the OSPF routing protocol, one for each virtual router. Each incarnation must be isolated from the other incarnations to make this model work. One approach that springs to mind is to have an individual process for each incarnation, as in a commercial desktop OS like UNIX, where the system provides memory isolation between processes. For example, in UNIX, when multiple users require the services of the vi editor, multiple vi processes are invoked using the same copy of the code. However, since each process is isolated with respect to data memory, there is no need for reentrant code.

This approach works well as long as the number of incarnations required is small. When you need hundreds or thousands of incarnations (typical in IP services routers), the memory required for the individual incarnation contexts, as well as the context switching between the individual processes, becomes too much of an overhead.

A better approach is to use a single task to run the VR functionality via the use of reentrant code. The use of global data structures would need to be managed to fit this model and this is discussed in subsequent sections.

Code for incarnations

A simple way to implement incarnation-independent code is to use an incarnation number as a parameter to every routine that needs incarnation-dependent isolation. This incarnation number needs to be determined via an external interface.

Consider the case of an IP services router with a VR for each port. When a packet arrives on a port, the corresponding VR is already known. A dispatching agent can then call the entry point for the corresponding incarnation via an incarnation number parameter and pointer to the packet. Inside the code for the incarnation, the incarnation number is passed as a parameter to help the decision process and access the appropriate data structures. For example, consider this algorithm:

1. Parse the packet.

2. Access the appropriate data structures via the incarnation number.

3. Continue the packet processing and forwarding based on the routing table and other data structures accessed via the incarnation number.

This requires that all global data structures used by the code be replicated per incarnation. A translation mechanism needs to be provided to access the appropriate global data structure given the incarnation number. Do all functions need to have an incarnation number parameter? Only those that access global data structures. Reentrant code that does not need to access any global data structures can avoid the incarnation number parameter. Alternatively, the incarnation number parameter can be replaced by a "root pointer" parameter, which points to the root or "meta" data structure for that incarnation, as detailed in the next section.
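
A rough sketch of such a dispatch path follows; the port-to-VR table and the ip_receive() entry point are hypothetical names chosen for illustration:

/* The receiving port identifies the VR; every incarnation-dependent routine
 * then takes the incarnation number as an explicit parameter. */
extern int PortToIncarnation[];                     /* filled in at configuration time */
extern void ip_receive(int incarnation, void *pPacket);

void DispatchPacket(int port, void *pPacket)
{
    int incarnation = PortToIncarnation[port];      /* identify the VR for this port */
    ip_receive(incarnation, pPacket);               /* entry point for that incarnation */
}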

Data structures for incarnations

How do data structures look in this environment? Consider the following example: each table is accessed via an incarnation number translation. Given the incarnation number, the translation mechanism returns a pointer to the appropriate data structure for that incarnation. While the structure of the tables is the same across incarnations, access to the tables is routed through the translation mechanism, which keeps the isolation intact. The translation mechanism need not be very complex. For example, an array indexed by incarnation number can provide the translation to the appropriate context.

Typical code to access the data structures according to incarnations would look like:

pRoutingTable = TranslateRoutingTable(N);
// Determine route to destination.
// Forward packet to route.
pStatTable = TranslateStatisticsTable(N);
// Update statistics.


The problem with this approach is obvious: a separate translation routine has to be written for each data structure that needs to be accessed or updated. A common alternative is to use a single start pointer for all the data structures belonging to an incarnation. See Figure 4. All the data structures associated with an incarnation can be accessed via a single structure pointer. Through this "meta" data structure, the individual tables can be reached via pointers. The equivalent code would look like:

pRoot = TranslateIncarnation(N);
pRoutingTable = pRoot->pRoutingTable;
// Determine route to destination.
// Forward packet to route.
pStatTable = pRoot->pStatTable;
// Update statistics.
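
The translation itself can be as simple as an array lookup. Here is a minimal sketch; the structure contents and the table size are assumptions made for illustration:

/* Hypothetical "meta" data structure holding everything an incarnation owns. */
typedef struct
{
    struct ROUTING_TABLE *pRoutingTable;       /* protocol-specific tables */
    struct STAT_TABLE    *pStatTable;
    /* ... buffer pool root, timer root, and so on ... */
} IncarnationRootBlock;

#define MAX_INCARNATIONS 1024                  /* sized for the target system */
static IncarnationRootBlock *IncarnationTable[MAX_INCARNATIONS];

IncarnationRootBlock *TranslateIncarnation(int n)
{
    if (n < 0 || n >= MAX_INCARNATIONS)
        return NULL;
    return IncarnationTable[n];                /* NULL if the incarnation is not active */
}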

Buffer and timer management

Communications protocols and systems make use of buffers and timers for their operation. Typically, buffers are used to manage data to and from a hardware port, while timers are required for the periodic updates and timeouts that the protocol has to handle. Consider the IP services router example. Since the VRs must function independently of each other, each VR would need its own buffer management and timer management strategy.

In a traditional router, there would be one or more global buffer pools, each composed of buffers of a specific size. The tasks would allocate one or more buffers from a pool, depending on the size of the message. If multiple buffers are needed, they would be chained (linked) together to hold the message. A VR would essentially do the same. There would be a set of buffer pools, one per VR. The configuration, status, and statistics used to maintain each buffer pool would take the form of data structures referenced and accessed in a manner similar to the translation-based approach described earlier. The buffer manipulation routines would themselves need to be reentrant and would take a "root pointer" (to the meta data structure discussed before) so they can reach the appropriate data structures for that VR's buffer pool.
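
A minimal sketch of such a per-VR allocation routine follows; the structure layout and field names are assumptions, and no locking is shown for simplicity:

/* Hypothetical per-incarnation buffer pool: a free list hung off the VR's
 * buffer pool root block. */
typedef struct BUFFER
{
    struct BUFFER *pNext;                  /* next buffer on the free list or chain */
    /* ... buffer payload ... */
} BUFFER;

typedef struct
{
    BUFFER        *pFreeList;              /* head of this VR's free list */
    unsigned long  allocCount;             /* VR-specific statistics */
} BufferPoolRootBlock;

BUFFER *AllocBuffer(BufferPoolRootBlock *pPoolRoot)
{
    BUFFER *pBuf = pPoolRoot->pFreeList;

    if (pBuf != NULL)
    {
        pPoolRoot->pFreeList = pBuf->pNext;  /* unlink from this VR's free list */
        pPoolRoot->allocCount++;             /* update per-VR statistics */
    }
    return pBuf;                             /* NULL if the pool is exhausted */
}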

It may be observed that propagating incarnations down to the buffer pools can lead to scaling issues when there are hundreds of incarnations. The alternative approach is to use a single global set of buffer pools shared by all VRs. All incarnations would use this same pool for allocation and de-allocation. While this may seem to buy us some flexibility and apparent simplicity, it does not provide full isolation between VRs, since multiple VRs end up contending for the same buffers. Also, VR-specific buffer status and statistics yield better control over managing the system, which is highly desirable.

In effect, buffer management is like any other reentrant library that needs to be used by individual incarnations. It needs to be structured to handle incarnation-based status and statistics, apart from providing reentrant code.

Timer management in protocol stacks often involves a timer task that manipulates a timer list, driven by a system tick, and notifies the appropriate protocol task when a timer expires. The timer list is commonly implemented as a linked list to which protocol tasks add entries as new timers are started. In an incarnation-based environment, the timer routines need to be modified to handle multiple incarnations. This is quite straightforward: the routines can take either an incarnation number as a parameter or a pointer to a "root" timer control block.
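
A hedged sketch of a per-incarnation timer start routine, again with assumed structure contents (the TimerRootBlock name matches the block allocated in Listing 2 below):

/* Hypothetical per-incarnation timer list: each VR keeps its own linked list
 * of pending timers under its timer root block. */
typedef struct TIMER
{
    struct TIMER  *pNext;
    unsigned long  ticksRemaining;         /* decremented on every system tick */
    void         (*expiryHandler)(void *); /* called when the timer fires */
    void          *pArg;
} TIMER;

typedef struct
{
    TIMER *pActiveList;                    /* head of this VR's timer list */
} TimerRootBlock;

void StartTimer(TimerRootBlock *pTimerRoot, TIMER *pTimer,
                unsigned long ticks, void (*handler)(void *), void *pArg)
{
    pTimer->ticksRemaining = ticks;
    pTimer->expiryHandler  = handler;
    pTimer->pArg           = pArg;

    pTimer->pNext           = pTimerRoot->pActiveList;   /* insert at the head */
    pTimerRoot->pActiveList = pTimer;
}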

System and sizing issues

How do incarnation-based sizing and initialization happen? One approach is to statically allocate data structures for the maximum number of incarnations the system can support. For buffer pools, this can also mean allocating and linking the buffers. The obvious problem with this approach is similar to the problem with generic static allocation. Depending on the system configuration and installation, not all incarnations need to be up at the same time. Moreover, in some ISP PoP environments, incarnations may need to be brought up dynamically. Allocating data structures and buffers to handle the worst-case scenario is not advisable.

Listing 2: Incarnation initialization code

IncarnationRootBlock *
InitializeIncarnation(void)
{
    IncarnationRootBlock *pIncarnationRoot;

    pIncarnationRoot = malloc(sizeof(IncarnationRootBlock));
    pIncarnationRoot->pBufferPoolRoot = malloc(sizeof(BufferPoolRootBlock));
    pIncarnationRoot->pTimerRoot = malloc(sizeof(TimerRootBlock));
    ...
    return (pIncarnationRoot);
}

The solution is to have an initialization scheme similar to that shown in Listing 2. This pIncarnationRoot pointer is stored in the translation table and an incarnation number is given to the appropriate calling routine for further processing. In this way, the data structures related to the incarnation are allocated only when the incarnation is required. The IncarnationRootBlock can have sections that are protocol-specific (for example, pointers to routing and statistics tables) and system-specific (for example, pointers to buffer and timer data structures).
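
Putting the pieces together, a hypothetical bring-up routine might allocate the incarnation on demand and register it in the translation table sketched earlier; the slot-search logic and names here are illustrative only:

/* Allocate an incarnation's data structures only when it is actually needed,
 * and register the root pointer so TranslateIncarnation() can find it. */
int CreateIncarnation(void)
{
    int n;

    for (n = 0; n < MAX_INCARNATIONS; n++)       /* find a free slot */
    {
        if (IncarnationTable[n] == NULL)
        {
            IncarnationTable[n] = InitializeIncarnation();
            return n;                            /* incarnation number for the caller */
        }
    }
    return -1;                                   /* no free incarnation slots */
}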

In the earlier discussion, we indicated that having a task per incarnation would create scaling problems: memory is required for each task's context, and context switches between tasks carry a performance cost. Another argument is that since the incarnations would need to run at the same priority (one VR cannot have precedence over another), a single task handling multiple incarnations is equivalent. The trade-off is complexity and some impact on performance due to the translation/indirection used to access incarnation-specific data structures.

Design for reentrancy

Reentrant code is very important in protocol stacks, since multiple incarnations are often required. Designing for reentrancy via incarnation-based data structures and libraries at the start of the development is much easier than trying to retrofit a non-reentrant design.

T. Sridhar is the director of engineering at Future Communications Software. He has worked in the communications software industry for over 13 years. He holds an MS degree in electrical and computer engineering from the University of Texas at Austin. He can be reached at sridhar@futsoft.com.

References

1. Ganssle, Jack. "Reentrancy," Embedded Systems Programming, April 2001, p. 177.

2. Ould-Brahim, Hamid, et al. "Network-based IP VPN Architectures Using Virtual Routers," IETF Internet-Draft.
