Making packet processing more efficient with a network-optimized multicore design: Part 1

Cristian F. Dumitrescu

February 4, 2010

The role of the operating system
It is standard practice to run the cores allocated to the control plane/application layer under the control of an operating system, as these tasks have no real-time constraints attached to them with regard to packet processing. In fact, the complexity of the processing to be applied and the need to reuse the existing code base make interaction with the OS a prerequisite.

On the other hand, there are strong reasons to discourage the use of an OS for the cores in charge of data plane processing. First of all, no user is present, so there is no need for an OS to provide services to the user or to restrict access to the hardware. One of the important OS roles is to regulate the user's access to hardware resources (e.g. device registers) through user space / kernel space partitioning.

Typically, the operating system allows the user application to access the hardware only through a standardized API (system calls) whose behavior cannot be modified by the user at run-time.

The user does not need to interact directly with the fast path, as packet forwarding takes place automatically, without any need for run-time input from the user.

The user might influence the run-time behavior of the fast path indirectly, by interacting with the control plane and thus triggering updates of the data structures shared between the fast path and the slow path. However, the code that updates these data structures is typically kernel code running on the control plane cores, which already has full access to the hardware.
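As an illustration only (the names fib_table, slow_path_update and fast_path_fib are invented for this sketch and are not part of any particular platform API), the C fragment below shows one common way such a shared data structure can be published: the slow path builds a new table off the critical path and installs it with a single atomic pointer swap, while the fast path merely dereferences the current pointer on each packet. Real systems typically add a reclamation scheme (e.g. an RCU-style grace period) before the old table is freed.

#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical forwarding table shared between the slow path and the fast path. */
struct fib_table {
    uint32_t num_routes;
    /* ... route entries ... */
};

/* Pointer to the table currently in use: written by the slow path,
   read by the fast path on every packet. */
static _Atomic(struct fib_table *) active_fib;

/* Slow path (control plane cores): build the new table off-line, then
   publish it with one atomic store so the fast path never observes a
   half-updated table. */
void slow_path_update(struct fib_table *new_table)
{
    struct fib_table *old =
        atomic_exchange_explicit(&active_fib, new_table, memory_order_release);
    /* In a real system, 'old' may only be freed after all data plane
       cores have stopped referencing it (e.g. after an RCU grace period). */
    (void)old;
}

/* Fast path (data plane cores): fetch the current table once per packet;
   no locks on the per-packet path. */
static inline const struct fib_table *fast_path_fib(void)
{
    return atomic_load_explicit(&active_fib, memory_order_acquire);
}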

Secondly, the main functions typically handled by an OS are not required:

1) Process management is not required, as there is only one task, which is very well defined: packet forwarding (a minimal run-to-completion loop is sketched after this list). Even if a programming model with several tasks synchronizing among themselves were envisioned, the cost of task scheduling in processor cycles would be prohibitive and would severely impact the packet budget, with no real value added in return.

2) Memory management is usually very simple, as it relies on pre-allocated buffer pools in which all buffers from the same pool have the same size (the sketch after this list illustrates such a pool). There is usually no need to support dynamic allocation/release of variable-size memory blocks, as implemented by the classical malloc/free mechanism.

3) File management is not required, as typically there is no file system.

4) Device management is usually done through low-level device API functions. The set of existing devices is fixed (network interfaces, accelerators) and small, so there is no need to discover peripherals at run-time or to support hot-pluggable devices.

As there is little commonality among the fast path devices, there is little practical gain in implementing a common device interface or a device file system.
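As a rough sketch of points 1) and 2) above, the following bare-metal style loop shows the single, well-defined forwarding task together with a pre-allocated pool of fixed-size buffers. The device functions rx_packet, process_packet and tx_packet are placeholders for whatever low-level device API the platform provides, not an actual API; the pool sizes are arbitrary.

#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 1024   /* number of fixed-size buffers in the pool */
#define BUF_SIZE  2048   /* bytes per buffer, enough for one frame */

/* Pre-allocated buffer pool: all buffers have the same size, so
   allocation and release reduce to pushing/popping a free list.
   No malloc/free on the per-packet path. */
static uint8_t  pool_mem[POOL_SIZE][BUF_SIZE];
static uint8_t *free_list[POOL_SIZE];
static size_t   free_top;

static void pool_init(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++)
        free_list[i] = pool_mem[i];
    free_top = POOL_SIZE;
}

static uint8_t *buf_alloc(void)            /* O(1), no OS involvement */
{
    return free_top ? free_list[--free_top] : NULL;
}

static void buf_free(uint8_t *buf)         /* O(1), no OS involvement */
{
    free_list[free_top++] = buf;
}

/* Placeholders for the platform's low-level device API (assumed names). */
int  rx_packet(uint8_t *buf, size_t len);      /* returns bytes received */
void process_packet(uint8_t *buf, int len);    /* lookup + header rewrite */
void tx_packet(uint8_t *buf, int len);

/* The single data plane task: a run-to-completion loop with no scheduler. */
void data_plane_main(void)
{
    pool_init();

    for (;;) {
        uint8_t *buf = buf_alloc();
        if (buf == NULL)
            continue;                       /* pool exhausted: drop */

        int len = rx_packet(buf, BUF_SIZE);
        if (len <= 0) {
            buf_free(buf);                  /* nothing received */
            continue;
        }

        process_packet(buf, len);
        tx_packet(buf, len);
        buf_free(buf);                      /* buffer returns to the pool */
    }
}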

Sometimes, out of pure convenience or due to the need to support legacy code, an OS might also be used for the data plane cores. In this case, a mechanism called para-partitioning can be used to create two separate partitions for the control plane and the data plane, respectively.

This mechanism requires firmware support to partition the resources of a single physical system into multiple logical systems while still maintaining a 1:1 mapping between the logical and the physical resources. Each partition boots its own OS which is aware only of the resources statically assigned to it by the firmware.
