A decision-tree approach to picking the right embedded multicore software architecture
Decision 3: Determine the Control Plane and Data Plane Model
If we go down the SMP side of the decision tree, we must choose whether our SMP configuration will be “data plane” or “control plane” focused.
Data plane configurations are throughput intensive (e.g. packets per second) and usually need a lightweight or real-time operating system, or another lightweight programming model, to meet the throughput requirements on the data plane side.
For performance-sensitive applications where throughput is important, one approach that is gaining popularity in the multicore space is “user space” application development: a framework of Linux user-space drivers that allows customers to develop high-performance solutions (Figure 3).
Its high performance stems from doing I/O that bypasses the Linux kernel, so no system calls are needed. Application developers with their own software often like this model. Another advantage is that keeping application software out of the kernel avoids GPL license contamination.
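As a rough illustration of the polling style this model encourages, the sketch below mimics a receive ring that an application drains directly instead of asking the kernel for each packet. It is a plain Python model of the index arithmetic only; the `RxRing` class and its methods are hypothetical names, and in a real user-space driver the ring would live in memory shared with the device rather than in a Python list.

```python
class RxRing:
    """Hypothetical fixed-size receive ring: a device fills it, the application polls it."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0  # next slot the producer (device) will fill
        self.tail = 0  # next slot the consumer (application) will read

    def produce(self, packet):
        """Model of the device writing a packet; returns False when the ring is full."""
        nxt = (self.head + 1) % self.size
        if nxt == self.tail:
            return False  # one slot is kept empty to tell "full" from "empty"
        self.slots[self.head] = packet
        self.head = nxt
        return True

    def poll(self, max_burst):
        """Drain up to max_burst packets without blocking or calling into a kernel."""
        burst = []
        while self.tail != self.head and len(burst) < max_burst:
            burst.append(self.slots[self.tail])
            self.slots[self.tail] = None
            self.tail = (self.tail + 1) % self.size
        return burst


# Usage: the application loop simply keeps polling for bursts of packets.
demo_ring = RxRing(8)
for n in range(5):
    demo_ring.produce(f"pkt{n}")
print(demo_ring.poll(4))
print(demo_ring.poll(4))
```

Polling in bursts trades some idle CPU cycles for the absence of per-packet system calls and interrupts, which is the trade this model makes.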
Decisions 4 & 5: Choose the type of OS needed for the Control Plane and Data Plane
Data plane processing, in many cases, does not require an operating system. There is typically no need to provide services to a user or to restrict access to the underlying hardware through a limited set of APIs.
In addition, fast path processing does not require direct intervention by the user, because packet processing is done automatically.
Many of the other functions typically handled by an OS are also unnecessary: process management (there is usually only one task per core), memory management (pre-allocated buffers are used), file management (there is no file system), and device management (low-level access functions and APIs are used). A data plane OS is used to support legacy code or to do some basic scheduling when the need arises. Choose a simple run-to-completion model, or an RTOS if necessary.
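A run-to-completion loop over pre-allocated buffers can be sketched as follows. This is a minimal Python model under stated assumptions, not a data plane implementation: `BufferPool`, `run_to_completion`, and the `process` callback are hypothetical names, and the buffer count and size are arbitrary.

```python
class BufferPool:
    """Fixed set of buffers allocated once at start-up, so the loop never allocates."""

    def __init__(self, count, size):
        self._free = [bytearray(size) for _ in range(count)]

    def get(self):
        return self._free.pop() if self._free else None

    def put(self, buf):
        self._free.append(buf)


def run_to_completion(rx_packets, pool, process):
    """Handle each packet from start to finish before looking at the next one."""
    results = []
    dropped = 0
    for payload in rx_packets:
        buf = pool.get()
        if buf is None:
            dropped += 1  # no free buffer: drop rather than allocate
            continue
        n = min(len(payload), len(buf))  # truncate so the buffer is never resized
        buf[:n] = payload[:n]
        results.append(process(memoryview(buf)[:n]))
        pool.put(buf)  # buffer goes back to the pool once the packet is done
    return results, dropped


# Usage: one such loop per core, with a trivial "processing" step (a byte sum).
demo_pool = BufferPool(count=4, size=64)
print(run_to_completion([b"\x01\x02", b"\x03"], demo_pool, lambda view: sum(view)))
```

Because each packet is finished before the next is taken, there is no context switching and no scheduler, which is why this model needs so little from an OS.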
Multicore applications allocated to the control plane layer commonly run under the control of an operating system. These applications typically do not have real-time latency or throughput constraints as far as packet processing is concerned.
The complex processing required on the control plane, together with the need to reuse existing code bases, makes an OS a prerequisite. Linux is a common choice of operating system for control plane processing because its support for SMP processing has improved.
Some of the improvements include an adjustment to the way the kernel supports file systems, a number of routing and device-handling optimizations, removal of the Big Kernel Lock (BKL), which should increase Linux performance on larger SMP-based systems, the ability to throttle input and output, improved power management, and upgrades to the CPU scheduler.
Decision 6: Determine the Type of Acceleration Needed
Multicore network acceleration is necessary for packet processing, because conventional TCP/IP stacks were not designed to work well on multicore systems. Most network packet processing protocols can be broken down into two paths.
* Stateless path, also known as the data path, requires quick and efficient switching/routing of packets. This can be broken down into packet identification (classification) and forwarding.
* Stateful path, also known as the control path, requires more processing and has more inherent latency than the data path. The stateful control path requires 90% of the code and is used 10% of the time. The stateless data path requires just 10% of the code and is used 90% of the time (Figure 4).

Fast Path technology is used to accelerate the 10% of the code in the stateless path to increase packet processing performance.
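A back-of-the-envelope calculation suggests why accelerating such a small share of the code can pay off. The sketch below assumes that “used 90% of the time” means 90% of packets take the stateless path. The cycle counts are invented purely for illustration and are not measurements of any platform.

```python
def average_cycles(fast_percent, fast_cycles, slow_cycles):
    """Mean per-packet cost when fast_percent of packets take the fast path."""
    slow_percent = 100 - fast_percent
    return (fast_percent * fast_cycles + slow_percent * slow_cycles) / 100


# Hypothetical costs, chosen only to make the arithmetic easy to follow.
FULL_STACK_CYCLES = 1000  # assumed per-packet cost through the full stack
FAST_PATH_CYCLES = 100    # assumed per-packet cost through an optimized fast path

# Without a fast path, every packet pays the full-stack cost.
baseline = average_cycles(0, FAST_PATH_CYCLES, FULL_STACK_CYCLES)
# With a fast path, 90% of packets pay the lower cost.
accelerated = average_cycles(90, FAST_PATH_CYCLES, FULL_STACK_CYCLES)

print(baseline, accelerated, round(baseline / accelerated, 2))
# prints: 1000.0 190.0 5.26
```

Under these assumed numbers the average cost falls from 1,000 to 190 cycles per packet, and just over half of what remains comes from the 10% of packets that still take the stateful path.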
Application Specific Fast Path (ASF) is a software-based solution that stores flows requiring simple, deterministic processing in a cache. ASF recognizes cached flows and processes such packets in a separate, highly optimized context (Figure 5).
ASF accelerates data throughput for networking devices. Implemented in software, it provides an optimized data path that is customized per platform to achieve higher throughput for specific applications.
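The general flow-cache idea can be sketched as below. This is not the ASF API: `FlowKey`, `FlowCache`, and `full_stack` are hypothetical names, and the rule that port 22 traffic is never cached is an arbitrary stand-in for "flows that need full processing".

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


class FlowCache:
    """Hypothetical fast path: cached flows skip the full stack entirely."""

    def __init__(self, slow_path):
        self._actions = {}           # FlowKey -> callable applied to each packet
        self._slow_path = slow_path  # full (stateful) processing, used on a miss
        self.hits = 0
        self.misses = 0

    def handle(self, key, payload):
        action = self._actions.get(key)
        if action is not None:
            self.hits += 1
            return action(payload)  # fast path: one lookup, one simple action
        self.misses += 1
        result, cacheable_action = self._slow_path(key, payload)
        if cacheable_action is not None:
            self._actions[key] = cacheable_action  # later packets of this flow hit
        return result


def full_stack(key, payload):
    """Stand-in for the stateful path: decide once, optionally return a cacheable action."""
    if key.dst_port == 22:
        return ("to_control_plane", payload), None  # assumed to need full processing
    out_port = "eth1"  # assumed routing decision
    return (out_port, payload), (lambda p: (out_port, p))


demo_cache = FlowCache(full_stack)
web_flow = FlowKey("10.0.0.1", "10.0.0.2", 1234, 80, 6)
demo_cache.handle(web_flow, b"first")   # miss: goes through full_stack
demo_cache.handle(web_flow, b"second")  # hit: served from the cache
print(demo_cache.hits, demo_cache.misses)
```

Only the first packet of a cacheable flow pays for the full stack; every later packet costs a single dictionary lookup in this model.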
ASF leverages functionality provided by hardware, such as hashing, checksum calculation, cryptography, classification, and scheduling, to provide higher throughput. The focus of ASF is to accelerate the processing of many relevant applications. Some examples include:
* IPv4 Forwarding: Create an ASF forwarding cache. When packets match entries in the forwarding cache, they are forwarded at the driver level without going through the Linux networking stack.
* Firewall + NAT: Maintain a 5-tuple-based session table. When packets match the session table, they can be scanned for vulnerabilities, have address translation performed, and be forwarded.
* IPsec: Maintain a database of associations from flows to Security Associations (SAs). When packets match the database, they are encrypted or decrypted and routed appropriately.
* IP Termination: Accelerate the pre-configured locally terminated or originated flows. It can work in conjunction with PMAL, a user-space zero-copy mechanism.
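To make the Firewall + NAT case concrete, here is a minimal sketch of a 5-tuple session table that performs source address translation on a match. The `Packet` and `NatSessionTable` names, the field layout, and the addresses are all assumptions for illustration; vulnerability scanning and the real ASF data structures are not modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    payload: bytes = b""

    def five_tuple(self):
        return (self.src_ip, self.src_port, self.dst_ip, self.dst_port, self.protocol)


class NatSessionTable:
    """Hypothetical session table keyed by 5-tuple, holding the translated source."""

    def __init__(self):
        self._sessions = {}

    def add_session(self, pkt, public_ip, public_port):
        # In a real system the slow path would install this after policy checks.
        self._sessions[pkt.five_tuple()] = (public_ip, public_port)

    def fast_path(self, pkt):
        entry = self._sessions.get(pkt.five_tuple())
        if entry is None:
            return None  # no session: hand the packet to the normal stack
        public_ip, public_port = entry
        # Rewrite only the source address and port; everything else is untouched.
        return replace(pkt, src_ip=public_ip, src_port=public_port)


demo_table = NatSessionTable()
inside = Packet("192.168.1.10", 40000, "203.0.113.5", 80, "tcp", b"GET /")
demo_table.add_session(inside, "198.51.100.1", 50000)
print(demo_table.fast_path(inside))
```

A packet whose 5-tuple has no session returns `None` here, which stands for falling back to the stateful path.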

