Last month, the author introduced you to the basics of TCP/IP. In this second installment, he discusses the details of putting TCP/IP into a resource-constrained embedded system.
Embedded systems have inherited the programming practices used in larger systems, and network protocols, TCP/IP in particular, are no exception. As discussed in the first part of this article, the history of TCP/IP in embedded systems is one of adapting and modifying the original sources written at the University of California at Berkeley. The Berkeley stack is the basis for most of these ports, including most of the commercial TCP/IP stacks for embedded systems. Real-time and embedded systems, however, face many issues that are unique to them, and a straight port of the Berkeley stack is not the best implementation for their particular needs.
Most vendors have modified the Berkeley code over the years to improve the performance of the stack in embedded systems. Any ports or modifications of the original Berkeley sources should address the following issues. This rule applies to commercial canned stacks, as well as home-grown porting jobs. Certainly, if you’re purchasing a TCP/IP stack, you will want to verify that your vendor has taken the following things into account:
Buffer management. The TCP/IP mbuf buffer management should be able to use pre-allocated buffers rather than allocating them from the global heap at run time (via malloc).
Timers. The timers used in the protocols for connection management, timeouts, and retries should be managed by the RTOS. They should not be a separate implementation that will secretly steal bandwidth from the CPU or cause concurrency problems.
Latency. If an RTOS is present, it should not add latency. Interrupt-handling interfaces should be fast and deterministic, and the RTOS should not delay the interrupt processing required for the physical transmission and reception of a frame. Because handling a packet involves a large number of context switches and considerable CPU processing, an OS with minimal thread context switch time becomes all the more important.
Concurrency. All buffering mechanisms should have semaphore protection to allow higher performance potential in real-time systems. The first TCP/IP protocol implementations were on Unix systems and depended on the Unix kernel environment to avoid concurrency problems. Semaphore protection should also be available to the timers to reduce concurrency problems.
Minimized data copying. The TCP/IP implementation should minimize the amount of data copying. The data within each frame can be maintained in the same buffer so it doesn’t need to be copied and re-copied by the CPU at each stage of the protocol. The networking chip’s DMA places the packets directly in the managed buffer pool where the packet is passed up through the stack by manipulating pointers and not by copying data. Also, some vendors have extended the mbuf mechanism to allow the data to be shared between mbufs and mblocks where there are STREAMS protocols also present in the system.
Link layer multiplexing. Protocol implementation requires a framework with mechanisms for queueing and buffer management. Also, modern protocols require more flexible device driver interfaces and more flexible multiplexing. This is particularly true where serial point-to-point protocols such as PPP are now extended to support IP tunneling and Virtual Private Networks (VPN). The original Berkeley implementation isn’t sufficiently flexible to meet all of these needs. The better protocol stack implementations use a framework that allows the stack to be extended as new protocols and interfaces are developed. This can be accomplished by extending the basic Berkeley driver interface scheme, or the protocols can be rewritten to use a different framework.
CPU bandwidth. Each embedded system application has different requirements for its TCP/IP stack. For example, a TCP/IP stack in most Internet appliances probably would not be considered real time. Also, if the network is used for control and management functions, the hard bandwidth requirements will be fairly low. On the other hand, if the application is streaming video or voice, the faster packet rates would qualify the application as a real-time application.
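The buffer management and data copying points in the list above can be illustrated with a small sketch. It assumes a fixed pool of frame-sized buffers reserved at build time and strips a protocol header by advancing a pointer rather than copying the payload. All names here (pkt_buf, pool_get, pkt_pull, and the sizes) are hypothetical and are not the BSD mbuf API. A real stack would also guard the free list with a semaphore or by masking interrupts, as the concurrency item notes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the BSD mbuf API. */
#define POOL_BUFS   8      /* number of buffers reserved at build time */
#define BUF_SIZE    1536   /* room for one Ethernet frame */
#define ETH_HDR_LEN 14     /* destination MAC + source MAC + type field */

struct pkt_buf {
    struct pkt_buf *next;       /* free-list link */
    uint8_t *data;              /* start of the current layer's view */
    size_t   len;               /* valid bytes from data onward */
    uint8_t  storage[BUF_SIZE]; /* where the driver or DMA places the frame */
};

static struct pkt_buf pool[POOL_BUFS];  /* static storage, no malloc */
static struct pkt_buf *free_list;

/* Thread every buffer onto the free list once at start-up. */
void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < POOL_BUFS; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Constant-time allocation; returns NULL when the pool is exhausted.
 * NOTE: unprotected here; a real system must serialize access. */
struct pkt_buf *pool_get(void)
{
    struct pkt_buf *b = free_list;
    if (b != NULL) {
        free_list = b->next;
        b->next = NULL;
        b->data = b->storage;
        b->len = 0;
    }
    return b;
}

/* Return a buffer to the pool. */
void pool_put(struct pkt_buf *b)
{
    b->next = free_list;
    free_list = b;
}

/* Strip a header by moving the data pointer; no payload bytes are copied. */
int pkt_pull(struct pkt_buf *b, size_t hdr_len)
{
    if (hdr_len > b->len)
        return -1;
    b->data += hdr_len;
    b->len  -= hdr_len;
    return 0;
}
```

In this arrangement a receive path would call pool_get(), let the hardware fill storage, set len, and then have each layer call pkt_pull() with its own header length before handing the same buffer upward.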
Link layer interfaces and device drivers
As I’ve described, the OSI model shows an interface between the physical and data link layers. In actual implementation, this interface is implemented in several ways.
BSD 4.3. Most of the examples in this article show the BSD 4.3 type of structures and interfaces. The traditional Berkeley stack could multiplex between multiple interfaces if they were using a common IP stack. Originally, it could interface cleanly only with link layers that were compatible with Ethernet, and only among IP and its related protocols, such as ARP and RARP. Subsequently, some implementors hacked the BSD code to allow it to be used with serial interfaces such as SLIP (Serial Line Internet Protocol) or PPP. This generally meant a somewhat kludgy effort to make them look like Ethernet and, as such, was not particularly efficient. Each vendor of a TCP/IP stack for embedded systems has also put its own nuances in the interface mechanism.
The BSD 4.3-compatible stacks use the if_attach() function and the ifnet structure. This structure and its associated attachment mechanism were inherited by most ports of the Berkeley stack used in embedded systems. Typically, the network device driver initialization function is specific to a particular OS. When the device-specific initialization function is called, it first allocates space for its own internal data structures, often called a “softc” structure. This data structure generally includes space for the ifnet structure. The driver determines its own MAC address by reading it from the hardware. Then it fills in the fields in the ifnet structure. It sets the device-specific information such as the MTU, the MAC address, and the if_name and if_unit. It then fills in the function pointer fields with pointers to the driver’s interface functions. Once this structure is appropriately initialized, the initialization code calls the if_attach() function with a pointer to the ifnet structure as an argument.
The ifnet structure is illustrated in Table 1. The ifnet structure contains, among other things, a pointer to a list of address structures for each interface. This address structure, called ifaddr, contains the interface’s MAC address and the broadcast address.
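The attachment sequence described above might look roughly like the following sketch. The ifnet shown here is a heavily trimmed stand-in (the real structure has many more fields), if_attach() is a minimal stand-in that only links the interface onto a list, and the "en" driver, en_softc, and hw_read_mac() are hypothetical names invented for illustration.

```c
#include <stddef.h>
#include <string.h>

struct mbuf;        /* opaque here; defined by the stack */
struct sockaddr;    /* opaque here */

/* Trimmed stand-in for the BSD ifnet structure (assumption: real one is larger). */
struct ifnet {
    const char   *if_name;   /* driver name, e.g. "en" */
    short         if_unit;   /* unit number */
    short         if_mtu;    /* maximum transmission unit */
    int         (*if_init)(int unit);
    int         (*if_output)(struct ifnet *, struct mbuf *, struct sockaddr *);
    int         (*if_ioctl)(struct ifnet *, int cmd, void *data);
    struct ifnet *if_next;   /* next attached interface */
};

/* Per-device "softc" state with the ifnet embedded in it. */
struct en_softc {
    struct ifnet  sc_if;
    unsigned char sc_enaddr[6];   /* MAC address read from the hardware */
};

struct ifnet *ifnet_list;         /* head of the attached-interface list */

/* Stand-in for the stack's if_attach(): append to the interface list. */
void if_attach(struct ifnet *ifp)
{
    struct ifnet **p = &ifnet_list;
    while (*p != NULL)
        p = &(*p)->if_next;
    ifp->if_next = NULL;
    *p = ifp;
}

/* Hypothetical hardware access: a real driver reads the chip or EEPROM. */
static void hw_read_mac(unsigned char mac[6])
{
    static const unsigned char fake[6] = {0x02, 0x00, 0x00, 0x00, 0x00, 0x01};
    memcpy(mac, fake, sizeof fake);
}

/* Hypothetical driver entry points; bodies omitted in this sketch. */
static int en_init(int unit) { (void)unit; return 0; }
static int en_output(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst)
{ (void)ifp; (void)m; (void)dst; return 0; }
static int en_ioctl(struct ifnet *ifp, int cmd, void *data)
{ (void)ifp; (void)cmd; (void)data; return 0; }

struct en_softc en_softc0;        /* pre-allocated softc for unit 0 */

/* Device-specific initialization: fill in the ifnet, then attach it. */
void en_attach(void)
{
    struct en_softc *sc = &en_softc0;
    struct ifnet *ifp = &sc->sc_if;

    hw_read_mac(sc->sc_enaddr);   /* learn our own MAC address */
    ifp->if_name   = "en";
    ifp->if_unit   = 0;
    ifp->if_mtu    = 1500;        /* Ethernet payload size */
    ifp->if_init   = en_init;
    ifp->if_output = en_output;
    ifp->if_ioctl  = en_ioctl;
    if_attach(ifp);               /* hand the interface to the stack */
}
```

Embedding the ifnet inside the softc lets the driver recover its private state from the ifnet pointer the stack passes back on every call.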
Data link provider interface. The data link provider interface (DLPI) is found in most implementations of STREAMS. DLPI is not specific to TCP/IP and can be generalized for almost any protocol. It does, however, require that the protocols be implemented as STREAMS modules and the link layer be implemented as a STREAMS driver. It also requires 802.2-type framing to properly multiplex between the link layer and the network layer of the protocols bound to the link layer.
The legacy BSD interface between the link layer and the network layer doesn’t incorporate any mechanism for interfacing specifically to connection-oriented or connectionless link layers. DLPI, on the other hand, supports both connectionless and connection-oriented link layer binding. The connection-oriented service is available if the service access point (SAP) is capable of supporting connection-oriented transmission. Also, DLPI allows the data link SAP (DLSAP) to identify itself as a promiscuous SAP, that is, one that can grab all the packets on the net, not just those directed to it. See the sidebar called “Promiscuity” for more information.
Promiscuity. Most network interfaces can be set to promiscuous mode so that all packets are received. For example, Ethernet interfaces can be set to receive all packets on the wire, including broadcast and multicast packets, as well as those that have someone else’s MAC address in the destination field. It is interesting to note that most protocols would work even if the interfaces were accidentally set for promiscuous mode. Packet filtering then takes place at higher levels of the stack, which is very inefficient but probably functional. Some software above the link layer may want to use promiscuous mode by telling the interface to receive everything and send it all up. An example of this is a software protocol analyzer such as the tcpdump or snoop utility found in Unix systems. At the transport layer, the API provides a promiscuous socket. Generally, you can’t stop processing and sending other information merely because the user wants to observe the network traffic. To provide the capability of receiving promiscuously while preserving other network functionality, the data link interface must support some sort of promiscuous binding. In other words, it has to provide a mechanism for the layer above to request all packets.
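As a rough illustration of the filtering decision discussed above, the sketch below separates "is this frame addressed to us?" from "should this frame be passed up at all?". The rx_filter structure and both function names are hypothetical. Accepting every group address is a simplification; real interfaces usually filter multicast by group membership.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical per-interface receive settings. */
struct rx_filter {
    uint8_t own_mac[MAC_LEN];
    bool    promiscuous;  /* set when an upper layer has asked for everything */
};

/* True if the destination is our own address or a group address.
 * Group (multicast) addresses have the low bit of the first octet set;
 * broadcast, ff:ff:ff:ff:ff:ff, is the all-ones group address. */
bool addressed_to_us(const struct rx_filter *f, const uint8_t *dst)
{
    if ((dst[0] & 0x01) != 0)
        return true;
    return memcmp(dst, f->own_mac, MAC_LEN) == 0;
}

/* True if the frame should be passed up from the link layer at all.
 * dst points at the destination MAC at the start of the frame. */
bool rx_accept(const struct rx_filter *f, const uint8_t *dst)
{
    return f->promiscuous || addressed_to_us(f, dst);
}
```

With the two checks kept apart, a driver could hand every accepted frame to a promiscuous binding such as a protocol analyzer while delivering to the ordinary protocol bindings only the frames for which addressed_to_us() holds, so normal traffic keeps flowing.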
Fundamental to DLPI is the concept of the SAP. In Figure 1 you can see how the SAP identifiers are incorporated in the LLC framing. The network layer above the interface is a data link service user (DLSU) and the link layer driver is a DLSAP. DLPI uses a set of request primitives passed as messages from the DLSU to the DLSAP. In response, the DLSAP passes a set of acknowledgment primitives back to the DLSU.
Figure 2 shows the relationship between the SAP and the DLSU. DLPI manages the state of the relationship between the DLSU and the DLSAP with a state machine. Table 2 lists some of the common primitives likely to be used for TCP/IP connectionless link layer service and their responses.
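The request and acknowledgment exchange and its state machine can be sketched as follows for the connectionless case. The primitive and state names are borrowed from the DLPI specification, but the numeric values are local to this sketch, the intermediate "pending" states are left out, and dlpi_step() and DL_NO_ACK are hypothetical names used only to keep the example small.

```c
/* States of the DLSU-to-provider relationship (pending states omitted). */
enum dl_state { DL_UNATTACHED, DL_UNBOUND, DL_IDLE };

/* Request primitives sent down by the DLSU. */
enum dl_prim {
    DL_ATTACH_REQ, DL_DETACH_REQ, DL_BIND_REQ, DL_UNBIND_REQ, DL_UNITDATA_REQ
};

/* Responses passed back up. DL_NO_ACK is an invention of this sketch to
 * represent a data request that succeeds without an acknowledgment. */
enum dl_ack { DL_OK_ACK, DL_BIND_ACK, DL_ERROR_ACK, DL_NO_ACK };

/* Apply one request to the current state and return the response.
 * Any request that is not valid in the current state is rejected. */
enum dl_ack dlpi_step(enum dl_state *st, enum dl_prim req)
{
    switch (*st) {
    case DL_UNATTACHED:
        if (req == DL_ATTACH_REQ) { *st = DL_UNBOUND; return DL_OK_ACK; }
        break;
    case DL_UNBOUND:
        if (req == DL_BIND_REQ)   { *st = DL_IDLE; return DL_BIND_ACK; }
        if (req == DL_DETACH_REQ) { *st = DL_UNATTACHED; return DL_OK_ACK; }
        break;
    case DL_IDLE:
        if (req == DL_UNITDATA_REQ) return DL_NO_ACK;  /* data may flow */
        if (req == DL_UNBIND_REQ) { *st = DL_UNBOUND; return DL_OK_ACK; }
        break;
    }
    return DL_ERROR_ACK;  /* request out of sequence; state unchanged */
}
```

The point of the state machine is visible in the last line: a stack that tries to send data before it has attached and bound gets an error instead of undefined behavior in the driver.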
Traditionally, various attempts have been made to allow drivers to support simultaneous interfaces to both Berkeley and STREAMS-type stacks. This was done with a layer of glue code that would copy the messages from STREAMS mblocks to Berkeley-type mbufs. A better mechanism should allow the actual data in the frame to be shared between the mblock and mbuf header structures.
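One way to picture such sharing is a reference-counted data area with two lightweight header types pointing into it. This is only a sketch under that assumption, not any vendor's actual mechanism. The header field names m_data, m_len, b_rptr, and b_wptr follow the usual mbuf and mblk_t conventions; everything else (shared_data, bsd_view, str_view, and the functions) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One frame's worth of data, owned jointly by every header that refers to it. */
struct shared_data {
    uint8_t *base;
    size_t   size;
    unsigned refcnt;
    void   (*release)(struct shared_data *);  /* called when refcnt reaches 0 */
};

/* Berkeley-style view of the data. */
struct bsd_view { struct shared_data *sd; uint8_t *m_data; size_t m_len; };

/* STREAMS-style view of the same data. */
struct str_view { struct shared_data *sd; uint8_t *b_rptr; uint8_t *b_wptr; };

/* Point a Berkeley-style header at the data and take a reference. */
void bsd_attach(struct bsd_view *v, struct shared_data *sd, size_t len)
{
    sd->refcnt++;
    v->sd = sd;
    v->m_data = sd->base;
    v->m_len = len;
}

/* Point a STREAMS-style header at the data and take a reference. */
void str_attach(struct str_view *v, struct shared_data *sd, size_t len)
{
    sd->refcnt++;
    v->sd = sd;
    v->b_rptr = sd->base;
    v->b_wptr = sd->base + len;
}

/* Drop one reference; the data is released only when no header needs it.
 * NOTE: the count is unprotected here; a real system must serialize access. */
void shared_drop(struct shared_data *sd)
{
    if (--sd->refcnt == 0 && sd->release != NULL)
        sd->release(sd);
}

/* Example release hook for demonstration: just counts how often it runs. */
unsigned release_count;
void count_release(struct shared_data *sd) { (void)sd; release_count++; }
```

Both headers see the same bytes, so passing a frame from one style of stack to the other costs a header and a reference count rather than a copy of the payload.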
A good extended multiplexing interface mechanism gives you a DLPI-type binding mechanism for binding a stack to an arbitrary link layer interface. It should do this while providing as much backward compatibility as possible with Berkeley-style implementations. As in DLPI, this advanced mechanism should allow multiple stacks to be bound to the same link level interface. It provides a generic mechanism for address resolution between MAC addresses and protocol addresses. This address resolution capability should still support the BSD ARP protocol previously discussed under the section about BSD 4.3. As with DLPI, the advanced interface has a state machine to keep track of the relationship between the stack and the link layer. It also provides a means of enabling multicast addresses on an interface. This type of mechanism was absent from BSD 4.3 and only partially present in BSD 4.4.
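The idea of binding several stacks to one link level interface can be sketched as a small table keyed on the frame's protocol type. This assumes Ethernet framing with a 14-byte header and a 16-bit type field; link_if, link_bind(), link_input(), and the counting handlers are hypothetical names, and the address resolution, state tracking, and multicast features mentioned above are left out.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BINDINGS 4
#define ETH_HDR_LEN  14

typedef void (*rx_handler)(const uint8_t *payload, size_t len);

/* One protocol stack bound to the interface, keyed by frame type. */
struct binding { uint16_t type; rx_handler input; };

struct link_if {
    struct binding bound[MAX_BINDINGS];
    size_t nbound;
};

/* Bind a stack to the interface; fails if the table is full or the
 * frame type is already claimed by another stack. */
int link_bind(struct link_if *lif, uint16_t type, rx_handler input)
{
    if (lif->nbound == MAX_BINDINGS)
        return -1;
    for (size_t i = 0; i < lif->nbound; i++)
        if (lif->bound[i].type == type)
            return -1;
    lif->bound[lif->nbound].type = type;
    lif->bound[lif->nbound].input = input;
    lif->nbound++;
    return 0;
}

/* Deliver a received frame to whichever stack bound its type field.
 * Returns -1 for a runt frame or a type nobody has bound. */
int link_input(const struct link_if *lif, const uint8_t *frame, size_t len)
{
    if (len < ETH_HDR_LEN)
        return -1;
    uint16_t type = (uint16_t)((frame[12] << 8) | frame[13]);
    for (size_t i = 0; i < lif->nbound; i++) {
        if (lif->bound[i].type == type) {
            lif->bound[i].input(frame + ETH_HDR_LEN, len - ETH_HDR_LEN);
            return 0;
        }
    }
    return -1;
}

/* Example handlers for demonstration: they only count deliveries. */
unsigned ip_frames, arp_frames;
void count_ip(const uint8_t *p, size_t n)  { (void)p; (void)n; ip_frames++; }
void count_arp(const uint8_t *p, size_t n) { (void)p; (void)n; arp_frames++; }
```

A stack would call link_bind() once per protocol, for example with type 0x0800 for IP and 0x0806 for ARP, and the driver's receive path would call link_input() for each frame.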
Hosting TCP/IP in your embedded system
Since the BSD stack has been available in source code form for many years, most people implementing TCP/IP for embedded systems have used it as a base. Hardly anyone implements his or her own protocol suite from scratch. You’ll have a number of fundamental choices if you want to put TCP/IP in a product for the first time, whether you want to use a commercial implementation of TCP/IP or do your own port.
You may want to look at a number of factors before you decide which path to take for incorporating TCP/IP. You’ll want to ask yourself basic questions about the connectivity requirements in your design. For example, how sensitive is your project to unit manufacturing cost? Ask about the future of your design. What is its reuse potential? Is the project to be a platform for launching future projects? Following are a few broad categories of products used in connected embedded systems. Each of these categories suggests a different direction for your implementation of TCP/IP in your embedded design.
1. The embedded product is based on a legacy system. There is only a limited need for remote access. So far, remote access has only been available through a serial port. TCP/IP allows access to this serial port. Potential increase in manufacturing cost is not the determining factor when adding TCP/IP.
2. The product should be relatively easy to maintain. A network connection is needed for remote management of the product.
3. The networking connection is required for data capture and analysis.
4. The product is a router, gateway, switch, broadband modem, or a similar product in which networking is a fundamental element.
5. The product is a consumer device with a graphical display, such as a personal digital assistant. The ability to browse the web is a fundamental part.
6. The device is associated with a measurement and control system. An easy method must exist for reaching the device from a PC with a web browser. An embedded web server is a fundamental requirement.
Below I list some of the choices you have today for incorporating TCP/IP in your product.
Total hardware implementation
Roll your own stack with no RTOS
All in one