In order for Linux to be a true alternative to traditional real-time operating systems, its lack of determinism must be dealt with. Real-time extensions have recently made this an easy problem to solve.
While market analysts and others focusing on the business side of computers have become aware of the growing importance of Linux, a secondary market exists with potentially just as much impact: real-time extensions for Linux. Indeed, engineers designing embedded systems have come to embrace Linux as a genuine alternative to more traditional real-time operating systems. This two-part series examines what's involved in working with real-time Linux and is based on the experiences of someone who has devoted the past year to writing data-acquisition drivers that run under that environment.
A bit of history
More than a decade ago, I was developing software for an ultrasound scanner. Back then, in the era of the 80286, MS-DOS was almost the only OS suitable for embedded PC development. It offered everything we needed, as well as being clear and straightforward to work with. Indeed, the only limits designers faced came from the PC hardware. Even today, DOS executes any interrupt immediately, and nothing can interrupt the code. The program's timing was sufficiently precise, and all system functions adapted well enough to use on an embedded platform.
At that time, real-time systems were quite simple and typically included one main loop, as shown in Listing 1. This simplicity, however, also brought along a few problems. Unavailable were the OS services we take for granted today, such as networking, database access, file-system operations, graphics, and running the user interface. Writing any of those services from scratch, as well as supporting them for at least a couple of devices, requires enormous engineering effort. In addition, systems were totally unscalable, and managers accepted their use only because of the high cost of an off-the-shelf PC at the time. Despite these drawbacks, PC-based systems addressed the main requirement of a hard real-time system: guaranteed timing deadlines that the system can't miss under any circumstances.
Then Windows came along to ease the burden on the user who didn't want to work at the command line. However, its new structure and scheduling mechanism added a large degree of uncertainty as to when a given program might run. Thus, in response to the needs of embedded programmers, the number of dedicated real-time OSes mushroomed. Many of them grew from the simple main-loop concept, but they unfortunately inherited the limitations of their ancestors. Other full-service RTOSes are well-crafted and thoroughly thought out, but to this day they remain somewhat limited with regard to what applications and tools are available to developers. Certainly, one of these single-source solutions could, at times, prove advantageous for completing a project, but in many cases you must write a significant part of the project's individual components yourself without the support of a large developer community.
In contrast, working with a general-purpose OS significantly reduces development and deployment costs as well as time to market. You can draw from a wide variety of off-the-shelf tools, services, example programs, and complete applications. This situation is especially true for a Unix-like OS, and open-source options such as Linux have proven especially attractive. Indeed, many established vendors of real-time software are migrating rapidly toward embedded Linux, examples being QNX and Lynx Real-Time Systems; Lynx even went as far as changing its name to LynuxWorks. In addition, newer players, such as RedHat, MontaVista, TimeSys, and Lineo provide tools and OS code for embedded real-time applications. Finally, even the establishment is giving credibility to the movement as IBM, HP, Motorola, and 3Com have aligned themselves behind the growing Embedded Linux Consortium.
Meanwhile, it's now possible to shrink an image of embedded Linux to fit within low-end targets as small as 4KB (according to Michael Tiemann, chief technology officer at RedHat). In addition, Linus Torvalds, who wrote the original Linux code but who is now working with mobile-processor maker Transmeta, is adding power management and a compressed file system to embedded Linux to help make it even more suitable for battery-powered mobile systems where programs must squeeze into a tiny footprint. These changes make Linux an increasingly appealing choice for designers selecting an embedded OS.
Real-time Linux architecture
Now consider the technical reasons behind the growth of real-time enhancements for Linux. The key fact is this: design decisions that are brilliant for a general-purpose OS are lethal for a real-time one.
A general-purpose OS can't operate in real time because its designers must achieve good performance for multiple applications running at the same time, but without performance optimized for any particular one of them. Further, major OS advances such as virtual memory, large caches, hardware-request reordering, and optimization hinder, rather than help, real-time performance. For multiprocess systems, minimizing context-switching time is important. And while coarse-grained schedulers move towards that goal, they make it more difficult, if not impossible, to run a time-critical process on time.
Let's look at OS developments from the opposite perspective. One way to improve real-time performance is to add extra preemption points where the OS can stop execution of one process and give time to a critical one. However, this approach decreases overall performance in a multiprocessing system; designers tune general-purpose OSes for best overall average performance, making worst-case behavior non-deterministic.
The solution to the problem comes with an understanding of the dissimilar nature of these two major classes of operating environments. By decoupling the real-time part of an OS from the general-purpose kernel, it's possible to optimize the real-time part separately to meet timing deadlines while allowing the rest of the system to show the best-possible performance. This approach is exactly what the creators of RTLinux (www.rtlinux.org) and RTAI (www.rtai.org) did when they developed their real-time extensions. Both of these products are available at no cost under a general public license, meaning open source and free. Although these two implementations of real-time Linux differ somewhat (see sidebar, “Subtle differences in real-time implementations: elegance vs. practicality”), they both operate in a similar fashion. For consistency, the examples in this article use the RTLinux API.
Subtle differences in real-time implementations: elegance vs. practicality
Prof. Paolo Mantegazza started the RTAI project based on Victor Yodaiken's RTLinux v. 1. Since then, RTLinux and RTAI have gone through long development paths on their own. Despite the fact that they're not API-compatible, their functionalities are very similar. All key primitives and services exist in both packages. Both offer:
In my opinion, RTAI provides a more practical API while RTLinux is more elegant. On the other hand, RTAI is more elegant in how it integrates into the Linux kernel. The RTAI team makes a constant effort to add features that people ask for, and thus its API has grown to become reasonably extensive. For example, RTAI includes clock (8254 and APIC) calibration, dynamic memory management for realtime tasks, LXRT (Linux Extension for Real Time) to bring soft/hard real-time capabilities into user space, remote procedure calls, and mailboxes.
The RTLinux team aims to keep their real-time Linux extensions as predictable as possible, adding only features that won't hurt designs and compatibility in the future. In short, the RTLinux API is more consistent, but many practitioners prefer to use RTAI. Thanks to ongoing competition, almost everything you can accomplish in one package you can do in the other. This same competition also encourages dramatic improvements in both products. The latest versions, RTLinux 3.0 and RTAI 1.6/24.1.3, are excellent mature products, and their authors have done everything possible to smooth out the learning curve.
It's fair to say that VenturCom's RTX and Radisys' (now TenAsys) INtime for Windows NT are also based on the concept of splitting real-time and nonreal-time tasks, and they do show excellent results. However, they are closed-source and relatively expensive in terms of tools and royalties.
In both cases, the real-time kernel inserts a very thin layer, perhaps just a hundred lines of code, between the interrupt-control hardware and the Linux kernel (Figure 1). When Linux issues a request to enable or disable an interrupt, the real-time kernel sees the request first and thus controls all interrupt vectors from the start. Instead of dealing with actual interrupt-control hardware, however, the real-time kernel writes the request into an internal data structure and returns control to Linux.
Thus Linux is completely isolated from the interrupt-control hardware. Instead, the real-time kernel emulates that hardware with a virtual machine layer. Now any incoming interrupt first invokes a routine in the real-time kernel, which checks whether a real-time handler is registered for this interrupt. If it finds one, it passes control to that handler. If no handler is registered or if it shares this incoming interrupt with the Linux kernel, the real-time kernel invokes nonreal-time Linux handlers that can also use system resources.
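The interrupt emulation just described can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not RTLinux source code: the names virt_pic, soft_cli(), soft_sti(), and hardware_irq() are hypothetical, the two demo handlers exist only so the behavior can be observed, and the policy of holding a Linux interrupt as "pending" until Linux re-enables interrupts is a simplification of how the real layer behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_IRQS 16

typedef void (*irq_handler_t)(int irq);

/* State the thin layer keeps instead of touching the interrupt hardware.
 * All names here are hypothetical. */
struct virt_pic {
    int           linux_irqs_enabled;       /* software copy of Linux's enable flag */
    uint32_t      pending;                  /* IRQs held back while Linux has them disabled */
    irq_handler_t rt_handler[NUM_IRQS];     /* hard real-time handlers */
    irq_handler_t linux_handler[NUM_IRQS];  /* ordinary Linux handlers */
};

/* Linux asks to disable interrupts: only the software flag changes. */
void soft_cli(struct virt_pic *pic)
{
    assert(pic != NULL);
    pic->linux_irqs_enabled = 0;
}

/* Linux asks to enable interrupts: set the flag, then replay held IRQs. */
void soft_sti(struct virt_pic *pic)
{
    assert(pic != NULL);
    pic->linux_irqs_enabled = 1;
    for (int irq = 0; irq < NUM_IRQS; irq++) {
        uint32_t bit = (uint32_t)1 << irq;
        if (pic->pending & bit) {
            pic->pending &= ~bit;
            if (pic->linux_handler[irq])
                pic->linux_handler[irq](irq);
        }
    }
}

/* Entry point for every real interrupt. */
void hardware_irq(struct virt_pic *pic, int irq)
{
    assert(pic != NULL);
    if (irq < 0 || irq >= NUM_IRQS)
        return;

    /* A real-time handler always runs at once, whatever Linux requested. */
    if (pic->rt_handler[irq])
        pic->rt_handler[irq](irq);

    /* Linux sees the interrupt only when it has (virtually) enabled them. */
    if (pic->linux_handler[irq]) {
        if (pic->linux_irqs_enabled)
            pic->linux_handler[irq](irq);
        else
            pic->pending |= (uint32_t)1 << irq;
    }
}

/* Demo handlers that only count invocations. */
int rt_calls;
int linux_calls;
void demo_rt_handler(int irq)    { (void)irq; rt_calls++; }
void demo_linux_handler(int irq) { (void)irq; linux_calls++; }
```

In this model, an interrupt with a registered real-time handler is serviced even after soft_cli(), while an interrupt that belongs to Linux is recorded and delivered only when soft_sti() runs.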
The advantage of this approach is that Linux becomes the lowest priority task for the real-time kernel. You can think of the real-time Linux kernel as a small real-time OS that can suspend Linux's execution at any state. It doesn't care what Linux is doing the moment an interrupt arrives; it immediately switches context and passes control to a real-time task.
The core of real-time Linux is very thin (it contains only a hundred or so lines of code) because it handles only interrupt processing. Furthermore, that kernel doesn't share any system resources with Linux. It doesn't require dynamic memory allocation, a file system, or spin locks to access any data structure. That's why it needn't wait for Linux, but instead Linux waits for all higher priority real-time tasks. This scheme provides a very predictable way to extend real-time capabilities to a general-purpose OS.
Any real-time kernel should be transparent, modular, and extensible, and real-time Linux meets these requirements. It's invisible (transparent) because if an application doesn't use the real-time extensions it has no way of even knowing that real-time Linux exists on the system. As for modular, you'll see in the next section that during installation, you can select which modules to load depending on the application or simply remove any of them from the startup script. Finally, if you need additional functionality, you can load extra modules including ones you've written yourself.
There are two ways to install a real-time Linux system. First, you can purchase a preconfigured embedded Linux distribution, such as the Hard Hat version from MontaVista Software or Yellow Dog Linux by Terra Soft Solutions, and follow the installation instructions. An alternative is to obtain a real-time kernel and add it to a commercial Linux installation yourself. In the belief that many people will start from a standard Linux distribution, let's take a closer look at that second option.
Assume you're already running a commercial distribution of Linux with kernel 2.2.13 or later. Next, download the real-time extensions (this article focuses on RTLinux v. 2.3, available at www.rtlinux.org) and patch them into that kernel. I prefer to put the RTLinux installation into the directory /usr/src/rtlinux-2.3. Then simply follow the instructions in the text file install.phil.
Installation requires recompiling and installing a new kernel. Read the installation instructions closely because it's critical that you use the version of the Linux kernel for which any particular real-time extensions were written. It's also important to run the correct compiler (in this case, gcc 126.96.36.199 or later; I compiled everything in this article using gcc 188.8.131.52). Unfortunately, if you use an incorrect kernel version or an old compiler, RTLinux could fail without any notification. I strongly suggest you get a clean kernel, without modifications made by any commercial distributor. One good source is www.kernel.org.
Make sure you write down the current module configuration before enhancing the Linux kernel with real-time extensions. After you finish installation, that document serves as a baseline for troubleshooting if any services won't start up. An alternative is to copy the .config Linux kernel configuration file into the new kernel source tree, but this method can bring incompatibility problems: newer versions of config or menuconfig might use incompatible formats for the .config file. Whether you use the command make config or make menuconfig to set up the new configuration, it's important to enable the following options in the Linux kernel configuration file: symmetric multiprocessing support, hard real-time support (in processor type and features), and loadable module support.
Don't delete the old kernel from the /boot directory, and keep its respective record in lilo.conf. If something goes wrong during installation you can always press
After installing the real-time extensions and recompiling any required modules, it's a good idea to verify that everything went as expected. You can, for instance, try to run code from the /examples directory. Those programs automatically call the instrtl script to load RTLinux modules. Later, if you don't need real-time support, you can run the rmrtl script, which removes modules from the Linux kernel.
As we've just discussed, real-time Linux features a modular design that allows you to load only desirable portions of its functionality into the kernel space memory. This modularity makes it easier to fit real-time Linux in an embedded platform with tight memory requirements.
You have the choice of five primary and three additional RTLinux modules when deciding which to load into the kernel:
- rtl_time.o: controls processor clocks
- rtl_sched.o: implements a real-time scheduler
- rtl_posixio.o: provides a POSIX-like interface to device drivers
- rtl_fifo.o: creates a real-time non-blocking FIFO implementation between real-time modules and user-space processes
- mbuff.o: provides shared memory between real-time tasks and user-space processes
- rtl_ipc.o: provides POSIX-style blocking mutexes and semaphores
- rtl_debug.o: adds support for a source-level debugger
- rtl_com.o: interfaces with serial ports
For most real-time tasks, only the first four modules are necessary. Note that RTAI provides similar modules but with different names.
Build your first real-time module
Let's now try to write, compile, and run a simple RTLinux application. This example implements a hard real-time periodic task that toggles the state of Line 0 on a PC's parallel port 1,000 times a second.
Before doing any coding, be certain to understand that an RTLinux task itself is a kernel module. You load this task module only after you've loaded the RTLinux modules. In this way, the loader can link functions in your task module to entry points in the real-time Linux modules. Further, because it's a kernel module, your task should have at least two predefined entry points: int init_module(void) and void cleanup_module(void).
Also assuming for the moment that the task code is already written, take a quick look at how easy it is to compile into the RTLinux task module, which you then load into the real-time kernel. One method is to quickly write a makefile script that creates the module, which here we've named pp_flip.o (Listing 2).
That procedure isn't really difficult, but there's an even easier way to compile an RTLinux module, especially if it's written all in C. During the RTLinux installation process, the makefile script for the initial step automatically generates a file known as rtl.mk, which contains all the required compiler options and paths for a specific installation of RTLinux. Now all you need do is find that file, copy it into the directory with the source file for the real-time task, and type make -f rtl.mk pp_flip.o.
Now let's move on to the source code for the task in pp_flip.c. Start with the required headers (in Listing 3) and note that by including rtl_sched.h you automatically include rtl_conf.h, rtl_core.h, and rtl_time.h. These files define functions from several vital RTLinux modules including rtl_time and rtl_sched.
Next write the init_module() entry point (Listing 4), which the modutils program calls when it inserts that module into the kernel. When it runs, init_module() initializes thread attributes and prepares scheduler parameters. And although sched_get_priority_max(policy) doesn't use the policy argument in the current RTLinux release, I recommend that you set it to SCHED_FIFO for compatibility with future versions.
When do you really need hard real-time performance?
The difference between hard and soft real-time is that hard real-time performance always guarantees exact timing. Two major groups of applications need hard real-time:
Good examples are robotics, animatronics, industrial equipment, digital models of analog systems, neural, fuzzy, adaptive systems, medical/biological test equipment, test-rig control and monitoring, and real-time simulation with hardware in the loop. If your application doesn't fall into one of these categories, don't bother with RTLinux. Adding the real-time environment might degrade the overall performance of the OS, make mouse response jerky, and make the keyboard unresponsive.
At this point the code calls a well-known POSIX-style function named pthread_create(), which is part of the RTLinux API. The first parameter in that call holds a pointer to the real-time task's thread structure; the second parameter gives thread attributes; the third defines the thread function; and the fourth is the thread argument. With that last argument you can pass any 32-bit value or pointer (to a structure). This example doesn't use that parameter, but you could, for example, specify the frequency that way.
The function cleanup_module() has the opposite purpose. The kernel calls that function when it wants to unload any module. In this example, it kills the real-time task's thread that init_module() created. Please note the _np suffix in some of the functions in Listing 4, for example, pthread_delete_np(). This suffix implies “non-portable” or “non-POSIX” and means that a particular function is proprietary to RTLinux.
The heart of our example's module is pp_thread_ep(), the actual thread function. RTLinux calls it almost immediately after making the pthread_create() call. The OS executes that function as long as the thread is alive or until execution reaches the return statement. At the very beginning of the thread function you should call pthread_make_periodic_np() to tell the RTLinux scheduler that you want to make this thread periodic and execute it at exact time intervals. The first parameter is a pointer to the thread (in this case we use pthread_self() to get it), the second one is the start time (here, immediately), and the third parameter is the value of the period in nanoseconds.
Upon making the pthread_make_periodic_np() call, RTLinux marks the thread as periodic and ready for execution. The RTLinux scheduler starts thread execution at the designated start time and then releases it again once per period. If you set the period to 0, the thread executes only once. Otherwise it continues to run in an infinite loop. First it flips Line 0 on LPT1 and then calls pthread_wait_np(). This function hands over system control from the thread to the RTLinux scheduler, which hands control back to the thread at the start of the next period.
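The shape of such a periodic loop can be tried out in ordinary user space before moving to the kernel. The following is a minimal sketch that uses the standard POSIX call clock_nanosleep() with an absolute deadline as a stand-in for pthread_wait_np(); it is not the article's Listing 4 and it offers no hard real-time guarantee. The names deadline_advance(), toggle_line0(), run_periodic(), and the fake_port variable (standing in for the parallel-port data register) are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Stand-in for the parallel-port data register (hypothetical);
 * a real task would write the value to the port instead. */
static unsigned char fake_port;

/* Advance an absolute deadline by period_ns, keeping tv_nsec normalized. */
void deadline_advance(struct timespec *t, long period_ns)
{
    assert(t != NULL && period_ns >= 0);
    t->tv_sec  += period_ns / NSEC_PER_SEC;
    t->tv_nsec += period_ns % NSEC_PER_SEC;
    if (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Flip bit 0 and leave the other bits untouched. */
unsigned char toggle_line0(unsigned char value)
{
    return (unsigned char)(value ^ 0x01);
}

/* Run the periodic work for a fixed number of cycles and return how many
 * times the line was toggled. A real-time task would loop forever. */
int run_periodic(long period_ns, int cycles)
{
    struct timespec next;
    int toggles = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);      /* start immediately */
    for (int i = 0; i < cycles; i++) {
        fake_port = toggle_line0(fake_port);    /* the periodic work */
        toggles++;

        /* Sleep until an absolute release time, so that the time spent
         * working does not accumulate as drift. */
        deadline_advance(&next, period_ns);
        while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                               &next, NULL) == EINTR)
            ;                                   /* retry after a signal */
    }
    return toggles;
}
```

Calling run_periodic(1000000L, 1000) would toggle the simulated line once per millisecond for one second. Waiting for an absolute time rather than a relative delay is the design point worth carrying over: each release is computed from the previous release, not from whenever the work happened to finish.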
It's time to test the program. You'll start with two utilities from the modutils package. Specifically, insmod loads a module into the running Linux kernel and resolves symbolic links; rmmod removes a module from the kernel and cleans up symbolic links [3]. During all insmod/rmmod operations, make sure you're logged in as root or have become the superuser.
Now connect an oscilloscope to Pin 0 of LPT1 (see sidebar, “Tracking timing and worst cases”). Compile the real-time module with the command line make -f makefile.pp_flip and insert it into a running RTLinux kernel with the command insmod pp_flip.o. You should see a message from the init_module() routine, and the OS also logs that message into a file at /var/log/kern.log. Finally, to stop execution of the real-time module, issue the command rmmod pp_flip. The kernel calls the cleanup_module() entry point before it removes the module from kernel memory.
Tracking timing and worst cases
When writing real-time code, you should design algorithms to ensure their deterministic behavior. Don't use recursions or heuristic algorithms without worst-case limiters. Never create blocking calls in a real-time task. There are several ways of tracking timing and worst case performance. One is to call rt_get_time(), which returns time in ticks. Flush the result of this function into one of the realtime FIFOs, then read and store the stream in user space, for example cat /dev/rtf0 > ticks.dat. Then write a small script to analyze this file for the time difference between ticks.
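The analysis step can be as simple as scanning the recorded timestamps for the shortest and longest interval. The sketch below assumes the ticks have already been read from the FIFO into an array of 64-bit values; the names period_stats and analyze_ticks(), and the idea of counting intervals above a caller-supplied limit, are illustrative assumptions rather than part of any RTLinux tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Summary of the intervals between consecutive timestamps. */
struct period_stats {
    int64_t min_delta;   /* shortest interval seen */
    int64_t max_delta;   /* longest interval seen (the worst case) */
    int64_t jitter;      /* max_delta - min_delta */
    size_t  overruns;    /* intervals longer than the given limit */
};

/* Analyze n timestamps (n >= 2), all expressed in the same tick unit. */
struct period_stats analyze_ticks(const int64_t *ticks, size_t n, int64_t limit)
{
    struct period_stats s;

    assert(ticks != NULL && n >= 2);

    s.min_delta = s.max_delta = ticks[1] - ticks[0];
    s.overruns = 0;

    for (size_t i = 1; i < n; i++) {
        int64_t delta = ticks[i] - ticks[i - 1];
        if (delta < s.min_delta)
            s.min_delta = delta;
        if (delta > s.max_delta)
            s.max_delta = delta;
        if (delta > limit)
            s.overruns++;
    }
    s.jitter = s.max_delta - s.min_delta;
    return s;
}
```

For timestamps 0, 1000, 2010, 2990, and 4100 with a limit of 1050, the intervals are 1000, 1010, 980, and 1110, so the function reports a minimum of 980, a maximum of 1110, a jitter of 130, and one overrun.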
Another way to analyze the performance of the example program from this series of articles is to attach a scope with a deep memory to the parallel-port pin. Switch the scope into envelope mode with the maximum number of samples available. Add #define LP_PORT 0x378 into your program, and each time you get into an important place of your realtime task, flip one or another bit of the parallel port with a command similar to outb(value, LP_PORT);. The shape of the signal on the scope display clearly shows jitter and latency variations.
To find a worst-case condition you could also attach a logic analyzer to the same parallel port and set up logical conditions to trigger when delay between strobes exceeds a predefined value.
At this time it's prudent to heed a warning that strongly advises you against using a method of checking program execution that's common among many programmers. Specifically, it's not safe to call printk() from RTLinux threads or handlers. Instead use the RTLinux-safe log function rtl_printf() to write messages into the syslog file.
Talking with the hardware
Besides the real-time operating system, another key component of a data-acquisition system is the device driver that controls the digitizing hardware. This discussion assumes that you're working with a data-acq card that sits on either the PCI or PXI bus and includes a FIFO to buffer analog I/O operations. It also assumes that you already have a device driver for that board written to run under a commercial Linux distribution. (Consult the references list at the end of this article for more on non-real-time Linux device drivers.)
With this driver loaded, the real-time module should be able to communicate directly with the data-acq hardware. However, it's generally wise to split a complex real-time task by using separate instances of the data-acquisition device driver for real-time and non-real-time tasks. You can write, debug, and test pieces of data-acq code in the user space and only afterwards port them into the real-time environment.
Another issue to consider is how the driver should interface to a calling program. When insmod inserts a module into a running Linux kernel, it resolves the symbolic name of that module's entry points into pointers to the module's functions. It also inserts all symbolic names marked as “export” into the symbol table. For example, after you load the rtl_fifo.o module, all exported functions within it become available to other modules.
Real-time Linux tasks and the Linux kernel run in the same address space. And if your data-acq driver exports needed entry points, the real-time module can use them. Of course, be sure to insert the data-acq driver into the Linux kernel before doing the same for the real-time module.
This approach has pros and cons. If a real-time module calls driver functions directly, it's difficult to avoid synchronization issues or guarantee that a second task won't call the same board in the middle of communications with the first task. Another problem arises because driver entry functions usually represent a lower-level interface than one exposed by the shared library. And finally, working with a large number of exported functions might cause namespace pollution. This effect could slow down insmod or rmmod operations and, as a rare but real threat, compromise kernel stability. Making direct calls, however, is the fastest method to talk to the hardware.
Another method is to provide a POSIX-like interface for RTLinux. This interface consists of the well-known set of “twelve magic functions” (including open(), close(), read(), write(), ioctl(), and others), but now you call them from the real-time task (Figure 2). This method brings many advantages compared to calling direct entry points. For instance, the use of standard entry points won't pollute the namespace. You also get full control over who is calling the driver and can easily address synchronization issues.
To use the same driver from both the real-time task and a user process, you must define and register two sets of I/O entry points (Listing 5), making sure you define a symmetrical set of functions for both OSes. First define them for the Linux driver (file_operations ln_pd_fops) and then for the RTLinux driver (rtl_file_operations rtl_pd_fops). (You're actually working with the same module, but real-time Linux and Linux format entry points a bit differently.) Then sequentially register both the RTLinux driver and Linux driver. I recommend using different major numbers for those I/O operations to avoid confusion. (An excellent example of how to use the same driver for real-time and user tasks ships with RTLinux; it's the file rtl_fifo.c, written by RTLinux co-author Michael Barabanov; you'll find it in the /examples/fifos directory.)
The downside of this approach is that it takes extra programming effort. Fortunately, you can cut down on this effort by creating an OS Abstraction Level (OSAL), as in Figure 3. Here you isolate the hardware-specific part of the driver from the OS-specific portion. Instead of calling the OS directly, all calls to the driver come through the OSAL, whether the driver needs to talk to hardware or make some system calls. For example, if the driver must copy a portion of memory to return results, it calls the OSAL function osal_memcpy(). If a real-time task made the driver call, OSAL in turn calls memcpy(). If the call originated from the user space, OSAL instead calls the copy_to_user() function.
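The OSAL idea can be sketched in user space as follows. This is an illustration under assumptions, not the driver's actual code: the osal_caller enum and the signature of osal_memcpy() are invented here, and stub_copy_to_user() merely stands in for the kernel's copy_to_user(), mimicking its convention of returning the number of bytes that could not be copied. The two counters exist only to make the chosen path visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Who originated the driver call (hypothetical tag). */
enum osal_caller { OSAL_CALLER_RT, OSAL_CALLER_USER };

/* Counters that show which path each copy took. */
static int direct_copies;
static int user_copies;

/* Stand-in for the kernel's copy_to_user(). Like the real function, it
 * returns the number of bytes that could NOT be copied (0 on success). */
static unsigned long stub_copy_to_user(void *to, const void *from,
                                       unsigned long n)
{
    memcpy(to, from, n);
    user_copies++;
    return 0;
}

/* The hardware-specific driver code calls only this function and never
 * needs to know which environment it is serving. Returns 0 on success. */
int osal_memcpy(enum osal_caller who, void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    if (who == OSAL_CALLER_RT) {
        /* Real-time tasks share the kernel address space: plain copy. */
        memcpy(dst, src, n);
        direct_copies++;
        return 0;
    }
    /* User-space callers need the checked copy across the boundary.
     * A kernel driver would typically return -EFAULT on failure. */
    return stub_copy_to_user(dst, src, (unsigned long)n) == 0 ? 0 : -1;
}
```

Because the decision is made in one place, the rest of the driver stays free of OS-specific branches, which is where the saving in programming effort comes from.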
Note that the I/O entry points are different for a real-time task and a user-space process. An OS-specific interface performs the initial processing of an I/O request and then calls the main I/O dispatch routine (Listing 6). The common part of the read() entry point doesn't use anything specific to RTLinux or any other OS. The common part of each POSIXIO entry point is a separate function.
Before starting to write a device driver, you must decide whether it's necessary to support real-time and user-space calls at the same time. For most embedded applications you need either one or the other and thus can simplify the overall design with a conditional compilation. For example, Listing 7 shows the osal_memcpy32() function just discussed. If you must compile the driver for use from a real-time task, simply define _NO_USERSPACE and recompile the driver. Otherwise it copies data into the user space.
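A conditionally compiled copy routine might look like the sketch below. It is not a reproduction of Listing 7: the signature of osal_memcpy32() (32-bit words, with a count in words) is an assumption, and stub_copy_to_user() is a user-space stand-in for the kernel's copy_to_user() so that the example can be compiled outside the kernel. Only the _NO_USERSPACE switch comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts copies that went through the user-space path. */
static int user_path_copies;

#ifndef _NO_USERSPACE
/* Stand-in for the kernel's copy_to_user(); returns bytes NOT copied. */
static unsigned long stub_copy_to_user(void *to, const void *from,
                                       unsigned long n)
{
    memcpy(to, from, n);
    user_path_copies++;
    return 0;
}
#endif

/* Copy `words` 32-bit samples to the caller. Returns 0 on success.
 * Build with -D_NO_USERSPACE when only real-time tasks call the driver. */
int osal_memcpy32(uint32_t *dst, const uint32_t *src, size_t words)
{
    size_t bytes = words * sizeof(uint32_t);

    assert(dst != NULL && src != NULL);

#ifdef _NO_USERSPACE
    /* Real-time build: caller and driver share one address space. */
    memcpy(dst, src, bytes);
    return 0;
#else
    /* Default build: the destination belongs to a user process. */
    return stub_copy_to_user(dst, src, (unsigned long)bytes) == 0 ? 0 : -1;
#endif
}
```

The compile-time switch removes the run-time test entirely, at the price of building a separate driver binary for each kind of caller.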
On some occasions, though, you might want to be able to communicate with the same driver from both a user process and a real-time task. For example, an application might require two data-acq boards of different types that share the same driver in one system; the user app runs one card, and the real-time task controls the second card. Obviously, such a driver must support calls from both tasks at the same time. The easiest way to achieve this goal is to mark the origination of any I/O requests and process them with respect to the caller. However, be very careful to avoid race conditions.
If you're writing a device driver instead of working with one that ships with the data-acq hardware, it's also important to realize that a real-time Linux driver must follow stricter limitations than a Linux driver. For instance, when a real-time task makes a call, the driver shouldn't do any of the following:
- Allocate or free memory
- Use printk() or similar I/O routines
- Make blocking calls
- Copy data to or from the user space or somehow invoke the VMM (virtual memory manager)
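One common way to live with the first restriction is to reserve all buffers up front and hand them out from a fixed pool, so that the real-time path never calls an allocator. The following is a minimal sketch of that idea, assuming a single real-time task uses the pool (it has no locking); the names block_pool, pool_init(), pool_get(), and pool_put(), as well as the block count and size, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCKS     8
#define POOL_BLOCK_SIZE 256

/* All storage is reserved statically; nothing is allocated at run time. */
struct block_pool {
    uint8_t  storage[POOL_BLOCKS][POOL_BLOCK_SIZE];
    uint8_t *free_list[POOL_BLOCKS];   /* stack of available blocks */
    size_t   free_count;
};

/* Call once from non-real-time setup code, for example at module load. */
void pool_init(struct block_pool *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < POOL_BLOCKS; i++)
        p->free_list[i] = p->storage[i];
    p->free_count = POOL_BLOCKS;
}

/* Constant-time and non-blocking: returns NULL when the pool is empty. */
void *pool_get(struct block_pool *p)
{
    assert(p != NULL);
    if (p->free_count == 0)
        return NULL;
    return p->free_list[--p->free_count];
}

/* Return a block to the pool. Returns 0 on success, -1 if the pool is
 * already full (which indicates a double release). */
int pool_put(struct block_pool *p, void *block)
{
    assert(p != NULL && block != NULL);
    if (p->free_count == POOL_BLOCKS)
        return -1;
    p->free_list[p->free_count++] = block;
    return 0;
}
```

Both pool_get() and pool_put() run in constant time and never block, so their worst-case cost is known in advance, which is the property the restrictions above are meant to protect.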