When Linux began life on an Intel i386 processor, no
one ever expected the success Linux would enjoy in server applications.
That success has led to Linux being ported to many different
architectures and used by developers for embedded systems from cellular
handsets to telecommunications switches. Not long ago, if your
application had real-time requirements, you might not have included
Linux among the choices for your operating system. That has all changed
with the developments in real-time Linux driven, in large part, by
audio and multimedia applications.
In this two part series, we start with a brief look
at the historical development of real-time Linux features. Then we look
at the facilities available to the real-time programmer and how these
facilities are used.
What Is Real Time?
Ask five people what "real time" means, and, chances are, you will get
five different answers. Some might even cite some numbers. For the
purposes of the discussion to follow, we discuss some scenarios and
then propose a definition. Many requirements might be said to be soft
real time, while others are called hard real time.
Soft Real Time
Most agree that soft real time means that the operation has a deadline,
but if the deadline is missed, the quality of the experience could be
diminished (but not fatal). Your desktop workstation is a perfect
example of soft real-time requirements. When you are editing a
document, you expect to see the results of your keystrokes immediately
on the screen. When playing your favorite .mp3 file, you expect to have
high-quality audio without any clicks, pops, or gaps in the music.
In general terms, humans cannot see or hear delays
below a few tens of milliseconds. Of course, the musicians in the crowd
will tell you that music can be colored by delays smaller
than that. If a deadline is missed by these so-called soft real-time
events, the results may be undesirable, leading to a lower level of
"quality" of the experience, but not catastrophic.
Hard Real Time
Hard real time is characterized by the results of a missed deadline. In
a hard real-time system, if a deadline is missed, the results are often
catastrophic. Of course, catastrophic is a relative term. If your
embedded device is controlling the fuel flow to a jet aircraft engine,
missing a deadline to respond to pilot input or a change in operational
characteristics can lead to disastrous results.
Note that the duration of the deadline has no bearing
on the real-time characteristic. Servicing the tick on an atomic clock
is such an example. As long as the tick is processed within the
1-second window before the next tick, the data remains valid. Missing
the processing on a tick might throw off our global positioning systems
by feet or even miles!
With this in mind, we draw on a commonly used set of
definitions for soft and hard real time. For soft real-time systems,
the value of a computation or result is diminished if a deadline is
missed. For hard real-time systems, if a single deadline is missed, the
system is considered to have failed, and might have catastrophic
consequences.
Linux Scheduling
UNIX and Linux were both designed for fairness in their process
scheduling. That is, the scheduler tries its best to allocate available
resources across all processes that need the CPU and guarantee each
process that they can make progress. This very design objective is
counter to the requirement for a real-time process. A real-time process
must run as soon as possible after it is ready to run. Real time means
having predictable and repeatable latency.
Latency
Real-time processes are often associated with a physical event, such as
an interrupt arriving from a peripheral device. Figure 17-1 below illustrates the
latency components in a Linux system. Latency measurement begins upon
receipt of the interrupt we want to process.
This is indicated by time t0 in Figure 17-1. Sometime
later, the interrupt is taken and control is passed to the Interrupt
Service Routine (ISR). This is indicated by time t1. This interrupt
latency is almost entirely dictated by the maximum interrupt off
time1—the time spent in a thread of execution that has hardware
interrupts disabled.
It is considered good design practice to minimize the
processing done in the actual interrupt service routine. Indeed,
this execution context is limited in capability (for example, an ISR
cannot call a blocking function, one that might sleep), so it is
desirable to simply service the hardware device and leave the data
processing to a Linux bottom half,2 also called softIRQs.
When the ISR/bottom half has finished its processing,
the usual case is to wake up a user space process that is waiting for
the data. This is indicated by time t2 in Figure 17-1 below. At some point in
time later, the real-time process is selected by the scheduler to run
and is given the CPU. This is indicated by time t3 in Figure 17-1.
Scheduling latency is affected primarily by the number of processes
waiting for the CPU and the priorities among them.
Setting the Real Time attribute on a process gives it
higher priority over normal Linux processes and allows it to be the
next process selected to run, assuming that it is the highest priority
real-time process waiting for the CPU. The highest-priority real-time
process that is ready to run (not blocked on I/O) will always run.
You'll see how to set this attribute shortly.
 |
| Figure
17-1. Latency components |
Kernel Preemption
In the early Linux days of Linux 1.x, there was no kernel preemption.
This meant that when a user space process requested kernel services, no
other task could be scheduled to run until that process either blocked
(goes to sleep) waiting on something (usually I/O), or until the kernel
request is completed. Making the kernel preemptable3 means that
while one process is running in the kernel, another process can preempt
the first and be allowed to run even though the first process had not
completed its in-kernel processing. Figure 17-2 illustrates this.
In this figure, Process A has entered the kernel via
a system call. Perhaps it was a call to write() to a device such as the
console or a file. While executing in the kernel on behalf of Process
A, Process B with higher priority is woken up by an interrupt. The
kernel preempts Process A and assigns the CPU to Process B,
even though Process A had neither blocked nor completed its kernel
processing.
 |
| Figure
17-2. Kernel preemption |
Impediments to
Preemption
The challenge in making the kernel fully preemptable
is to identify all the places in the kernel that must be protected from
preemption. These are the critical sections within the kernel
where preemption cannot be allowed to occur.
For example, assume that Process A in Figure 17-2 above is executing in
the kernel performing a file system operation. At some point, the code
might need to write to an in-kernel data structure representing a file
on the file system. To protect that data structure from corruption, the
process must lock out all other processes from accessing the shared
data structure. Listing 17-1 below illustrates
this concept using C syntax.
Listing 17-1
Locking Critical Sections
...
preempt_disable();
...
/* Critical section */
update_shared_data();
...
preempt_enable();
...
If we did not protect shared data in this fashion,
the process updating the shared data structure could be preempted in
the middle of the update. If another process attempted to update the
same shared data, corruption of the data would be virtually
certain. The classic example is when two processes are operating
directly on common variables and making decisions on their values.
Figure 17-3 illustrates such a case.
In Figure 17-3 below,
Process A is interrupted after updating the shared data but before it
makes a decision based on it. By design, Process A cannot detect that
it has been preempted. Process B changes the value of the shared data
before Process A gets to run again.
As you can see, Process A will be making a decision
based on a value determined by Process B. If this is not the behavior
you seek, you must disable preemption in Process A around the
shared data—in this case, the operation and decision on the variable
count.
 |
| Figure
17-3. Shared data currency error |
Preemption Models
The first solution to kernel preemption was to place checks at
strategic locations within the kernel code where it was known to be
safe to preempt the current thread of execution. These locations
included entry and exit to system calls, release of certain kernel
locks, and return from interrupt processing. At each of these points,
code similar to Listing 17-2 below
was used to perform preemption.
Listing 17-2
Check for Preemption a la Linux 2.4 + Preempt Patch
...
/*
* This code is executed at strategic locations within
* the Linux kernel where it is known to be safe to
* preempt the current thread of execution
*/
if (kernel_is_preemptable() && current->need_resched)
preempt_schedule();
...
/*
* This code is in .../kernel/sched.c and is
* invoked from those strategic locations as
above
*/
#ifdef CONFIG_PREEMPT
asmlinkage void preempt_schedule(void)
{
while (current->need_resched) {
ctx_sw_off();
current->state |= TASK_PREEMPTED;
schedule();
current->state &= ~TASK_PREEMPTED;
ctx_sw_on_no_preempt();
}
}
#endif
...
The first snippet of code in Listing 17-2 (simplified from the actual code) is
invoked at those strategic locations described earlier, where it is
known that the kernel is safe to preempt. The second snippet of code in
Listing 17-2 above is the
actual code from an early Linux 2.4 kernel with the preempt patch
applied. This interesting while loop causes a context switch via the
call to schedule() until all requests for preemption have been
satisfied.
Although this approach led to reduced latencies in the Linux system, it
was not ideal. The developers working on low-latency soon realized the
need to "flip the logic." With earlier preemption models, we had this:
The Linux kernel was
fundamentally nonpreemptable.
Preemption checks were sprinkled
around the kernel at strategic locations known to be safe for
preemption.
Preemption was enabled only at these
known-safe points.
To achieve a further significant reduction in
latency, we want this in a preemptable kernel:
The Linux kernel is fully preemptable
everywhere.
Preemption is disabled only around
critical sections.
This is where the kernel developers have been heading
since the original preemptable kernel patch series. However, this is no
easy task. It involves poring over the entire kernel source code base,
analyzing exactly what data must be protected from concurrency, and
disabling preemption at only those locations.
The method used for this has been to instrument the
kernel for latency measurements, find the longest latency code paths,
and fix them. The more recent Linux 2.6 kernels can be configured for
very low-latency applications because of the effort that has gone into
this "lock-breaking" methodology.
SMP Kernel
It is interesting to note that much of the work involved in creating an
efficient multiprocessor architecture also benefits real time. The SMP
challenge is more complex than the uniprocessor challenge because there
is an additional element of concurrency to protect against.
In the uniprocessor model, only a single task can be
executing in the kernel at a time. Protection from concurrency
involves only protection from interrupt or exception processing.
In the SMP model, multiple threads of execution in the kernel are
possible in addition to the threat from interrupt and exception
processing.
SMP has been supported from early Linux 2.x kernels.
A Big Kernel Lock (BKL) was used to protect against
concurrency in the transition from uniprocessor to SMP operation. The
BKL is a global spinlock, which prevents any other tasks from executing
in the kernel.
In his excellent book Linux Kernel Development
(Novell Press, 2005), Robert Love characterized the BKL as the
"redheaded stepchild of the kernel." In describing the
characteristics of the BKL, Robert jokingly added "evil" to its list of
attributes!
Early implementations of the SMP kernel based on the
BKL led to significant inefficiencies in scheduling. It was found that
one of the CPUs could be kept idle for long periods of time. Much of
the work that led to an efficient SMP kernel also directly benefited
real-time applications—primarily lowered latency. Replacing the BKL
with smaller-grained locking surrounding only the actual shared data to
be protected led to significantly reduced preemption latency.
Sources of Preemption Latency
A real-time system must be capable of servicing its real-time tasks
within a specified upper boundary of time. Achieving consistently low
preemption latency is critical to a real-time system. The two single
largest contributors to preemption latency are interrupt-context
processing and critical section processing where interrupts are
disabled.
You have already learned that a great deal of effort
has been targeted at reducing the size (and thus, duration) of the
critical sections. This leaves interrupt-context processing as the
next challenge. This was answered with the Linux 2.6 real-time kernel
patch, the subject of the second part in this series.
Next in Part 2, "The
Linux 2.6 real-time kernel patch."
Christopher
Hallinan, field applications engineer at MontaVista Softwware, has worked for
more than 20 years in assisgnments ranging from engineering and
engineering management to marketing and busisness development. He spent
four years as an independent development consultant in the embedded
Linux marketplace.
This chapter is excerpted
from the book titled "Embedded Linux
Primer: A Practical Real-World
Approach", by Christopher Hallinan.
Published
by Prentice Hall Professional, as part of the Prentice
Hall Open Source
Software Development Series
ISBN 0131679848; Copyright 2007 Pearson Education, Inc.