CMP EMBEDDED.COM

Login | Register     Welcome Guest IPS  Call for Abstracts
 

Exactly When Do You Need Real Time?
Where and why real time is required and the effect of the OS on real-time design



Embedded.com
Do most embedded projects still need an real time operating system (RTOS)? It's a good question, given the speed of today's high performance processors and the availability of patches for real-time Linux, Windows, and other general purpose operating systems (GPOSs). The answer lies in the very nature of embedded devices.

Devices that, in most cases, are manufactured in the thousands, or millions, of units. Devices where even a $1 reduction in per-unit hardware costs can save the manufacturer a small fortune. Devices, in other words, that can't afford multi-gigahertz processors or a large memory array. In the automotive telematics market, for instance, the typical 32-bit processor runs at about 200Mhz — a far cry from the 2Ghz or faster processors now common in desktops and servers.

In an environment like this, an RTOS designed to extract extremely fast (and predictable) real-time response times from lower-end hardware offers a serious economic advantage. Savings aside, the services provided by an RTOS make many computing problems easier to solve, particularly when multiple activities compete for a system's resources.

Consider, for instance, a system where users expect (or need) immediate response to input. With an RTOS, a developer can guarantee that operations initiated by the user will execute in preference to other system activities, unless a more important activity (for instance, an operation that helps protect the user's safety) must execute first.

Consider also a system that must satisfy quality of service (QoS) requirements, such as a device that presents live video. If the device depends on software for any part of its content delivery, it can experience dropped frames at a rate that users perceive as unacceptable - from the users' perspective, the device is unreliable. But, with an RTOS, the developer can precisely control the order in which software processes execute and thereby ensure that playback occurs at an appropriate and consistent media rate.

RTOSs Aren't "Fair"
The need for "hard" real time - and for OSs that enable it - remains prevalent in the embedded industry. The question is, what does an RTOS have that a GPOS doesn't? And how useful are the realtime extensions now available for some GPOSs? Can they provide a reasonable facsimile of RTOS performance?

Let's begin with task scheduling. In a GPOS, the scheduler typically uses a "fairness" policy to dispatch threads and processes onto the CPU. Such a policy enables the high overall throughput required by desktop and server applications, but offers no assurances that high-priority, time-critical threads will execute in preference to lower-priority threads.

For instance, a GPOS may decay the priority assigned to a high-priority thread, or otherwise dynamically adjust the priority in the interest of fairness to other threads in the system. A high-priority thread can, as a consequence, be preempted by threads of lower priority. In addition, most GPOSs have unbounded dispatch latencies: the more threads in the system, the longer it takes for the GPOS to schedule a thread for execution. Any one of these factors can cause a high-priority thread to miss its deadlines, even on a fast CPU.

In an RTOS, on the other hand, threads execute in order of their priority. If a high-priority thread becomes ready to run, it can, within a small and bounded time interval, take over the CPU from any lower-priority thread that may be executing. Moreover, the high-priority thread can run uninterrupted until it has finished what it needs to do - unless, of course, it is preempted by an even higher-priority thread. This approach, known as priority-based preemptive scheduling, allows high-priority threads to meet their deadlines consistently, no matter how many other threads are competing for CPU time.

Preemptible Kernel
In most GPOSs, the OS kernel isn't preemptible. Consequently, a high-priority user thread can never preempt a kernel call, but must instead wait for the entire call to complete - even if the call was invoked by the lowest-priority process in the system. Moreover, all priority information is usually lost when a driver or other system service, usually performed in a kernel call, executes on behalf of a client thread. Such behavior causes unpredictable delays and prevents critical activities from completing on time.

In an RTOS, on the other hand, kernel operations are preemptible. There are still windows of time in which preemption may not occur, but in a well-designed RTOS, those intervals are extremely brief, often in the order of hundreds of nanoseconds. Moreover, the RTOS will impose an upper bound on how long preemption is held off and interrupts disabled; this allows developers to ascertain worst-case latencies.

To realize this goal, the RTOS kernel must be simple and elegant as possible. And the best way to achieve this simplicity is to design a kernel that only includes services with a short execution path. By excluding work-intensive operations (for instance, process loading) from the kernel and assigning them to external processes or threads, the RTOS designer can help ensure that there is an upper bound on the longest non-preemptible code path through the kernel.

In a few GPOSs, such as Linux v2.6, some degree of preemptibility has been added to the kernel. However, the intervals during which preemption may not occur are still much longer than those in a typical RTOS; the length of any such preemption interval will depend on the longest critical section of any modules (for instance, networking, file systems) incorporated into the GPOS kernel. Moreover, a preemptible GPOS kernel doesn't address other conditions that can impose unbounded latencies, such as the loss of priority information that occurs when a client invokes a driver or other system service.

Mechanisms to Avoid Priority Inversion
Even in an RTOS, a lower-priority thread can inadvertently prevent a higher-priority thread from accessing the CPU - a condition known as priority inversion. When an unbounded priority inversion occurs, critical deadlines can be missed, resulting in outcomes that range from unusual system behavior to outright failure. Unfortunately, priority inversion is often overlooked during system design. Many examples of priority inversion exist, including one that plagued the Mars Pathfinder project in July 1997.

Generally speaking, priority inversion occurs when two tasks of differing priority share a resource, and the higher-priority task cannot obtain the resource from the lower-priority task. To prevent this condition from exceeding a bounded interval of time, an RTOS may provide a choice of mechanisms, including priority inheritance and priority ceiling emulation. We couldn't possibly do justice to both mechanisms here, so let's focus on an example of priority inheritance.

To begin, we must consider at how task synchronization can result in blocking, and how this blocking can, in turn, cause priority inversion. Let's say two jobs are running, Job 1 and Job 2, and that Job 1 has the higher priority. If Job 1 is ready to execute, but must wait for Job 2 to complete an activity, we have blocking.

This blocking may occur because of synchronization; for instance, Job 1 and Job 2 share a resource controlled by a lock or semaphore, and Job 1 is waiting for Job 2 to unlock the resource. Or, it may occur because Job 1 is requesting a service currently used by Job 2.

The blocking allows Job 2 to run until the condition that Job 1 is waiting for occurs (for instance, Job 2 unlocks the resource that both jobs share). At that point, Job 1 gets to execute. The total time that Job 1 must wait may vary, with a minimum, average, and maximum time. This interval is known as the blocking factor. If Job 1 is to meet any of its timeliness constraints, this factor can't vary according to any parameter, such as the number of threads or an input into the system. In other words, the blocking factor must be bounded.

Now let's introduce a third job - Job 3 - that has a higher priority than Job 2 but a lower priority than Job 1 (see Figure 1 below). If Job 3 becomes ready to run while Job 2 is executing, it will preempt Job 2, and Job 2 won't be able to run again until Job 3 blocks or completes. This will, of course, increase the blocking factor of Job 1; that is, it will further delay Job 1 from executing. The total delay introduced by the preemption is a priority inversion.

Figure 1 - Job 1 is waiting for Job 2 to complete an activity, when Job 3 preempts Job 2. This further delays Job 1 from executing.

In fact, multiple jobs can preempt Job 2 in this way, resulting in an effect known as chain blocking. Under these circumstances, Job 2 might be preempted for an indefinite period of time, yielding an unbounded priority inversion and causing Job 1 to fail to meet any of its timeliness constraints.

This is where priority inheritance comes in. If we return to our scenario and make Job 2 run at the priority of Job 1 during the synchronization period, then Job 3 won't be able to preempt Job 2, and the resulting priority inversion is avoided (see Figure 2 below).

Figure 2 - Job 2 inherits Job 1's higher priority, thereby preventing Job 3 from preempting Job 2. Job 3 no longer delays Job 1 from executing.

Partitioning Schedulers for Guaranteed CPU Availability
As discussed, RTOSs use priority-based preemptive scheduling to determine which task gets control of the processor. While this approach provides developers with an easy method to define scheduling priority, it does pose a problem: If a given task is even one priority level higher than another task, that higher-priority task has the power to completely starve the less-critical task of CPU time. For instance, let's say you have two processes, process A and process B, where A has a slightly higher priority than B.

If process A becomes swamped with work or becomes the target of a denial of service attack, it will lock out process B (as well as any other lower-priority process) from accessing the CPU. Simply put, priority-based scheduling offers no guarantee that lower-priority processes will access at least a fraction of the CPU. If a high-priority thread runs in a loop, it can prevent other threads from accessing the CPU and effectively freeze the entire system.

This inability to provide guaranteed CPU budgets can make it difficult to integrate subsystems from multiple development teams, since it allows tasks in one subsystem to starve tasks in other subsystems — a problem that may not become obvious until integration and verification testing. Lack of CPU guarantees also makes the system vulnerable to denial of service attacks and other malicious exploits.

To compensate for this flaw, some RTOSs offer a fixed partition scheduler. Using this scheduler, the system designer can divide tasks into groups, or partitions, and allocate a percentage of CPU time to each partition. With this approach, no task in any given partition can consume more than the partition's statically defined percentage of CPU time. For instance, let's say a partition is allocated 30% of the CPU.

If a process in that partition subsequently becomes the target of a denial of service attack, it will consume no more than 30% of CPU time. This allocated limit not only allows other partitions to maintain their availability, but can also ensure that the user interface (for instance, a remote terminal) remains accessible. As a result, operators can access the system and resolve the problem - without having to hit the reset switch.

There is a problem, however. Because the scheduling algorithm is fixed, a partition can never use CPU cycles allocated to other partitions, even if those partitions haven't used their allotted cycles. This approach squanders CPU cycles and prevents partitions from effectively handling peak demands. Systems designers must, as a result, use more-expensive processors, tolerate a slower system, or limit the amount of functionality that the system can support.

Adaptive partitioning
Another approach, called adaptive partitioning, addresses these drawbacks by providing a more dynamic scheduling algorithm. Like static partitioning, adaptive partitioning allows the system designer to reserve CPU cycles for a process or group of processes. The designer can thus guarantee that the load on one subsystem or partition won't affect the availability of other subsystems. Unlike static approaches, however, adaptive partitioning uses standard priority-based scheduling when the system isn't under full CPU load or attack.

As a result, threads in one partition can access any spare CPU cycles unused by threads in any other partition. This approach, uffers the best of both worlds: it can enforce CPU guarantees when the system runs out of excess cycles (for maximum security and guaranteed availability of lower-priority services) and can dispense free CPU cycles when they become available (for maximum performance).

Figure 3 - Adaptive partitioning prevents high-priority tasks from consuming more than their assigned CPU percentage, unless the system contains unused CPU cycles. For instance, tasks A and D can run in time allocated to Partition 3 because tasks E and F don't require the rest of their budgeted CPU cycles.

"Dualing" Kernels
GPOSs - including Linux, Windows, and various flavors of Unix - typically lack the realtime mechanisms discussed thus far. Nonetheless, vendors have developed a number of realtime extensions and patches in an attempt to fill the gap. There is, for example, the dual-kernel approach, in which the GPOS runs as a task on top of a dedicated realtime kernel (see Figure 4 below). Any tasks that require deterministic scheduling run in this kernel, but at a higher priority than the GPOS. These tasks can thus preempt the GPOS whenever they need to execute and will yield the CPU to the GPOS only when their work is done.

Unfortunately, tasks running in the realtime kernel can make only limited use of existing system services in the GPOS - file systems, networking, and so on. In fact, if a realtime task calls out to the GPOS for any service, it will be subject to the same preemption problems that prohibit GPOS processes from behaving deterministically. As a result, new drivers and system services must be created specifically for the realtime kernel, even when equivalent services already exist for the GPOS.

Also, tasks running in the realtime kernel don't benefit from the robust MMU-protected environment that most GPOSs provide for regular, nonrealtime processes. Instead, they run unprotected in kernel space. Consequently, a realtime task that contains a common coding error, such as a corrupt C pointer, can easily cause a fatal kernel fault.

To complicate matters, different implementations of the dual-kernel approach use different APIs. In most cases, services written for the GPOS can't easily be ported to the realtime kernel, and tasks written for one vendor's realtime extensions may not run on another's.

Figure 4 - In a typical dual-kernel implementation, the GPOS runs as the lowest-priority task in a separate realtime kernel.

Realtime Patches for GPOS Kernels
Rather than use a second kernel, other approaches modify the GPOS itself, such as by adding highresolution timers or a modified process scheduler. Such approaches have merit, since they allow developers to use a standard kernel (albeit with proprietary patches) and programming model. Moreover, they help address the requirements of reactive, event-driven systems. Unfortunately, such low-latency patches don't address the complexity of most realtime environments, where realtime tasks span larger time intervals and have more dependencies on system services and other processes than do tasks in a simple event-driven system.

For instance, in systems where realtime tasks depend on device drivers, file systems, or other services, the problem of priority inversion would have to be addressed. In Linux, the driver and virtual file system (VFS) frameworks would effectively have to be rewritten - along with any device drivers and file systems that employ them. Without such modifications, realtime tasks could experience unpredictable delays when blocked on a service. As a further problem, most existing Linux drivers aren't preemptible.

To ensure predictability, programmers would also have to insert preemption points into every driver in the system. All this points to the real difficulty, and immense scope, of making a GPOS capable of supporting realtime behavior. This isn't a matter of "RTOS good, GPOS bad," however. GPOSs such as Linux, Windows XP, and the various Unixes all serve their intended purposes very well. They only fall short when forced into deterministic environments they weren't designed for, such as in-car telematics systems, medical instruments, and continuous media applications.

Extending the RTOS
Still, there are undoubted benefits to using a GPOS, such as support for widely used APIs and, in the case of Linux, the open source model. With open source, a developer can customize OS components for application-specific demands and save considerable time troubleshooting.

The RTOS vendor can't afford to ignore these benefits. Extensive support for POSIX APIs - the same APIs used by Linux and various Unixes — is an important first step. So is providing well-documented source code and customization kits that address the specific needs and design challenges of embedded developers.

The architecture of the RTOS also comes into play. An RTOS based on a microkernel design, for instance, can make the job of OS customization fundamentally easier to achieve. In a microkernel RTOS, only a small core of fundamental OS services (for instance, signals, timers, scheduling) reside in the kernel itself.

All other components - drivers, file systems, protocol stacks, applications - run outside the kernel as separate, memory-protected processes (see Figure 5, below). As a result, developing custom drivers and other application-specific OS extensions doesn't require specialized kernel debuggers or kernel gurus. In fact, as user-space programs, such extensions become as easy to develop as standard applications, since they can be debugged with standard source-level tools and techniques.

For instance, if a device driver attempts to access memory outside its process container, the OS can identify the process responsible, indicate the location of the fault, and create a process dump file viewable with source-level debugging tools. The dump file can include all the information the debugger needs to identify the source line that caused the problem, along with diagnostic information such as the contents of data items and a history of function calls.

Such an architecture also provides superior fault isolation and recovery: If a driver, protocol stack, or other system service fails, it can do so without corrupting other services or the OS kernel. In fact, "software watchdogs" can continuously monitor for such events and restart the offending service dynamically, without resetting the entire system or involving the user in any way. Likewise, drivers and other services can be dynamically stopped, started, or upgraded, again without a system shutdown.

These benefits shouldn't be taken lightly - the biggest disruption that can occur to realtime performance is an unscheduled system reboot! Even a scheduled reboot to incorporate software upgrades disrupts operation, though in a controlled manner. To ensure that deadlines are always met, developers must use an OS that can remain continuously available, even in the event of software faults or service upgrades.

Figure 5 - In a microkernel RTOS, system services run as standard, user-space processes, simplifying the task of OS customization.

A Strategic Decision
An RTOS can help make complex applications both predictable and reliable; in fact, the precise control over timing made possible by an RTOS adds a form of reliability that cannot be achieved with a GPOS. (If a system based on a GPOS doesn't behave correctly due to incorrect timing behavior, then we can justifiably say that the system is unreliable.) Still, choosing the right RTOS can itself be a complex task. The underlying architecture of an RTOS is an important criterion, but so are other factors. These include:

1) Flexible choice of scheduling algorithms - Does the RTOS support a choice of scheduling algorithms (FIFO, round robin, sporadic, etc.)? Can you assign those algorithms on a per-thread basis, or does the RTOS force you into assigning one algorithm to all threads in your system?

2) Guaranteed CPU availability - Does the RTOS support a partitioning scheduler that provides tasks with a guaranteed percentage of CPU time, regardless of what other tasks, including higher-priority tasks, are doing? Such guarantees simplify the job of integrating subsystems from multiple development teams or vendors. They also ensure that critical tasks can remain available and meet their deadlines, even when the system is subjected to denial of service attacks and other malicious exploits.

3) Graphical user interfaces - Does the RTOS use primitive graphics libraries or does it provide advanced graphics capabilities such as multi-layer interfaces, multi-headed displays, accelerated 3D rendering, and a true windowing system? Can you easily customize the GUI's look-and-feel? Can the GUI display and input multiple languages (Chinese, Korean, Japanese, English, Russian, etc.) simultaneously? Can 2D and 3D applications easily share the same screen?

4) Tools for remote diagnostics - Because downtime is intolerable for many embedded systems, the RTOS vendor should provide diagnostics tools that can analyze a system's behavior without interrupting services that the system provides. Look for a vendor that offers tools for code coverage, application profiling, system profiling, and memory analysis.

5) Open development platform - Does the RTOS vendor provide a development environment based on a open platform like Eclipse, which lets you "plug in" your favorite third-party tools for modeling, version control, and so on? Or is the development environment based on proprietary technology?

6) Internet capabilities - Does the RTOS support an up-to-date suite of preintegrated protocol stacks, such as IPv4, IPv6, IPsec, SCTP, and IP filtering with NAT? Does it also support an embedded Web browser? The browser should have a scalable footprint and be capable of rendering standard Web pages on very small screens. It should also support standards such as HTML 4.01, SSL 3.0, CSS 1 and 2, OMA , JavaScript, WAP, and WML.

7) Standard APIs - Does the RTOS lock you into a proprietary API, or does it provide full support for a standard API like POSIX, which makes it easier to port code to and from other OSs? Also, does the RTOS offer comprehensive support for the API, or does it support only a small subset of the defined interfaces?

8) Support for multi-core processors - The ability to migrate to multi-core processors is becoming a requirement for a variety of high-performance designs. Does the RTOS support a choice of multi processing models (symmetric multiprocessing, asymmetric multiprocessing, bound multiprocessing) to help you take best advantage of multi-core hardware? And is the RTOS supported by system-tracing tools that let you diagnose and optimize the performance of a multi-core system? Without tools that can highlight resource contention, excessive thread migration between cores, and other problems common to multi-core designs, optimizing a multi-core system can quickly become an onerous, time consuming task.

9) Source code kits - Does the RTOS vendor provide well-documented source and customization kits to help tailor the RTOS to your specific requirements? Does the vendor also offer driver development kits, including source code, to help you develop drivers for custom hardware? Choosing an RTOS is a strategic decision for any project team. Once an RTOS vendor has provided clear answers to the above questions, you'll be much closer to choosing the RTOS that's right for you now — and in the future.

This article is excerpted from a paper of the same name presented at the Embedded Systems Conference Boston 2006. Used with permission of the Embedded Systems Conference. For more information, please visit www.embedded.com/esc/boston/

Paul N. Leroux is Technology Analyst and Jeff Schaffer is Senior Applications Engineer at QNX Software Systems

1

Rate this article: Low High
Current rating
  • .
Embedded.com Career Center
Ready to take that job and shove it?
SEARCH JOBS

Browse all jobs

SPONSOR
RECENT JOB POSTINGS




 :