Scheduling Sporadic Events

Sporadic events are the bugaboo of many real-time systems. Finding a way to manage aperiodic behavior can spell the difference between system failure and system success. This article presents the POSIX sporadic scheduling policy, which programmers can use to avoid critical deadline failures in overloaded systems.

Most real-time operating systems use a priority-based, preemptive scheduler to share a single CPU among multiple threads. Such a scheduler is optimal for systems composed entirely of constant duration periodic threads, all of which experience only bounded external stimuli. However, the reality is far from the ideal: software systems are routinely subjected to unbounded external stimuli and are required to perform time-varying computations. In these situations, events can drive systems built with typical RTOS schedulers into overload conditions that fail to satisfy critical deadlines.

Theory vs. reality

The design team for one of my previous projects spent several months specifying the software architecture for a system controller in a digital switch. Because a telecommunications switch must meet demanding time constraints, we selected an operating system that claimed to be ideal for real-time applications and then turned the system and its operating system over to the application engineers.

A few months later, engineers on the product's quality assurance (QA) team began reporting intermittent system deadlocks. During these lockups, the user interface did not respond to keyboard activity and the processor activity LED remained lit 100% of the time. Yet a firmware-driven sanity lamp was flashing its heartbeat, as if nothing were wrong at all.

Because the processor LED indicated continuous kernel activity and the highest-priority heartbeat interrupt service routine was running, we were able to infer that the deadlock was due to a medium-priority interrupt service routine caught in an infinite loop. Because the fault was intermittent, finding the cause was particularly challenging. At first, the problem occurred on only one system; soon thereafter, it surfaced on another. Eventually, we discovered a short-circuited trace on a board that QA engineers were shuffling between systems; the short was holding a flash-memory controller's level-sensitive interrupt request signal asserted. Although the system was able to execute the high-priority heartbeat thread, it was unable to respond to low-priority keyboard I/O because it was constantly servicing the medium-priority flash controller interrupt thread.

Over the years, we encountered similar deadlocks: network packet storms overloading network interface controllers, unbounded and bursty character I/O overloading serial controllers, and SCSI bus jabbering overloading disk controllers. We wondered why external events could so easily compromise our system. It turned out reality simply failed to conform to our simplifying assumptions: our threads were not strictly periodic, and we had failed to architect our system to manage jitter and overload. We learned that a real-time operating system is only the beginning of a solution.

Real-time terminology

aperiodic event: An event with an arrival pattern that lacks a bounded minimum interval between subsequent instances.
aperiodic task: A task released by an aperiodic event. Some add the qualification that the releasing event has only a soft response deadline.
bounded: Constrained by some upper or lower limit.
interarrival time: The amount of time between subsequent instances of identical events.
jitter: Variation in a parameter. For example, variation in arrival rates, computation durations, or communication latencies.
periodic event: An event with an arrival pattern that has a bounded minimum interarrival time with minimal jitter.
periodic task: A task released by a periodic event whose computation interval excluding preemption and blocking is constant.
release: The activation of a task such that it will be considered by the RTOS for execution on the CPU.
sporadic event: An event with an arrival pattern that has a bounded minimum interarrival time with stochastic jitter.
sporadic task: Either a task released by a sporadic event or a task with a stochastic computation interval. Some classify a sporadic task as one released by an aperiodic event that has a hard response deadline.
stochastic: Dependent on one or more random variables. For example, a dependency on absolute time of day or on contextual information.

Prioritization

Static-priority preemptive schedulers require that an engineer assign unique priorities to individual threads to inform the scheduler which thread to run when multiple threads are ready. There are several formal methods (and, of course, numerous ad hoc methods) for determining the relative priority of specific threads. Methods that relate a thread's priority to its time constraints outperform those that base priorities solely on subjective measures such as the importance of a thread's role.

Two optimal priority assignment methods are the deadline monotonic (DM) and rate monotonic (RM) algorithms. These assign increasing priority to threads with shorter deadlines and periods, respectively. The optimality of these methods is such that if one can feasibly schedule a system with any static-priority algorithm then one can schedule it using a DM or RM algorithm. Furthermore, if these optimal algorithms cannot feasibly schedule the threads of a particular application, then no static-priority assignment can.
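
For intuition, here is a worked example (with made-up numbers, not drawn from this article) of the schedulability test that underlies classical rate monotonic analysis. For n independent periodic threads with computation times C_i and periods T_i, one sufficient condition for RM-assigned priorities to meet every deadline is:

U = \sum_{i=1}^{n} \frac{C_i}{T_i} \le n\left(2^{1/n} - 1\right)

Three threads with (C, T) pairs of (1ms, 4ms), (2ms, 8ms), and (3ms, 20ms) give U = 0.25 + 0.25 + 0.15 = 0.65, below the three-thread bound of 3(2^{1/3} - 1) ≈ 0.78, so ranking the 4ms thread highest and the 20ms thread lowest yields a feasible schedule.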

Although static-priority preemptive scheduling cannot schedule all possible systems, designers frequently prefer it and commercial RTOSes provide it because it offers greater stability and predictability than dynamic priority scheduling. Static-priority preemptive schedulers are also relatively easy to implement and impose minimal operating system overhead.

Digital control

Digital control systems are ideal applications for static-priority preemptive scheduling. These systems are pervasive, appearing in navigation devices, home appliances, industrial manufacturing, chemical processing, and automobiles.


Figure 1: Sampled-data representation of typical SISO system

Figure 1 diagrams a simple single-input/single-output (SISO) control system. More complex control systems manage multiple plants at multiple rates and are multiple-input/multiple-output (MIMO) controllers.

Using a standard RTOS, a programmer can easily implement a MIMO controller with three parallel closed-loop control threads running at different rates to control three distinct plants. Because they have a well-defined, restricted, strictly periodic thread set, such control systems conform to the assumptions of classical rate monotonic analysis and are ideal applications for static-priority preemptive scheduling.
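
To make that structure concrete, here is a minimal sketch of one such closed-loop thread, assuming a POSIX RTOS, CLOCK_MONOTONIC, and clock_nanosleep(); the plant I/O routines (read_sensor, compute_control, write_actuator) and the thread-argument convention are hypothetical placeholders, not code from any particular controller. Each of the three control threads would run a loop like this with its own period and a rate monotonically assigned static priority.

#include <pthread.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

// Hypothetical plant I/O; in a real controller these would talk to hardware.
extern double read_sensor(void);
extern double compute_control(double measurement);
extern void   write_actuator(double command);

// One strictly periodic closed-loop control thread. The period, in
// nanoseconds, is passed in through the thread argument.
void *control_loop(void *arg)
{
    long period_ns = *(long *)arg;
    struct timespec next;

    clock_gettime(CLOCK_MONOTONIC, &next);          // first release time
    for (;;) {
        double y = read_sensor();                   // sample the plant
        double u = compute_control(y);              // fixed-duration computation
        write_actuator(u);                          // drive the plant

        // advance to the next release and sleep until that absolute time,
        // so the period does not drift with the computation time
        next.tv_nsec += period_ns;
        while (next.tv_nsec >= NSEC_PER_SEC) {
            next.tv_nsec -= NSEC_PER_SEC;
            next.tv_sec++;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return NULL;
}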

However, many practical systems employed in real-time scenarios also experience aperiodic or sporadic external events. Such systems include network and telecommunications equipment, multimedia players, medical devices, and industrial process control systems; or, more generally, any system that connects to an environment with potentially misbehaving agents or agents acting at random intervals. For example, a plant manager can at any unforeseen moment switch an industrial process from production mode to safety mode. The threads that transition the system between these two modes are sporadic threads.

Likewise, many practical applications contain threads with stochastic computation intervals, such as a data processor that performs different computations depending on the information within the data, rather than a single computation that depends only on the quantity of data.

Overload

A system goes into overload when events' interarrival times are shorter than planned or when threads' execution intervals are longer than planned. Under these conditions, the system must discard events or allow one or more threads to miss their deadlines in order to allow more critical threads to complete on time.

Overload results when peripherals and external agents conform to their true nature rather than to designers' idealized assumptions. Hardware devices may fail, generating spurious signals greatly out of line with proper behavior. Errant or malicious code, improperly trained or malicious users, and unusual numbers of external agents can generate excessive event traffic that violates designers' statistical assumptions.

Several practical examples include Mother's Day callers, who are notorious for overloading the phone network; network packet broadcast storms; focused denial of service attacks; large files fed at high speed into serial ports; malfunctioning disk controllers; faulty sensors; electromagnetic disturbances; errant, dynamically loaded device driver code that forgets to clear interrupt conditions; and the stuck active interrupt request signals my team encountered.

The choice of scheduling policy makes a dramatic difference in how the system behaves in overload, and determines whether a designer can predictably characterize the system during overload. For example, the dynamic-priority earliest deadline first (EDF) scheduling policy performs poorly in overload. Because EDF assigns thread priorities on the fly, the designers of the system cannot predict which specific threads will miss their deadlines during overload conditions. Furthermore, an overrunning thread may cause other future threads to miss their deadlines.

On the other hand, a designer using a static-priority policy can perform schedulability analysis to assure that the deadlines of critical threads will be satisfied even in overload. Employing a predictable scheduling policy is especially important for safety-critical systems, which must remain predictable even in overload. For example, an overloaded system that misses a few user interface updates is preferable to one that fails to close control valves.

While poorly designed systems often fail to satisfy the constraints of both critical and less important threads in overload, well-designed systems must selectively process events to assure that the deadlines of critical threads will always be met. POSIX's sporadic scheduling policy makes good design easy.

Sporadic scheduling

IEEE's POSIX specification suite describes a programming interface for portable operating systems. In 1999, the POSIX Working Group introduced IEEE 1003.1d to specify additional real-time extensions for the C language. Section 13.2.4 of that standard adds the Sporadic Scheduler policy to the Round Robin and FIFO policies previously standardized. The POSIX Working Group specified the sporadic scheduling policy precisely for handling aperiodic and sporadic events and threads running within the context of a static-priority preemptive scheduler.

In addition to the single, normal-priority level used for FIFO and Round Robin scheduling, the parameters of a sporadic policy include a second, lower background priority; a replenishment interval; an execution budget; and a maximum number of replenishments allowed within each replenishment interval. Together, the replenishment interval and execution budget define the maximum processor utilization granted to a thread at its “normal” priority.[1]
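
In the C binding, these parameters ride in the same struct sched_param used by the other policies; on systems that support SCHED_SPORADIC it gains the sched_ss_* fields sketched below with placeholder values (Listing 1, later in this article, fills them in for a real driver thread):

#include <sched.h>   // struct sched_param and the SCHED_* policies

struct sched_param param;

param.sched_priority               = 20;       // the thread's "normal" priority
param.sched_ss_low_priority        = 10;       // background priority after budget exhaustion
param.sched_ss_repl_period.tv_sec  = 0;        // replenishment interval...
param.sched_ss_repl_period.tv_nsec = 10000000; //   ...here a placeholder 10ms
param.sched_ss_init_budget.tv_sec  = 0;        // execution budget...
param.sched_ss_init_budget.tv_nsec = 2000000;  //   ...here a placeholder 2ms
param.sched_ss_max_repl            = 4;        // maximum pending replenishments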

If a thread ever requires more processor time than its budget allows, the sporadic scheduling policy dynamically lowers the priority of the thread to its “background” priority. While at either priority, the scheduler treats the thread identically to other threads with the conventional FIFO scheduling policy. For example, priority inheritance and positioning within ready queues remain the same; the important differences are the enforcement of the thread's execution time and the priority reassignment of the thread.

By lowering the thread's priority, the system allows other threads to execute that the sporadic thread would otherwise preempt at its normal priority. Dynamic adjustment of thread priorities is the key capability of the policy that enables programmers to build predictable systems.

A sporadic thread consumes execution time whenever it is running.[2] The scheduler deducts consumed time from the thread's budget each time the thread blocks or the scheduler preempts it.

When a sporadic thread is blocked (but not preempted), the sporadic scheduler allocates a new replenishment event one replenishment period after the most recent unblocking of the thread. The scheduler sets the amount of execution budget that it will grant to the thread at that replenishment to the amount of time the thread ran since it unblocked.

A thread may contend for multiple resources with lower priority threads during a single period and, therefore, might have several replenishment events outstanding. To minimize scheduler overhead, the POSIX specification enables a designer to limit the number of outstanding replenishment events per period. If the maximum is reached, no additional replenishment events are scheduled within the replenishment period.

If and when a sporadic thread consumes its entire execution budget, the scheduler is informed so it can lower the thread's priority to its background level. The thread remains at this priority until a replenishment event occurs, at which point the scheduler promotes the thread back to its normal priority. If the thread is able to run at its background priority because no higher priority thread is ready to run, then the scheduler allows the sporadic thread to run and does not deduct time from its execution budget.

These consumption and replenishment policies assure that within any interval of time equal to the thread's replenishment interval, the thread is allowed to execute only for as long as its computation budget indicates. The scheduler thereby forces the thread's processor utilization to conform to the designers' expectations and to the assumptions of optimal rate monotonic static-priority preemptive scheduling. This utilization enforcement is what assures predictability of the system even when in overload.
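
The bookkeeping behind these rules is compact enough to sketch. The fragment below is only an illustrative model of the behavior just described, with invented names and a single thread's state; it is not the POSIX text or any vendor's scheduler, and it models only the blocking case (a preemption would deduct consumed time the same way without scheduling a replenishment):

#include <stdbool.h>

#define MAX_REPL 8                          // analogue of sched_ss_max_repl

struct repl_event { long when; long amount; };

struct sporadic_state {
    long repl_period;                       // replenishment interval
    long budget;                            // remaining execution budget
    long unblock_time;                      // when the thread last became runnable
    bool at_background;                     // true once the budget is exhausted
    int  num_pending;                       // outstanding replenishment events
    struct repl_event pending[MAX_REPL];
};

// The thread ran for 'ran' time units and then blocked: deduct what it
// consumed and schedule a replenishment of that amount one replenishment
// period after the thread last unblocked.
void on_block(struct sporadic_state *s, long ran)
{
    s->budget -= ran;
    if (s->num_pending < MAX_REPL) {        // honor the replenishment limit
        s->pending[s->num_pending].when   = s->unblock_time + s->repl_period;
        s->pending[s->num_pending].amount = ran;
        s->num_pending++;
    }
    if (s->budget <= 0)
        s->at_background = true;            // demote to the background priority
}

// A scheduled replenishment has come due: restore its amount to the budget
// and, if the budget is positive again, promote the thread back to its
// normal priority.
void on_replenish(struct sporadic_state *s)
{
    if (s->num_pending == 0)
        return;
    s->budget += s->pending[0].amount;      // the earliest-scheduled event fires first
    for (int i = 1; i < s->num_pending; i++)
        s->pending[i - 1] = s->pending[i];  // drop the fired event from the queue
    s->num_pending--;
    if (s->budget > 0)
        s->at_background = false;           // promote back to the normal priority
}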


Figure 2: Consumption and replenishment of execution budgets

To clarify these concepts, Figure 2 shows a sporadically scheduled thread that begins with an execution budget of eight units and a 12-unit replenishment period. Once released, the thread runs for four units before blocking. The scheduler deducts four units from the thread's budget and schedules a replenishment of four units to occur 12 units after the thread started. Unblocked, the thread runs for three units before blocking a second time. The scheduler deducts three units from the budget and schedules a three-unit replenishment to occur 12 units after the thread unblocked. Unblocked the second time, the thread executes for one unit, whereupon the scheduler detects the exhaustion of the execution budget and downgrades the thread's priority. The thread is free to run at the background priority if no other higher priority threads preempt it.

Note that the resolution of the operating system's timer tick limits the accuracy of the sporadic scheduler. A low-resolution 10ms tick wouldn't yield good results for threads with periods and budgets measured in milliseconds.
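
If you want to check this at run time, POSIX provides clock_getres() for querying the clock granularity. The guard below is a simple illustrative sanity check; the ten-ticks-per-budget ratio is an arbitrary rule of thumb, not something taken from the standard or from this article's driver:

#include <stdio.h>
#include <time.h>

// Warn if the system clock granularity is coarse relative to the sporadic
// budget we intend to enforce.
int check_tick_resolution(long budget_ns)
{
    struct timespec res;

    if (clock_getres(CLOCK_REALTIME, &res) != 0)
        return -1;                                   // clock not available

    long tick_ns = res.tv_sec * 1000000000L + res.tv_nsec;
    if (tick_ns * 10 > budget_ns)                    // want roughly 10+ ticks per budget
        fprintf(stderr, "warning: %ld ns tick is coarse for a %ld ns budget\n",
                tick_ns, budget_ns);
    return 0;
}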

An example

Let's see how we can put a sporadic scheduling policy into practice by examining the reception thread of a network interface controller's (NIC) device driver.

Applications communicate over packet networks via device drivers attached to network interface controllers (NICs). These controllers implement the physical and data link layers of the physical medium's communication protocol to provide node-to-node connections without errors, to arbitrate for the medium, and to handle the physical and electrical specifications necessary to transmit data across the medium. A network device driver contains at least one thread for receiving packets as part of interrupt service, and another for transmitting packets.

Designers make several assumptions when incorporating a multi-node communications medium into their designs, including expectations about the utilization of the medium, the bandwidth of the medium, the average and worst-case packet sizes, and the expected interval between packets destined for the connecting system. With these assumptions, the designers determine an appropriate period, computation interval, and priority for the packet reception thread(s).

My system needed to connect to a 100Mbps network, would receive constant-sized (64-byte) packets, required 25µs of processing per packet, and expected to receive packets once every 64µs. I selected an NIC that interrupted the host processor for each packet received, yet also included a 32-entry ring buffer. I planned to protect against packet drops by never letting the ring buffer become more than 50% full. Processing all packets within the time constraints was a goal, but never overloading the system was the requirement. To ensure this, I allowed the system to drop packets if absolutely necessary.

Table 1: Packet reception task utilization parameters

Parameter: Value
NIC interrupt request interarrival: 1,024µs
Thread computation requirement: 400µs
Effective maximum utilization: 39%

Given the requirements in Table 1, I created a periodic thread to handle packet reception and assigned it a rate monotonically derived priority based on its 1,024µs period. To keep the ring buffer utilization under 50%, the period is 16×64µs and the computation interval is 16×25µs. (I also used an operating system that uses extremely short interrupt exception handlers and performs interrupt service handling within a prioritized thread.)

I realized that if the incoming rate exceeded 15,625 packets per second, the system would go into overload. To protect the system against this, I switched the reception thread's default FIFO policy to a sporadic scheduling policy, restricting its high-priority utilization of the processor to the desired worst-case amount, 39%. I accomplished this by initializing the fields of a POSIX scheduling parameters structure with my expected period and execution budget and passing them, via a thread attributes object, to POSIX's thread creation call.
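
Spelled out, the arithmetic behind those figures is:

T = 16 \times 64\,\mu s = 1{,}024\,\mu s, \qquad C = 16 \times 25\,\mu s = 400\,\mu s

U = \frac{C}{T} = \frac{400}{1024} \approx 39\%, \qquad \text{overload threshold} = \frac{1}{64\,\mu s} = 15{,}625\ \text{packets/s}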

The sporadic scheduling policy ensured that the system could withstand abnormal packet bursts and would drop excessive incoming traffic, but would still satisfy the time constraints of critical threads lower in priority than the NIC reception handler.

Listing 1: Sporadic scheduling example

#include <pthread.h> // pthread_attr_*, pthread_create
#include <errno.h>   // EOK (QNX)
#include <time.h>    // struct timespec

#define NIC_SPORADIC_BKGD_PRIORITY 5 // just above dirt
#define NIC_SPORADIC_PRIORITY 21 // above most other I/O
#define NIC_SPORADIC_MAXREPLENISHMENTS 3 // blocked only 3 times at most
#define NIC_SPORADIC_PERIOD 1024000 // 1,024 microsecs, expressed in nanoseconds
#define NIC_SPORADIC_BUDGET 400000 // 400 microsecs, expressed in nanoseconds

pthread_attr_t pattr; // thread attributes object
struct sched_param param; // scheduling parameters, including the sched_ss_* fields

// initialize an attributes structure and configure it for sporadic scheduling
if (pthread_attr_init(&pattr) != EOK) {…}

// override the default of inheriting policies from our parent
if (pthread_attr_setinheritsched(&pattr, PTHREAD_EXPLICIT_SCHED) != EOK) {…}

// configure the normal and background priorities, the budget interval, and
// the maximum number of replenishments
param.sched_ss_low_priority = NIC_SPORADIC_BKGD_PRIORITY;
param.sched_priority = NIC_SPORADIC_PRIORITY;
param.sched_ss_max_repl = NIC_SPORADIC_MAXREPLENISHMENTS;
// nsec2timespec() is a QNX helper that converts nanoseconds to a struct timespec
nsec2timespec(&param.sched_ss_repl_period, NIC_SPORADIC_PERIOD);
nsec2timespec(&param.sched_ss_init_budget, NIC_SPORADIC_BUDGET);

if (pthread_attr_setschedparam(&pattr, &param) != EOK) {…}

// now set the scheduling policy to sporadic
if (pthread_attr_setschedpolicy(&pattr, SCHED_SPORADIC) != EOK) {…}

// create the thread/task and have it explicitly use sporadic parameters
if (pthread_create(&devData->threadId, &pattr, rtl_EventHandler, devData) != EOK) {…}

Listing 1 presents the POSIX C code to configure a sporadic scheduling policy for the packet reception thread of the network driver. A complete implementation of a Realtek NIC device driver for the QNX operating system is available for download from ftp://ftp.embedded.com/pub/2002/12vanzandt.

Predictably

Sporadic scheduling manages aperiodic and sporadic events and threads by wrapping them within a periodic framework. This is an effective technique for predictably handling overload scenarios on static-priority preemptive schedulers. Using sporadic scheduling, threads that would otherwise introduce undesirable jitter or deny the processor to lower-priority threads during overload can be safely included. With sporadic scheduling, we can predictably analyze very unpredictable systems.

If you are interested in incorporating sporadic scheduling into your applications, you can find implementations of the policy within the Ada95 programming language, the QNX RTOS, Java Virtual Machines compliant with the Real-Time Specification for Java (RTSJ), and several real-time variants of the Linux kernel. Note, however, that not all implementations are fully compliant with the POSIX 1003.1d specification. For example, RTSJ's sporadic scheduling policy neither uses the C bindings nor implements replenishments. Ada95's implementation uses POSIX's 1003.5 Ada bindings.

Lonnie VanZandt is a consultant. Prior to striking out on his own, he was an engineer at QNX Software Systems, TimeSys, and Bell Laboratories. He has developed rate monotonic scheduling code for the Linux kernel. Lonnie has MS and BS degrees in computer engineering from the University of Illinois.

References

IEEE Std 1003.1d-1999, "IEEE Standard for Information Technology - Portable Operating System Interface (POSIX) - Part 1: System Application Program Interface (API) - Amendment d: Additional Realtime Extensions [C Language]."

http://standards.ieee.org/catalog/olis/posix.html

Endnotes

1. Typically, you use rate monotonic analysis to determine a thread's normal priority from the periodicity of a task relative to all other tasks in the system, assigning higher priorities to higher frequency tasks.

2. A uniprocessor has only a single running thread at any instant but a symmetric multiprocessor might have several simultaneous running threads.
