Improve CPU Utilization with DEOS Slack RTOS Scheduling

This “Product How-To” article focuses on how to use a certain product in an embedded system and is written by a company representative.

Safety and mission-critical systems have key software activities that require guaranteed, deterministic access to the CPU, even in the face of misbehaving lower criticality software. Time partitioning provides such guarantees, but they can come at the cost of inefficient CPU utilization.

The Deos real-time operating system with slack scheduling satisfies mission-critical requirements while providing developers with simplified access to the CPU's full performance.

A time-partitioned real-time operating system that conforms to mission-critical specification ARINC-653 will guarantee that a specific computation will have access to the CPU for a specific amount of time (i.e., budget) at a bounded, deterministic location within the scheduling timeline's major frame (i.e., the hyper-period).

This guarantee is a key enabler towards the development of highly integrated systems that allow software of varying degrees of criticality to coexist on the same platform.

By assigning computational tasks (processes) to run at specific times within the hyper-period, and budgeting enough time for worst case execution of each process, developers can develop a timeline that ensures high criticality applications will have the CPU time and deterministic operation they require. The RTOS terminates any lower criticality applications that exceed their budget to prevent any interference.

This time-partitioning approach works well for periodic computations that have a small deviation between their worst case execution and their nominal case execution. The timing guarantees come at a price, however.

By setting aside worst case execution times for each process, the scheduler ensures worst case system performance, every time. In other words, from a CPU bandwidth perspective, it is as if every time the hyper-period is executed, every computation always experiences its worst case execution.

The approach thus leads to the all-too-common occurrence that the system runs out of CPU budget time (i.e., no more time available to allocate in the hyper-period), while profiling shows actual average CPU utilization of 50% (or less in some scenarios).

The situation arises because computations rarely experience their worst case executions. The occurrence of multiple computations experiencing their worst case executions during the same period is even more improbable. Thus, for the vast majority of executions, unused CPU budget shows up as useless CPU idle time (Figure 1 below).

Figure 1. ARINC-653 Time partitioning often yields unused CPU time.
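The gap between budgeted and actual utilization can be illustrated with simple arithmetic. The sketch below assumes a hypothetical 100 ms hyper-period and three invented threads; the names and numbers are illustrative only and are not taken from any real system.

```python
# Hypothetical thread set for a 100 ms hyper-period (illustrative numbers only).
# Each entry: (name, worst-case budget in ms, typical execution time in ms)
HYPER_PERIOD_MS = 100
threads = [
    ("nav", 40, 20),
    ("display", 35, 15),
    ("comms", 25, 12),
]

budgeted_ms = sum(budget for _, budget, _ in threads)    # time reserved on the timeline
typical_ms = sum(typical for _, _, typical in threads)   # time actually used in a typical pass
idle_ms = budgeted_ms - typical_ms                       # reserved but unused

print(f"budgeted: {budgeted_ms} ms of {HYPER_PERIOD_MS} ms")
print(f"typical:  {typical_ms} ms of {HYPER_PERIOD_MS} ms")
print(f"idle:     {idle_ms} ms per hyper-period")
```

In this made-up case the timeline is fully allocated, so no new thread can be added, yet more than half of each hyper-period is typically spent idle.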

Some common scenarios further exacerbate the problem. First, the system must perform aperiodic activities, such as servicing interrupts or handling aperiodic client-server exchanges. Second, the CPU budget required to guarantee safe execution is insufficient to meet performance needs. Third, significant deviation exists between nominal case and worst case execution times.

The need to poll for the occurrence of an interrupt, for example, increases unused CPU idle time because of the requirement to budget worst-case time for interrupt handling within each hyper-period, even if the expected interrupt rate is low.

A byproduct of this scheduling can be significant latency in interrupt response. If the interrupt occurs after the handler's scheduled time within the hyper-period, the system cannot respond until the next scheduled execution of the handler. Thus, latency can be as great as a full hyper-period.
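A small model makes the latency bound concrete. The sketch below assumes the handler polls once per hyper-period at a fixed offset and that an interrupt arriving exactly at that offset is seen by that poll; the function name and the 40 ms and 10 ms figures are hypothetical.

```python
def polling_latency(arrival_ms, handler_offset_ms, hyper_period_ms):
    """Time from interrupt arrival until the next scheduled handler slot.

    Assumes the handler starts at handler_offset_ms in every hyper-period.
    """
    phase = arrival_ms % hyper_period_ms                # where in the hyper-period we are
    return (handler_offset_ms - phase) % hyper_period_ms

# Hypothetical 40 ms hyper-period with the handler slot at 10 ms.
print(polling_latency(9, 10, 40))    # arrives just before the slot
print(polling_latency(11, 10, 40))   # arrives just after the slot
```

An interrupt that arrives 1 ms before the slot waits 1 ms, while one that arrives 1 ms after it waits 39 ms, which is nearly the full hyper-period.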

DDC-I's Deos rate monotonic scheduling (RMS) RTOS with patented slack scheduling technology provides an alternative approach to the fixed time partitioning of ARINC-653 that guarantees time and scheduling of processes (called threads in Deos) while also enabling designs to utilize up to 100% of the CPU's processing potential.

One key is allowing the RTOS to schedule thread execution, subject to any needed constraints, at run time rather than having the schedule fixed at software integration time as in ARINC-653. The other key is recovering CPU slack time for use by threads as needed.

What is slack and how do you use it?
When considering slack, it is often helpful to think of a bank account for time into which deposits are made. Threads (i.e., computations) that need more time than they have been budgeted can withdraw from this slack account until the account balance reaches zero. Where do these deposits come from (i.e., what are the sources of slack time)? There are two sources:

1) Budgeted CPU time that goes unused during a thread's execution; and
2) Unallocated CPU time (i.e., when adding up the total budgeted CPU time, the sum for the hyper-period is less than 100%).

Thus, at the beginning of each hyper-period the slack account has a balance equal to the total unallocated CPU time. As threads execute and complete early (with respect to their worst case budget) they donate their remaining unused budgeted time to the slack account (i.e., they make implicit deposits).

Conversely, as threads execute and wish to use slack time, they make explicit withdrawals. The Deos scheduler manages the deposits and withdrawals to ensure time partitioning and system schedule-ability.
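The bank account analogy maps naturally onto a small data structure. The following is a minimal sketch of the bookkeeping only, not the Deos scheduler: the class and method names are invented for illustration, and the model ignores the passage of time and thread rates.

```python
class SlackAccount:
    """Toy model of the slack 'bank account' for one hyper-period."""

    def __init__(self, hyper_period_ms, budgets_ms):
        unallocated = hyper_period_ms - sum(budgets_ms)
        if unallocated < 0:
            raise ValueError("budgets exceed the hyper-period")
        # Opening balance: time that no thread has been budgeted.
        self.balance_ms = unallocated

    def deposit(self, budget_ms, used_ms):
        """Implicit deposit: a thread finished early and donates the remainder."""
        unused = max(budget_ms - used_ms, 0)
        self.balance_ms += unused
        return unused

    def withdraw(self, requested_ms):
        """Explicit withdrawal: never grants more than the current balance."""
        granted = min(requested_ms, self.balance_ms)
        self.balance_ms -= granted
        return granted


account = SlackAccount(hyper_period_ms=100, budgets_ms=[30, 40, 10])
opening = account.balance_ms                           # unallocated time
donated = account.deposit(budget_ms=30, used_ms=18)    # thread completes early
granted = account.withdraw(requested_ms=50)            # requester asks for more than exists
print(opening, donated, granted, account.balance_ms)
```

Because a withdrawal can never exceed the balance, the time handed out as slack is always time that was either unallocated or already given back.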

How does one use slack? Developers explicitly identify threads, at design time, as slack requesters. All threads participate in generating slack (i.e., making deposits of time into the account), but only slack requesters are allowed to consume slack (i.e., make withdrawals from the slack account). A slack requester must first use all of its budgeted CPU time. (In Deos, each thread must have at least enough fixed budget for one context switch.)

Once a slack requester has depleted all of its budgeted CPU time, Deos will give that thread immediate access to all available slack time (i.e., all of the time in the slack account). The thread can use all, or a portion, of the available slack time (at its discretion).

If the slack requester uses all of the available slack time, and a subsequent thread generates slack (i.e., deposits into the slack account), the slack requester could be scheduled again and given access to the recently generated slack time.

To guarantee high-criticality threads their access to the CPU, the scheduler allows developers to specify a thread's budget, and creates interrupt timers that can force a thread to stop execution if it has exceeded its time budget. Further, Deos always schedules execution of the highest priority thread that is ready-to-run.

If, for example, the high-priority thread Fast (under rate monotonic analysis, or RMA, a higher rate means a higher priority) must execute each minor frame, Deos will schedule that thread to execute at the beginning of the minor frame and allow it to run to the end of its budget (Figure 2 below). The low-priority thread Slow, which only needs execution once per hyper-period, is scheduled only after all higher-priority threads are finished for their time frame.

Note that in this example, the thread Medium is a slack requester and was therefore given access to the slack account once it consumed its budget.

Figure 2. Slack scheduling allows threads to consume more than their budgeted time if needed, while still ensuring that high-priority threads execute first and fully.

This use of thread rate-monotonic scheduling and Deos' patented slack scheduling technology ensures that high priority threads always execute fully during a schedule period and that slack will be consumed by the highest priority slack requester first.
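The interplay of budgets, priorities, and slack requesters can be sketched as a simplified single-period simulation. This is an illustrative model, not the Deos algorithm: it treats all threads as running once in one period, and the budgets and demands assigned to Fast, Medium, and Slow are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    budget_ms: int                 # fixed, guaranteed budget
    demand_ms: int                 # time the thread would like this period
    slack_requester: bool = False
    used_ms: int = 0


def run_period(threads, period_ms):
    """Run threads in list order (highest priority first) for one period."""
    slack_ms = period_ms - sum(t.budget_ms for t in threads)  # unallocated time
    if slack_ms < 0:
        raise ValueError("budgets exceed the period")
    log = []

    def use_slack(t):
        nonlocal slack_ms
        extra = min(t.demand_ms - t.used_ms, slack_ms)
        if t.slack_requester and extra > 0:
            slack_ms -= extra                      # explicit withdrawal
            t.used_ms += extra
            log.append((t.name, "slack", extra))

    # Pass 1: each thread runs on its own budget, then (if a requester) on slack.
    for t in threads:
        ran = min(t.demand_ms, t.budget_ms)
        t.used_ms += ran
        slack_ms += t.budget_ms - ran              # implicit deposit of unused budget
        log.append((t.name, "budget", ran))
        use_slack(t)

    # Pass 2: requesters still short of time are rescheduled, highest priority
    # first, to use slack deposited by threads that ran after them.
    for t in threads:
        use_slack(t)

    return log, slack_ms


threads = [
    Thread("Fast", budget_ms=5, demand_ms=3),
    Thread("Medium", budget_ms=5, demand_ms=12, slack_requester=True),
    Thread("Slow", budget_ms=6, demand_ms=2),
]
log, idle_ms = run_period(threads, period_ms=20)
for entry in log:
    print(entry)
print("idle:", idle_ms)
```

In this run Medium first exhausts its 5 ms budget, takes the 6 ms of slack available at that point, and is later rescheduled to use 1 ms of the slack that Slow deposits by finishing early.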

In this way, slack is a form of load shedding. With Deos, therefore, all available CPU time can be utilized because threads with nominal execution times less than their worst case execution times will be generating time to be used by slack requesting threads. No unused time is lost to idle.

The ability to utilize slack gives designers a variety of opportunities for enhancing system performance as well as increasing CPU utilization. One of the most common uses of slack, for instance, is to remove the lowest criticality applications from the fixed-budget time line. In other words, run your low criticality applications purely on slack so that you have more fixed time budget available for higher criticality applications.

The Deos development environment provides a classic example of the benefit of this approach. All of the development environment's Ethernet based applications (e.g., the network stack, FTP server, Telnet server, etc.) execute purely on slack, as does the network's aperiodic interrupt service routine.

Before slack scheduling, these applications demanded over 50% of fixed budgeted CPU time in order to achieve the customer's expected performance. With the advent of slack, fixed budgeted CPU time for these applications dropped 80%, while yielding a 300% increase in performance; definitely a win/win.

Deos' slack scheduling technology provides another key advantage that is particularly useful in the client-server arena: the ability for threads to execute multiple times within the same period. (This capability is also possible when using thread budget transfer.)

This nuance of slack scheduling allows a client thread and its server thread to exchange data, perhaps multiple times, back-to-back, within the same period, in order to complete a transaction. By contrast, in other time partitioning schemes clients must wait for their server thread to be scheduled.

When a transaction takes multiple interactions, the delay can be significant. To reduce that delay, non-Deos users are forced to 'play scheduler' and hand-craft a hyper-period timeline that balances the needs of time critical applications with the needs of (perhaps multiple) less time-critical client-server applications. Deos users can use slack and let the Deos scheduler take care of this balance for them.

Another common use of slack is the ability to budget in order to meet a safety requirement, but enable slack in order to get the most out of a processor. For example, let's say you have a safety requirement that indicates your display should update at least ten times a second. While this update rate is deemed safe, it falls short of meeting customer expectations.

In this case, you could choose to budget for your 10 Hz update rate (i.e., ensure you have enough fixed budgeted CPU time to meet your 10 Hz rate) while enabling slack. Instead of your display performing in worst case mode all the time, it is guaranteed to meet its minimum safety requirements but will perform in its best case mode as often as possible (i.e., you'll get all the performance your hardware has to offer).
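As a rough illustration of the display example, the sketch below separates the update rate guaranteed by fixed budget from the rate achievable when slack is available. The render time and slack figures are hypothetical, and the model assumes each extra update needs a full render's worth of slack.

```python
PERIOD_MS = 100  # 10 Hz safety requirement: one guaranteed update per period


def updates_per_second(render_ms, slack_ms_per_period):
    """Return (guaranteed, achievable) display updates per second."""
    periods_per_second = 1000 // PERIOD_MS
    guaranteed = periods_per_second                       # paid for by fixed budget
    extra_per_period = slack_ms_per_period // render_ms   # paid for by slack
    return guaranteed, guaranteed + periods_per_second * extra_per_period


print(updates_per_second(render_ms=8, slack_ms_per_period=20))  # slack available
print(updates_per_second(render_ms=8, slack_ms_per_period=0))   # worst case, no slack
```

With no slack at all the display still meets its 10 Hz floor; any slack that does appear only raises the rate above that floor.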

Additionally, slack can allow you to address your software requirements in a way commonly used before time partitioned operating systems: assign a required activity to a low priority thread, and guarantee that it is accomplished by monitoring its progress from a high priority thread.

For example, one could meet their continuous built-in test (CBIT) requirement by assigning that activity to a low priority, pure slack thread and then monitoring the CBIT thread's adequate completion rate from the high priority, fixed CPU budget thread.

Once again, the Deos scheduler helps spread the CPU load across the timeline (vs. the user/designer having to 'play scheduler') and the high priority, fixed budget thread guarantees that the activity is occurring per the specification.
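The monitoring pattern can be sketched as a small completion-rate check. This is an assumed design for illustration, not a Deos API: the class name, the window size, and the required pass count are all hypothetical.

```python
from collections import deque


class CbitMonitor:
    """Checked once per period by a high priority, fixed budget thread."""

    def __init__(self, min_passes, window_periods):
        self.min_passes = min_passes
        self.history = deque(maxlen=window_periods)  # passes completed per period
        self.passes_this_period = 0

    def record_pass(self):
        """Called by the slack-only CBIT thread each time a full test pass ends."""
        self.passes_this_period += 1

    def end_of_period(self):
        """Return True while the completion rate meets the requirement."""
        self.history.append(self.passes_this_period)
        self.passes_this_period = 0
        if len(self.history) < self.history.maxlen:
            return True  # window not yet full, so it is too early to judge
        return sum(self.history) >= self.min_passes


# Require at least one CBIT pass in any three consecutive periods.
monitor = CbitMonitor(min_passes=1, window_periods=3)
monitor.record_pass()                                   # CBIT got enough slack in period 1
results = [monitor.end_of_period() for _ in range(4)]   # then it is starved of slack
print(results)
```

The CBIT thread costs no fixed budget, while the monitor needs only a few instructions of guaranteed time per period to detect that the slack thread has fallen behind.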

In addition to slack, Rate-Monotonic Deos scheduling can simplify software maintenance and upgrades. With time-partitioning, the introduction of a new thread may be difficult if there is no room available on the timeline.

For example, as shown in Figure 3 below, the introduction of new task D with a 3 msec budget and a 10 msec execution period into a 40 msec hyper-period is not possible without a complete restructuring of the timeline, because one of the 10 msec minor frames does not have enough room even though the hyper-period has plenty of slack elsewhere. With Deos scheduling, the RTOS will automatically utilize the available time.

Figure 3. ARINC-653 can make it impossible to add new tasks without manually revising the timeline, while Deos scheduling automatically accommodates the insertion.
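The difference between the two admission checks can be sketched as follows. The existing threads A, B, and C, their budgets, and the hand-built timeline are hypothetical stand-ins, since the figure's actual values are not reproduced here. The utilization test assumes harmonic periods, for which rate monotonic scheduling is known to be feasible up to 100% utilization.

```python
from fractions import Fraction

MINOR_FRAME_MS = 10

# Hypothetical existing threads: name -> (budget_ms, period_ms)
existing = {"A": (4, 10), "B": (3, 20), "C": (4, 40)}

# Hypothetical hand-built timeline: threads placed in each 10 ms minor frame.
timeline = [["A", "B"], ["A", "C"], ["A", "B"], ["A"]]


def free_per_frame(timeline, threads):
    """Unreserved time left in each minor frame of the fixed timeline."""
    return [MINOR_FRAME_MS - sum(threads[name][0] for name in frame)
            for frame in timeline]


def fits_fixed_timeline(timeline, threads, new_budget_ms):
    """A thread that runs every minor frame needs room in every frame as laid out."""
    return all(free >= new_budget_ms for free in free_per_frame(timeline, threads))


def fits_by_utilization(threads, new_budget_ms, new_period_ms):
    """Rate monotonic admission test, valid under the harmonic-period assumption."""
    total = sum(Fraction(budget, period) for budget, period in threads.values())
    return total + Fraction(new_budget_ms, new_period_ms) <= 1


print(free_per_frame(timeline, existing))
print(fits_fixed_timeline(timeline, existing, new_budget_ms=3))
print(fits_by_utilization(existing, new_budget_ms=3, new_period_ms=10))
```

Here the second minor frame has only 2 ms free, so task D cannot be dropped into the fixed timeline, yet total utilization with D added is 95%, so a run-time scheduler that may preempt and spread the slower threads can still accommodate it.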

A secondary maintenance benefit of Deos scheduling is that developers must explicitly define the rate and budget requirements a thread must have so that Deos will schedule it properly.

With ARINC-653 time-partitioning, developers make their scheduling decisions based on those requirements, but once the design is complete the understanding of those requirements may be lost. Developers seeking to later modify existing code may not be able to determine a rationale (if any) behind the scheduling choices.

Deos and its patented slack scheduling technology thus enable software designers to easily leverage all the power of today's modern processors, without sacrificing the safety of space & time partitioning. The use of budgets and rates allows the RTOS to enforce required timing at the thread level while still utilizing available slack time, simplifying the designer's task.

By giving software designers the capability to factor timing computations into slack scheduling and/or the high criticality fixed budget timeline, Deos also allows creative use of CPU time to maximize system performance without risk.

Bill Cronk is the Deos product line manager at DDC-I. Prior to joining DDC-I, Bill was a software engineer at Kutta Technologies. He was also a technical manager at Honeywell Aerospace, where he had a key role in the development, deployment, and certification of Deos on numerous airframes. Bill holds a Bachelor of Science in Computer Science from Grand Canyon University.
