As discussed in Part 1 and Part 2 of this series, the major implication of the extra context switches introduced by the use of unique priorities is the additional RTOS overhead those context switches cause. To measure that extra overhead, we measure the clock count differences between the cycle-boundary relinquish events (Figure 10, below).
|Figure 10 – Cycle Start and Cycle End event time stamps show elapsed time for a complete cycle. Case-1 (Equal Priorities) is shown on the left, while Case-2 (Unique Priorities) is shown on the right.|
Each event has a unique time-stamp taken from the system clock. Subtracting one “RO” time stamp from the next yields the elapsed time for the cycle. As a result of the additional context switches and preemptions caused by the use of unique priorities for each thread in Case-2, the application suffers increased overhead.
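The time-stamp subtraction described above can be sketched in a few lines. This is only an illustration: the function name `cycle_times` and the clock counts are hypothetical and are not taken from the trace in Figure 10.

```python
def cycle_times(timestamps):
    """Return the elapsed clock counts between consecutive cycle-boundary events."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


# Hypothetical clock counts at three consecutive cycle-boundary events.
boundary_stamps = [1000, 1450, 1900]
print(cycle_times(boundary_stamps))  # [450, 450]
```

Comparing the per-cycle values from a Case-1 trace with those from a Case-2 trace gives the overhead difference directly.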
In this timed experiment, we use an RTOS with a 0.35 µs context switch, but in practice, the additional context switch operations can add from 50 to 500 CPU cycles per context switch, depending on the RTOS used. Figure 11 below shows the results that were observed:
|Figure 11 – Timing measurements show a significant increase in overhead with unique priorities|
This experiment shows that fourteen additional context switches were performed in Case-2, and total processing time increased by more than 80%, while the exact same number of messages were sent and received.
If context switch operations were not as fast as the 0.35 µs in this example, the impact on total processing time would be even greater. The results here show more than an 80% increase in RTOS overhead as a result of using unique priorities for each application thread.
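To get a feel for how the cost of a single context switch scales with the fourteen extra switches, the arithmetic can be sketched as follows. The function names are hypothetical, and the calculation counts only the switch operations themselves, not any other effects of preemption that the measured totals may include.

```python
EXTRA_SWITCHES = 14  # additional context switches observed in Case-2


def added_time_us(switches, switch_cost_us):
    """Time spent purely in the extra context switches, in microseconds."""
    return switches * switch_cost_us


def added_cycles(switches, cycles_per_switch):
    """CPU cycles spent purely in the extra context switches."""
    return switches * cycles_per_switch


# With the 0.35 microsecond switch used in this experiment:
print(round(added_time_us(EXTRA_SWITCHES, 0.35), 2))  # 4.9

# With the 50 to 500 cycle range quoted for RTOSes in general:
print(added_cycles(EXTRA_SWITCHES, 50), added_cycles(EXTRA_SWITCHES, 500))  # 700 7000
```

A slower context switch raises the cost of every one of the fourteen extra switches, which is why the penalty grows with a less efficient RTOS.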
The use of unique priorities might also make system performance unpredictable. Loss of predictability occurs because the context switch overhead varies with the sequence of thread activation, rather than following a prescribed round-robin order, as in Case-1.
The developer can eliminate this aspect of unpredictability by assigning multiple threads the same priority, because there will then always be a consistent number of context switches to perform a given amount of work. If distinct priorities are assigned to each thread, then the application developer needs to be acutely aware of the potential for variance in system performance and responsiveness.
When to Use Unique Priorities
While use of unique priorities might result in more context switches than running multiple threads at the same priority, in some instances it is the appropriate thing to do. For example, if latency mattered more than throughput in the previous example, we would want Thread A to run as soon as a message arrives in its queue, rather than waiting for its round-robin turn.
To make sure that happened, we'd make Thread A higher in priority than Thread D, and likewise with Threads B and C. We would achieve lower latency, but at the expense of cutting our throughput from 8.7 messages per 1000 tics to 4.8 messages per 1000 tics, as the table in Figure 12 below shows:
|Figure 12 – Table showing tradeoff between Message Processing Latency and Message Throughput. Unique priority assignment reduces latency but also reduces throughput.|
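The size of the throughput penalty can be worked out from the two figures quoted above. The sketch below uses only those two numbers (8.7 and 4.8 messages per 1000 tics); the function and variable names are hypothetical.

```python
def tics_per_message(messages_per_1000_tics):
    """Average number of tics needed to process one message."""
    return 1000 / messages_per_1000_tics


equal_priority = 8.7   # messages per 1000 tics, Case-1
unique_priority = 4.8  # messages per 1000 tics, Case-2

# Fraction of throughput given up by moving to unique priorities.
throughput_drop = 1 - unique_priority / equal_priority

# Relative increase in processing time per message.
time_increase = tics_per_message(unique_priority) / tics_per_message(equal_priority) - 1

print(round(throughput_drop, 2))  # about 0.45
print(round(time_increase, 2))    # about 0.81
```

The roughly 81% increase in time per message appears consistent with the "more than 80%" increase in total processing time reported earlier.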
Priority Inheritance and Time-Slicing
Developers can best deal with the somewhat uncertain context switch overhead caused by thread priority selection by keeping as many threads as possible at the same priority level. In other words, use different priority levels only when latency outweighs throughput and preemption is absolutely required, and never in any other case.
Furthermore, running multiple threads at the same priority makes it possible to properly address other system requirements such as priority inheritance, round-robin scheduling, and time-slicing. Each of these mechanisms is important in a real-time system, and each can be used to keep system overhead low and, perhaps more importantly, to keep system behavior understandable.
Priority Inheritance is a mechanism employed by an RTOS to prevent a high-priority thread from being stalled indefinitely in a case of priority inversion. Priority inversion occurs when a low-priority thread owns a resource needed by a higher-priority thread, but the low-priority thread gets preempted by a thread with a priority between the two.
Thus, the low-priority thread cannot run, since it has been preempted by a higher-priority thread, and the resource it holds cannot be released. The high-priority thread ends up waiting for the low-priority thread to release the resource, effectively deadlocking the high-priority thread.
Priority inheritance allows the low-priority thread to temporarily assume the priority of the high-priority thread waiting for the resource. This allows the low-priority thread to run to the point where it can release the resource, and then the high-priority thread can get it and run. If all threads had to have a unique priority, the low-priority thread could not assume the priority of the waiting thread, since that priority is already in use.
Likewise, with a limit on the number of threads at a given priority, it will be impossible for the low-priority thread to be raised to the level of the high-priority thread if the maximum number of threads is already at that priority. If the limit of threads at a priority is 1, then priority inheritance is never possible, and some other solution to priority inversion must be found.
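The effect of priority inheritance can be illustrated with a toy, tick-based simulation. This is a simplified sketch rather than any RTOS's actual scheduler: the thread names, the work amounts, and the convention that a lower number means a higher priority are all assumptions made for the example.

```python
def ticks_until_high_gets_mutex(inherit):
    """Simulate a priority inversion and return the tick at which the
    low-priority thread releases the mutex the high-priority thread needs."""
    # Assumed convention: a lower number means a higher priority.
    priority = {"low": 3, "med": 2, "high": 1}
    # Remaining work in ticks: "low" still needs 2 ticks inside its critical
    # section; "med" has 5 ticks of unrelated work. "high" is blocked on the
    # mutex from tick 0, so it never appears in the ready list.
    work = {"low": 2, "med": 5}

    effective = dict(priority)
    if inherit:
        # "low" temporarily assumes the priority of the waiting "high" thread.
        effective["low"] = priority["high"]

    tick = 0
    while work["low"] > 0:
        ready = [name for name in ("low", "med") if work[name] > 0]
        running = min(ready, key=lambda name: effective[name])
        work[running] -= 1
        tick += 1
    return tick  # the mutex is released here, so "high" can finally run


print(ticks_until_high_gets_mutex(inherit=False))  # 7
print(ticks_until_high_gets_mutex(inherit=True))   # 2
```

Without inheritance, the medium-priority thread's 5 ticks of unrelated work delay the high-priority thread. With inheritance, the wait shrinks to the length of the low-priority thread's critical section. The boost also puts two threads at the same priority level, which is exactly what a unique-priority rule would forbid.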
Round-Robin Scheduling is a method used by an RTOS to run threads in a circular fashion, letting each thread run until it becomes "blocked," or relinquishes its turn. In other words, none of the threads is preempted by a higher-priority thread.
Round-robin scheduling allows multiple equally important activities to run at the same priority, while still retaining individual encapsulation. This approach is used in our Case-1 above, where all four threads have a priority of 4, and run in sequence without preemption from higher-priority threads.
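The circular ordering can be sketched with a simple ready list. This is a minimal model, assuming every thread is always ready again after it relinquishes; the function name `round_robin` is hypothetical and does not correspond to any RTOS API.

```python
from collections import deque


def round_robin(threads, turns):
    """Return the order in which equal-priority threads run over `turns` turns."""
    ready = deque(threads)
    order = []
    for _ in range(turns):
        current = ready.popleft()  # the thread at the head of the ready list runs next
        order.append(current)      # it runs until it blocks or relinquishes
        ready.append(current)      # then it rejoins at the tail of the list
    return order


print(round_robin(["A", "B", "C", "D"], 6))  # ['A', 'B', 'C', 'D', 'A', 'B']
```

Because the order depends only on list position, the number of context switches per cycle is fixed, which is the source of the predictability noted for Case-1.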
Time Slicing is an RTOS scheduling method that distributes CPU cycles to multiple threads in a weighted manner. Most commonly, it is used to give threads at the same priority level a certain number of CPU cycles, rotating from thread to thread and then back to the beginning of the group continuously as long as those threads remain active. It also can be used to allocate percentages of CPU time to each thread.
For example, time slicing could give Thread A 25%, Thread B 10%, Thread C 10%, and Thread D 55% of the CPU cycles while that priority is active. This is generally achieved by dividing an arbitrary number of CPU cycles (a "block") into proportional parts.
In this example, a block of 1000 cycles might be used (e.g., 5 µs on a 200 MHz system) where Thread A executes for 250 cycles, Thread B for 100 cycles, Thread C for 100 cycles, and Thread D for 550 cycles, then returning to Thread A for another 250 cycles and so on. This allocation enables system designers to provide more time for threads they determine to require more operations, without preempting the other threads at the same priority.
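The proportional split of the 1000-cycle block described above can be sketched as follows. The function names are hypothetical, and the integer division assumes the percentages divide the block evenly, as they do in this example.

```python
def slice_block(block_cycles, shares_percent):
    """Split a block of CPU cycles among threads according to percentage shares."""
    return {name: block_cycles * pct // 100 for name, pct in shares_percent.items()}


def block_duration_us(block_cycles, clock_mhz):
    """Duration of one block in microseconds (cycles divided by cycles per microsecond)."""
    return block_cycles / clock_mhz


shares = {"A": 25, "B": 10, "C": 10, "D": 55}
print(slice_block(1000, shares))     # {'A': 250, 'B': 100, 'C': 100, 'D': 550}
print(block_duration_us(1000, 200))  # 5.0
```

The scheduler then cycles through the threads in order, giving each its slice before returning to Thread A for the next block.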
Assigning multiple threads the same priority can have many beneficial effects and can help system designers avoid traps that threaten the proper operation of their real-time system. In particular, it can reduce overhead, increase throughput, and enable priority inheritance and time-slicing scheduling methodologies. The developer is encouraged to use as few distinct priorities as possible and to reserve unique priorities for those instances where true preemption is required.
William E. Lamie is co-founder and CEO of Express Logic, Inc., and is the author of the ThreadX RTOS. Prior to founding Express Logic, Mr. Lamie was the author of the Nucleus RTOS and co-founder of Accelerated Technology, Inc. Mr. Lamie has over 20 years of experience in embedded systems development, over 15 of which are in the development of real-time operating systems for embedded applications.