Real-time OSes are something like a billion-dollar market. But the ugly secret of multitasking is that conventional round-robin schedulers are non-deterministic. It's impossible to guarantee that all of the tasks will run in a timely fashion.
The very definition of a hard real-time system is one that suffers unacceptable failure when a deadline is missed. Yet, ironically and somewhat horrifyingly, practically every multitasking embedded app avoids failure simply through luck, since the usual scheduling algorithm doesn't allow for provably correct software.
A perfect storm of interrupts and task requirements can bring the system down. We developers build in time margins and then crouch in a metaphorical foxhole, head down, hoping that nothing goes wrong.
A variety of other scheduling algorithms exist. Probably the best known is Rate Monotonic Analysis (RMA). Simplifying somewhat, for a typical busy system with more than a few tasks, if the sum over all tasks of (worst-case execution time)/(period) is less than about 69%, you'll meet all deadlines, provided tasks are assigned priorities based on how often they run. Those that run most often get the highest priorities; those that run least often get the lowest.
RMA has been around since the '70s, and plenty of articles in Embedded Systems Design and elsewhere advocate its use. “Rate Monotonic Analysis” gets over 32,000 hits in Google. We're all using it, right?
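As a rough illustration of the test described above, here is a minimal sketch in Python. It assumes independent periodic tasks whose deadlines equal their periods, and it uses the classic Liu and Layland bound n(2^(1/n) - 1), which falls toward ln 2 (about 69%) as the task count grows. The `Task` type, the function names, and the three demo tasks are hypothetical, invented only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    wcet_ms: float    # worst-case execution time (assumed known)
    period_ms: float  # how often the task runs


def utilization(tasks):
    """Sum of (worst-case execution time / period) over all tasks."""
    return sum(t.wcet_ms / t.period_ms for t in tasks)


def liu_layland_bound(n):
    """Utilization bound for n tasks; approaches ln 2 (about 0.693) as n grows."""
    return n * (2 ** (1.0 / n) - 1)


def rate_monotonic_order(tasks):
    """Shortest period first: the first task gets the highest priority."""
    return sorted(tasks, key=lambda t: t.period_ms)


def passes_rma_test(tasks):
    """Sufficient (not necessary) check: True means all deadlines are met."""
    return utilization(tasks) <= liu_layland_bound(len(tasks))


# Hypothetical task set, for illustration only.
demo_tasks = [
    Task("logger", wcet_ms=20, period_ms=200),
    Task("sensor", wcet_ms=2, period_ms=10),
    Task("control", wcet_ms=10, period_ms=50),
]
```

For `demo_tasks` the utilization is 0.5, comfortably under the three-task bound of roughly 0.78, and `rate_monotonic_order` puts `sensor` first. A task set that fails this check is not necessarily unschedulable; the bound is only a sufficient condition.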
RMA scheduling absolutely requires you know how often each task runs. Do you? That can be awfully hard to determine in a real-world application responding to unpredictable inputs from the real world.
To guarantee RMA will work in any particular application, you've got to ensure the sum of execution time/task period is under about 69%. That implies we know each task's real-time behavior.
Few of us do.
In assembly, with simple processors devoid of cache and pipelines, it's tedious but not difficult to compute execution time. In C that's impossible. Even looking at the generated assembly sheds little light on the subject, since calls to the runtime package are opaque at best.
One could instrument the code to measure execution times quantitatively. Change the code, though, and it's critical to repeat the measurements. Base a design on RMA and you take on significant long-term maintenance issues, especially where plenty of changes and enhancements can be anticipated.
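To make the instrumentation idea concrete, here is a minimal sketch in Python that records the longest observed run of a function. It is an illustration only: on a real target one would more likely read a hardware timer or cycle counter, and the largest time seen in testing is not a proven worst case. The decorator, the `timing_stats` dictionary, and `filter_samples` are all hypothetical names made up for this example.

```python
import time
from functools import wraps


def track_execution_time(stats):
    """Decorator factory: record call count and longest observed run per function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter_ns()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if func raises, so every call is counted.
                elapsed = time.perf_counter_ns() - start
                entry = stats.setdefault(func.__name__, {"calls": 0, "max_ns": 0})
                entry["calls"] += 1
                entry["max_ns"] = max(entry["max_ns"], elapsed)
        return wrapper
    return decorator


timing_stats = {}


@track_execution_time(timing_stats)
def filter_samples(samples):
    """Stand-in for a real task body: average a block of samples."""
    return sum(samples) / len(samples)
```

After exercising `filter_samples` with representative inputs, `timing_stats["filter_samples"]["max_ns"]` holds the longest run observed so far. Any change to the task body invalidates the figure, which is exactly the maintenance burden described above.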
Recently a friend extolled the virtues of RMA. He has yet to actually use it in a real embedded system of any significance. In my travels around the embedded landscape it's rare to find a group using it. Do you? What's your experience?
Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges.
I speak for myself on this. However, I believe many here are most likely in the same camp as I am. RMA can mean other things, and what I mean is something other than Rate Monotonic Analysis. I believe most folks (me included) fall into the camp of Resource Management and Affective behavior. I say this because many of us tend to wear hardware hats too.
In general, if we find that part of our system falls under the category of “Bad things Will Happen if this does not Work”, then our tendency is to manage such a subsystem by allocating more resources to it. The 'it' can mean any number of things: more memory, more hardware, more time, etc.
Much of our effort is directed in this fashion. We come up with schemes to:
– Make the most with finite resources
– Add more where needed
– Make decisions to minimize risk
– Determine degraded behavior
– Design test metrics to assess behavior
Despite the best-thought-out architectures and the best-laid plans, from time to time our customers do get the benefit of finding the pitfalls for us.
But that is for another discussion, eh?
– Ken Wada
Jack… or anyone else for that matter, has anyone heard of an RTOS based upon genetic algorithms? A response from you or your readers would be appreciated.
– Steve King
In the (too many) years I marketed and sold RTOSs (pSOS, pSOS+, VxWorks, Linux, RTXC) I only ever came across a thorough understanding of RMA from prospects in Automotive and sometimes in Aerospace designs. In such systems safety is critical and provability thus becomes critical too. RMA does provide a potential route through to designing such capability into a system.
However, I think there is one flaw in your argument for RMA that undermines some of what you follow through with. You imply that in a hard real-time system the entire codebase has to be capable of delivering hard real time, or at least be measurable. In my experience, for the majority of systems, only a small subset of each system actually needs hard real time (specific latency control, deterministic performance, etc.), so the role of an RTOS is to provide this capability as well as a general platform for the rest of the application(s), delivered in such a way that resource control can be clearly prioritised to those hard real-time activities. (An ECU, for example, is an exception to this.)
The reason most systems based on 'standard' prioritised pre-emptive environments (not round robin as you stated at the beginning; who uses that, except Unix-based systems?) work just fine is that electronic engineers close to the metal tend to do the really hard real-time software, and then rely on the RTOS to enable pure softies to get on with their job without undoing the good work of the hard RT SW programmers.
I remember well that in an HDLC driver for X.25, which I wrote in Coral66 in the '80s, I counted every instruction cycle through the worst-case execution route by hand (no tools for this back then) and calculated worst-case system loading. That was about 5% of the code on the system; no one else on the system had to go to those lengths for performance measurement, because they did not need to. The prioritised pre-emptive OS spared them that job.
However, I would generally support the fundamental argument that RT systems need to be better designed; indeed, RMA may provide a route through to this. As designers try to integrate ever more functionality onto a single device/processor, ever-increasing portions of the code need to be RT and thus vie for the limited time resources. It's becoming nigh on impossible to be sure that all the parts of the system that need hard RT SW support will get it when needed, as the worst-case scenario becomes harder to define. The hardware, however, continues to alleviate this issue. For example, the entire HDLC stack and a good portion of the X.25 stack would now be delivered in hardware; with ASIC/SoC parallel processing capabilities and dedicated resources, the RT SW problem can be designed out for that part of the system.
So what's the real issue? Has the amount of RT SW in highly integrated devices got to the point where we cannot rely on the skills of the close-to-the-metal electronics engineers anymore? Is the problem domain now so large that we need a new approach, one designed from the ground up to design in RT from the outset?
Why did the OS come about? Because the code base became too large for a single engineer to code and develop, multiple engineers needed to work on a single processor, and they now needed to share resources. The OS delivered that. Perhaps the RT SW portion of a system has now reached a similar crux point due to the massive amount of I/O integration onto a single processor? But of course the processor providers have indirectly recognised this. Why is 66% of the effort of a processor company these days in SW? Because they know they need to address the integration problem for their processors to be adopted. They deliver pre-integrated SW, but has this addressed the real problem? To be honest, I am not sure for the general market case.
– Geoff Revill
RMA is really just a subset of Fixed-Priority Scheduling (FPS) analysis. It is a well-developed body of theory that has seen some notable use in European automotive. RMA itself is really a bit too simple to be useful, but FPS can handle more varied scenarios, especially priority-driven distributed systems.
It is a natural match for a CAN network.
Seems like the future of hard automotive real-time is back to static scheduling, however, considering the success of TTCAN, TTP, and now FlexRay, all of which rely on offline static scheduling for critical communications. Event-driven analysis using FPS simply does not feel as comfortable as pure static scheduling.
– Jakob Engblom
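One widely used piece of the fixed-priority analysis mentioned in the comment above is response-time analysis, which iterates R = C + sum over higher-priority tasks of ceil(R / T_j) * C_j until R stops changing. The Python sketch below is a minimal version under simplifying assumptions: independent periodic tasks, deadline equal to period, no blocking, no jitter, and integer time units. The function name and the `rta_tasks` set are hypothetical and chosen only for illustration.

```python
def response_time(tasks, index, max_iterations=1000):
    """Worst-case response time of tasks[index], or None if it misses its deadline.

    tasks is a list of (wcet, period) integer pairs, highest priority first.
    The deadline of each task is assumed to equal its period.
    """
    wcet, period = tasks[index]
    higher_priority = tasks[:index]
    r = wcet
    for _ in range(max_iterations):
        # Each higher-priority task preempts ceil(r / t_j) times within r.
        interference = sum(((r + t_j - 1) // t_j) * c_j
                           for c_j, t_j in higher_priority)
        r_next = wcet + interference
        if r_next > period:
            return None       # deadline missed
        if r_next == r:
            return r          # fixed point reached
        r = r_next
    return None               # did not converge within the iteration limit


# Hypothetical task set: (wcet, period), already in rate-monotonic order.
rta_tasks = [(1, 4), (2, 6), (3, 12)]
```

For `rta_tasks` the response times work out to 1, 3, and 10 time units, so every task meets its deadline even though the total utilization (about 83%) is above the simple utilization bound for three tasks (about 78%). That gap is one way to see why the simple test can be too coarse on its own.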
I do agree that RMA could be tough to design if you are not aware of all the inputs to the system and their frequency. At Honeywell I developed one hard-RTOS-based aerospace application in which the RMA scheduling algorithm was used. And to be frank, it was such a nice system that it needed very little maintenance. So I would hesitate to agree that it needs maintenance. As you yourself stated, you should be aware of all the inputs to the system, with their frequency and time to process. If we know that and have designed accordingly, very little maintenance is required. I agree that there are a couple of bottlenecks involved in cache flush, DMA access and I/O processing which are tough to anticipate in advance. But if we take them into consideration and design the system appropriately, we still maintain the timings. Believe me, the system which we developed is still running without any complaint and with hardly any maintenance.
A couple of good things about RMA: once you know what your system's resources and tasks are meant for, you can approximately anticipate the timings of the tasks, which is hard to do in priority-based preemptive scheduling. The frequency of the tasks and inputs is well known and the system can be designed accordingly. You can design to handle the unknown inputs or events generated by the system and can minimize their impact on system performance. So if you have a very solid design, you can get the maximum out of RMA.
– Sagar Borikar