After apologizing for certain misdeeds and offering a glimpse of the future, Jack asks whether an RTOS is even worth the trouble.
I begin this column with an apology. I did something bad, and I'm sorry.
If you were with me last month, you read about the disaster of my recent move and the fact that the movers had to wrest the keyboard from my hot, sweaty fingers to get the computer in its box. I suppose that's the reason they bunged it up during the move, trashing my RAM and perhaps my hard drives.
In that fateful column, I had promised to upload the “minimizer lite” version of the algorithm I've been working on for so long. I had checked out the algorithm in Mathcad, and verified that it worked beautifully, but I had not yet finished turning the changes into code. I figured, how hard can it be? The important deadline is to get the prose to the editors on time. Coding can take longer because I have more time in which to do it.
There's typically a two-month delay between the time I send the column to the editors at Embedded Systems Programming, and the time you get to see it. I figured it was a slam dunk that I'd get the code complete and uploaded before you saw the column saying that it was uploaded. In short, I floated a rubber check.
And it bounced.
Who would have guessed that it would take three months for me to get back online again? Usually it's a time measured in days. And who would have guessed that when I did get online, the data, including my partially finished code, would be gone to that bit bucket in the sky?
I cheated, and I got caught. Sorry. Trust me, I've already been read the riot act by my editors, and it's a lesson I learned well, the hard way.
The name of the column is “Programmer's Toolbox,” and my stated goal has been to provide you with tools and techniques to aid in the development of embedded systems. It occurred to me recently that you might be wondering what sorts of embedded systems I work on, that I need things like function minimizers, root crackers, and so on. Such routines seem more suited for huge math problem solvers than embedded systems.
Your embedded systems probably don't look much like mine. Even so, trust me on this: everything I've told you about came from the development of an embedded system. In fact, my interest in the minimizer came directly from a production, commercial product. That same system also used a real-time least squares fit.
The embedded systems that I work on tend to use a lot of math.
During the time I was occupied by the minimizer, many other topics attracted my attention, and were put on the back burner. Now that the minimizer is coming together, I've begun turning my attention to them. I thought you'd like to know what's coming up in the near future. Here's a partial list:
- The Nelder-Mead, or Simplex method (yet another minimizer!). This workhorse is probably the most-used minimizer for multivariate systems, and is definitely worth a look. But I'm putting it off for a time, mainly because I'm worn out with minimizers for the moment.
- Spline functions. Everybody and his brother has a spline function algorithm in his toolbox, and most of them work just fine. But I needed one that gave the derivatives of the function, as well as the fitted function. The canned routines don't do that. So I had to derive my own functions. In the process, I learned that the derivation isn't at all easy. The spline algorithm seems to be another one of those things, like Brent's method, that everyone uses but no one (or, to be more precise, few people) has bothered to derive for themselves.
- Optimal curve fits. This is not the same thing as a spline fit, because we don't necessarily require that the fitted function pass through the data points.
- The least squares fit. I covered this one once before, years ago, but it's been a long time. It's due for a revisit.
- Chebyshev polynomials, which are used to improve function approximations.
- Laplace transforms: second nature to EEs, a mystery to everyone else, and the gateway to z-transforms.
- The Kalman filter. Yes, I know, others have done this in Embedded Systems Programming (“Kalman Filtering,” June 2001, p. 72). As you might expect, my approach will be different.
- Nonlinear root finder. I've talked about this one several times during the effort on a minimizer, and I just happen to have the world's best algorithm in my back pocket. It's only fair that I share it with you.
Is it possible that such things are in embedded systems? Yes and no. In one sense, they must be, because I'm using these techniques. In another, though, it's worthwhile to mention that not all the software for an embedded system actually ends up in the system. When we're building an embedded system, we also need software to test it. Most often, that means a simulation that exercises the code as it would be run in the real world. Someday, perhaps, I'll write an article about how to test real-time software. Or you can buy Jim Ledin's new book, Simulation Engineering (CMP Books, 2001), which covers everything you ever wanted to know about simulations. Suffice it to say here that often a very sophisticated simulation is needed to be sure that algorithms always work.
If you'd like to see me cover any other topics, don't hesitate to contact me.
And now for something completely different
But I don't want to talk about any of those things this month. Instead, I'm in the mood for a change of pace. I want to talk about real-time operating systems (RTOSes). I want to because I'm seeing trends in the industry that I'm not sure I'm ready for.
Before we get too far into the notion of an RTOS, we're first going to have to define what we mean by the terms real time and embedded system. Make no mistake: I'm talking real-world definitions here, not pedagogical, theoretical definitions. In general, I tend to write about the things I've actually worked on in the past, and this column will be no different.
I used to think I knew what real-time embedded systems were. They were things like numerical process controllers, chemical plant controllers, and flight computers for things that fly or swim and blow stuff up. Now that we have computers in video recorders, telephones, and those glorified calculators called personal digital assistants (PDAs), the definition is beginning to get a little fuzzy. I used to teach a course in software engineering for my company, and the illustrator we used was a wonderful cartoonist. He drew me a picture of an engineer, running along beside a tank with a Teletype ASR-33. The caption was, “Embedded systems are hard to debug.” I mean that kind of embedded system.
Likewise, it's often been said that any system can be considered real time if your definition is loose enough. Many Unix aficionados tend to say that Unix and its clone, Linux, are real-time systems. But surely, they can only be defined so loosely. To paraphrase Bill Clinton, it all depends on what “real” is.
There are two kinds of people in the world: those who categorize things into two kinds, and those who don't. Most engineers categorize real-time systems into the “hard” and “soft” varieties. A soft real-time system is one in which tasks need to be done promptly, but not at a precise time or on a precise schedule.
When some folks talk about a system being “real time,” they mean that it must complete its work in a time that's short compared to whatever the user considers a reasonable wait. By that definition, any OS is an RTOS. Except perhaps the old batch processing OSes, which still exist in some business mainframes. Airline reservation and point-of-sale systems are “real time” in the sense that we'd like a response the same day. But that hardly qualifies as a real-time system in the usual sense, and most of us don't accept such a definition.
Soft real-time embedded systems might include printers, cellphones, PDAs, and digital video recorders.
Hard real-time systems must perform their duties either at regular, specified time intervals, or respond within time intervals that are tightly controlled, both in response time and in jitter, the variation in response from step to step. Such systems tend to control things and have digital filters inside. If the system doesn't do things on schedule, data gets missed and the filters don't work properly. Since most of the systems I work on end up flying, or otherwise calculating based on streaming data, they tend to be very, very hard.
I'll leave it to the academics to find definitions that are completely unambiguous. As one Supreme Court justice said concerning pornography, “I may not be able to define it, but I know it when I see it.”
Why an RTOS?
The first question we should ask ourselves is do we even need an RTOS? The first microcomputers didn't. When Intel first introduced their 8080 microprocessor (a real piece of work, by the way, which I've praised before as a breakthrough of biblical proportions), they wrote up one of those ads masquerading as an application note, showing an 8080 controlling a traffic light. Four pressure pads, six sets of light bulbs. That was the extent of their imagination.
You don't need much of an RTOS for a system to run a traffic light.
(Good as Intel was in their design of the chip, I don't think they ever, in their wildest dreams, envisioned personal computers, CP/M and 64KB Altairs, Osbornes, Morrows and Kaypros. It was the hobbyists that saw the potential of the 8080 and the later chips like Z80, 6800, and 6502. Guys like Gary Kildall, Bob Albrecht, Dennis Allison, Carl Helmers, and the two Steves: Jobs and Wozniak.)
A few years ago, I attended one of Robert Ward's conference presentations on the taxonomy of real-time, embedded systems. I won't begin to try to fill Robert's shoes, but the “RTOS” for that traffic light controller was right there at the beginning of his taxonomy. In pseudocode, it is:
do it again
Or, if you prefer something more structured:
loop forever
    do something
end loop
That's it. No multitasking; no interrupt handler; no TCP/IP stack. No priority queues. No classes. No persistent objects, except in the sense that everything was persistent, until you turned the power off.
The first program I wrote for the 8080, back in 1976, worked this way. It was a controller for a satellite-tracking antenna and it worked by performing a Kalman filter on the tracking data. If I'd known enough in those days to write pseudocode at all, it would have been:
loop forever
    take a measurement
    update the state
    point the antenna
end loop
Maybe I should have thrown in some initialization, but really, that was about it. Timing was not an issue. You took a measurement as soon as you could, which meant as soon as you finished processing the last one. What else did the CPU have to do?
About the same time, I also wrote a program for an even earlier chip, the 4-bit 4040, to control a cold forge machine. It looked like this:
while !E-stop loop
    drop a blank into place
    move the anvil to its position
    whang it with the forge
    retract them both (finished part falls out)
end loop
Oh, there were a bunch of safety related tests, like checking the Emergency-stop button before every operation, and also readings of the control buttons and sensors, to tell if we were doing the right thing. But basically, that was the program. Again, we had no need for even a timer, because as soon as the CPU was finished with one loop, we wanted it to do it all again.
You'd be surprised how often this simple structure works. Back before the Internet, there were bulletin board systems (BBSes) and CompuServe. Ward Christensen developed both a program and a protocol for transferring data over phone lines, called Modem7. It worked pretty doggone well, better in some ways than AOL. (Think of it: no flashy pop-ups, no self-regenerating porn ads, no computer viruses attached to e-mail.)
The problem with Modem7 was the classic one of chicken vs. egg. Though not at all large by modern standards, Modem7 was not the kind of thing you wanted to type into your assembler. You needed to download it, and before you could do that, you needed Modem7. Catch-22.
To get around that little problem, they came up with a bootstrap program called Boot7, whose only function was to download Modem7. I had a listing for Boot7, but somehow I made a mistake typing it in, so I couldn't get online.
To at least get me talking, I wrote my own modem program. Here it is, in all its glory:
char key = ' ';
while key != ^Q loop
    if (char in serial buffer)
        c = getchar
        write c to CRT
    endif
    if (key == inkey)
        send key to modem
    endif
end loop
Incidentally, thanks to heavy use of well-done BIOS functions by Teletek, the manufacturer, that little gem used exactly 19 bytes of Z80 assembler language. Not bad, eh? No megabyte RAMs needed, thank you very much.
Later, I added a rudimentary (not even circular) text buffer. Acting on a control key (probably ^D), I'd start capturing downloaded text into the buffer (being careful not to overflow it, a caution modern programmers sometimes seem to forget). I'd run the program under DDT, the CP/M debugger, and after I was done, I'd write the buffer to a file. The program kept a character count to tell me how many sectors to write. Not elegant, I admit, but it let me download Boot7, with which I downloaded Modem7, with which, well, you get the picture.
Is there a message in all these war stories? You bet there is: “Don't use an RTOS when you don't need one.”
On being on time
In the aerospace industry, most gadgets are just a smidge more complicated than the simple, loop-as-fast-as-you-can systems I've described. The main issue that characterizes control systems in the aerospace world is time.
Typically, an aerospace system reads things like gyro and accelerometer outputs, and uses them as the source for steering commands. As I hinted earlier, such systems tend to use digital filters of a fairly high order. What characterizes such filters is that the measurements must be made as on-time as one can manage. The filters depend on measurements taken at regularly spaced intervals.
To accommodate that requirement, we added a real-time clock (RTC), nothing more than an interrupt that fired on a regular schedule. In pidgin Ada, the structure looks something like this:
loop
    accept RTC
    read sensors
    update filters
    correct steering
end loop
In this Ada-ese, the program may seem as though it's in a continuous loop. I mean, that's what loop…end loop usually means, right? But appearances can be deceiving. See that accept RTC in the first line? That identifies the code as an interrupt handler. In Ada task-speak, the handler task is blocked until the interrupt is received. When it's received, the handler runs to completion, then sits and does nothing until the next clock tick. There is no real loop there, only an interrupt handler that runs each time it's pulsed by the interrupt.
If the CPU is idle until an interrupt comes in from the RTC, what does it do in the meantime, twiddle its thumbs? No, it executes the background task, whatever that is. The pseudocode for that task often reads:
loop
    perform self-test
    report errors
end loop
This one is a true loop, in that it runs as fast as it can. In short, it uses all the CPU resources not absorbed by the interrupt handler. In addition to the self-test, we often give it slow tasks to run, and these can, and often do, include a Kalman filter. Kalman filters tend to be slow and require floating-point arithmetic. In the old days, floating point usually meant software floating point, so it was slow. Unlike other digital filters, the Kalman filter doesn't much care about timing; it takes the time of the measurements into account. So we let the Kalman filter take as long as it needed, chugging along in the background whenever the fast stuff wasn't running.
For the record, the code for that multivariate function minimizer ran in the background. It took as long as it took, which wasn't very long on a fast 486 with its hardware math processor.
I chuckle every time I see Unix/Linux described as a multitasking OS. Ditto for Windows. I suppose they are, in the lingo of computer science, but they certainly aren't what we embedded folks thought of as multitasking.
To an aerospace software engineer, multiple tasks typically mean cyclic tasks, each running at different, but fixed, rates. This approach was taken simply for expediency and out of an acceptance of the facts of life; our CPUs weren't fast enough to do everything at the fastest interrupt rate. So we ran only what needed to be done fast at the interrupt rate. Everything else was counted down from that fastest rate. In a typical multitasking system, we had cyclic tasks at rates like 1,000Hz, 500Hz, 100Hz, 20Hz, 10Hz, and 1Hz. The RTC ran, of course, at the 1,000Hz rate. Its pseudocode might be:
loop
    accept RTC
    do 1000Hz
    if divide_by(2)
        do 500Hz
    endif
end loop
The 500Hz task, of course, would divide by five to call the 100Hz task, and so on. Simple.
Whenever you have multiple tasks, you confront the issue of sharing data between them. Because tasks can be interrupted by higher-priority tasks, we must make sure that the data doesn't change between accesses. Typical solutions include semaphores and mutual exclusion mechanisms.
The beauty of a cyclic scheduler is that it rarely needs such stuff. If the 1,000Hz task is running, it can be certain that slower tasks aren't. Therefore, their data is not going to change during the life of the faster task. If it needs data from them, it simply goes and reads it.
The converse is not true, but easily handled. When I'm running in, say, the 1Hz task, I have to be sure that the data I'm operating from doesn't change during my computation. So the faster task is not allowed to stuff data into the slower one; the slower one must go get it, and make a local copy. Better yet, the faster task simply passes the data as parameters:
do 500Hz(data1, data2, …)
The compiler takes care of making the local copy.
Computer science purists will cringe at this one, but, in a pinch, I've been known to suspend interrupts during the time I'm fetching or stuffing data. It's hardly elegant, but it's fast and simple.
Build or buy?
Whenever I start a new project involving real-time systems, my first task is deciding whether to buy a commercial RTOS or "roll my own." It's not a decision to be sloughed off.
But whatever decision I make, I find that its consequences are similar to deciding whether to use someone else's code: whichever way I go, I find myself wishing, at some time or another, that I'd gone the other way.
If I roll my own RTOS, I'm faced, not only with writing my own RTOS, with all its potential for hidden and subtle bugs, but with justifying my decision to management when the task takes longer than I expected. If I use an off-the-shelf RTOS, I find myself with a system that's much more complex than I usually need, and I'm spending time climbing the learning curve when I could have been writing code. Neither approach is completely satisfying or free of risk, so we should weigh the pros and cons carefully.
Leaf through the pages of this magazine, and you'll find ads from a number of RTOS vendors. They're our advertisers, bless 'em, and all of their products are good; some are great. Most of them do their job as advertised.
Even so, I find that almost all give me more gizmos than I really need. They include all the features that users clamor for, such as the ability to dynamically create and remove tasks; sophisticated priority mechanisms, including dynamic adjustment of priorities; multiple methods for passing messages and/or data between tasks, and so on. Much of that power is wasted for many of the problems I deal with. Not only am I carrying extra baggage around in the form of mechanisms I don't really need, but even the things I do need take longer, thanks to mechanisms that are more general than I require.
Then there's the frustration factor. Only last year, I was listening to my colleagues bemoan their fate, because the device driver that came with a commercial RTOS was broken. They couldn't get the vendor to even admit that it was broken, much less get a commitment as to when it would be fixed. A word to the wise: if you use an off-the-shelf RTOS, get source.
Until the last few years, my build/buy decision always came down on the build side.
It's true that cyclic executives are a special case, and perhaps deserve special treatments. Today, the trend is much more towards asynchronous tasks, and especially to tasks that are dynamically created. I know that. Even so, I'm not completely convinced that complex solutions are necessary.
As recently as 1994, I was working on a contract to develop real-time software for yet another antenna controller. The customer had chosen the Motorola 68332 chip (an excellent choice, in my opinion), and I was helping him make the build/buy decision. I looked at the features the 68332 provides:
- Built-in RTC with programmable rate
- Built-in watchdog timer
- Built-in serial and parallel ports, with programmable interrupt behavior
- Built-in background debug capability
- Built-in counter-timer chip with ability to run independently of CPU
I looked at all those features, and thought, “Dang, all I have to do is write the interrupt handlers, and the RTOS is mostly done.” That particular build/buy decision was easy.
During that decision-making process, I talked to a number of RTOS vendors. Most were extremely helpful and savvy, and I wouldn't have had a problem using any of their products. One company (sorry, can't recall the name) offered the RTOS complete with source code in C, for $1,000. That's cheap. It's much less than what our four-man team burned in a single day.
It's hard to justify building your own RTOS when good stuff is available so economically. Even so, we felt that the time spent learning how to use someone else's OS would probably cost more than the time to write our own. So we did. And we never regretted it.
That was then, this is now
Fast forward to 2002. Now we have super-fast processors with tons of RAM and, often, hard drives for mass storage. The problems tend to be more complex than they used to be, and involve a lot more asynchronous tasks. We have RTOS vendors out the gazoo, and virtually all the systems can be counted on to come with free GNU C/C++ development tools. So do we build or buy?
I think I'm suggesting that the issue is still not clear, and that you should make the decision with considerable thought. Some problems are so complex that the notion of writing one's own OS is too horrible to contemplate. Others are simple enough that 90% of the RTOS's capabilities are going to go unused (and you're going to take both a performance and a memory hit). Each decision needs to be made based on the individual situation.
In our family room, we have one of three TiVo video recorders. If you haven't tried one, you should. We almost never watch live TV anymore, except to check on the news. By recording everything, we can fast forward through the commercials and slow-mo through the fast action. Watch Lara Croft: Tomb Raider in slo-mo and I guarantee you, you will see things that went right past you the first time. Like those slick, super-fast reload clips for her guns. Angelina rules.
But the TiVo is most definitely not real time, even in the airline reservation system sense. It records TV onto a hard disk in real time, and pulls it back off, too. But the user interface is positively glacial. How long can it take to bring up a pop-up menu? Try a TiVo, and you'll find it's a lot longer than you thought.
What's the “RTOS” in the TiVo? It's Linux. Not real-time Linux (RTLinux). Just plain Linux. I think maybe each time I press a button on the remote, it's logging me in as a new user. Or something like that.
Why use Linux for a set-top box? Well, for one thing, it comes complete with the GNU tools, which surely must be a big plus to the developers. Then there are the databases, which contain a list of channels and their program lineup. An SQL engine to access them must surely be nice. Still and all, I think the choice was a lot more pleasant for the developers than for us poor users.
In previous columns, you've heard me say that I was looking forward to working with RTLinux. I've seen it working in the demos at the Embedded Systems Conferences, and it seems pretty nice. However, especially after seeing some of the uglier sides of Linux, I'm beginning to have second thoughts.
Recall that the developers solved the problem of RTLinux by first writing a real-time kernel. Then they overlaid Linux, including the Linux kernel, on top of it. In effect, Linux and its kernel run as an application on top of the real-time kernel.
It seems like a neat enough solution, and it certainly works well in demos. But think about it from a structural point of view. When one is developing a kernel for an OS, surely there has to be something better than developing two kernels, one on top of the other. Surely some efficiency has to be lost in the process.
You know me: I'm a big fan of Linux, just as I was of Unix, if only because it's a better way to go than the obvious alternatives. But I'm not a fanatic about it. I tend to judge systems by how well they work, not whether I like their heritage. And face it. The heritage of Linux is not exactly earth shaking. It's a clone of an old time-share system that was, itself, a clone of an even older time-share system, Multics. That Linux has come so far, and done so well, is a testament to the ingenuity and efforts of the people involved in its development. But the Multics/Unix heritage can be burdensome. Backward compatibility constrains solutions that might otherwise be solved in other ways.
Aside from its open-source nature, which is certainly a big plus, Linux has a lot going for it in that wonderful, GNU toolset. I'm just not convinced that RTLinux is going to be efficient enough. Hey, if I'm questioning the need for even an RTOS of any kind, I can't exactly recommend the heck out of RTLinux, can I?
Is there an alternative? Perhaps so. We have a couple of true RTOSes that are also in the open-source tradition. One is µC/OS, from Jean Labrosse. The other is Red Hat's eCos.
How's this for an alternative to RTLinux? Start with either of the true RTOSes. Add a compatibility layer that will emulate not the Linux kernel itself, but the view that the applications see of it. That way, Linux applications, including the GNU toolset, will work on the system, but we can also access the RTOS kernel for our real-time apps. Think about it.
At least one variant of real-time Linux (www.timesys.com) does not use the kernel-atop-kernel approach. I just learned about this one and will be looking into it in the near future. If any readers know of other alternatives, please e-mail me.
I'm going to close this month's offering with a really slick little real-time, cyclic exec that I ran across years ago, and fell in love with. It's just like one I mentioned earlier, with one important exception:
loop
    accept RTC
    do 1000Hz
    enable interrupts
    if divide_by(2)
        do 500Hz
    endif
end loop
See that enable interrupts statement? That's the key to the whole thing. Once the high-speed stuff is done, the interrupts are enabled so that another one can come in. Then the slower tasks can go about their business, happily taking their time, while the high-speed task gets interrupted again.
In general, when we write an interrupt handler, we want to leave the interrupts disabled for the minimum amount of time. This reduces jitter to a minimum. But in this case, the entire program is, in effect, an interrupt handler! It works because interrupts are enabled again, as soon as possible. The tasks are themselves reentrant, so all of them, except the bottom level, can be called more than once.
Customers of mine have often had trouble understanding this architecture, and I've tried to think of simple diagrams to show the behavior of the design. The best I've been able to come up with is the analogy of a pinball machine, with countdown counters in place, as in Figure 1.
Figure 1 Real-time pinball
Imagine that a ball enters the pinball machine at the top, passing through the 1,000Hz task. It then drops through to the bottom. But every second time, it's diverted to the 500Hz task. Similarly, in the 500Hz task, most of the balls fall through, but one out of five is diverted to the 100Hz task, and so on.
Once the system gets going, we can conceivably have different balls in various stages, passing through every single one of the tasks. If the code is truly reentrant, as it should be, we could even have multiple balls in a given path. Our only requirement is that the balls (interrupts) don't come in so fast that the tasks choke on them. The entire program is, in effect, an interrupt handler, and every bit of it is reentrant.
Remember how an interrupt works. When an interrupt comes in, the current context is saved on the stack, and the handler starts to work. In this case, if another interrupt comes in, that context is also stacked, and yet another instance of the handler is started. We only require that the average rate of interrupts is low enough to keep the stack from overflowing.
It's a slick solution, and I've seen it used in more than one life-critical system. In those systems, we used switches to make sure that task overruns didn't occur. Each was a simple semaphore that got tested and set at the top of each task, and cleared at the end. But even that mechanism is more stringent than it needs to be. As you can see from the discussion of stacked contexts above, multiple instances of a given task can be working, as long as the stack doesn't grow forever by too many interrupts. A simple test on the stack depth can detect that condition.
The design has the features I mentioned earlier: passing data between tasks takes a minimum of handshaking, and, in the low-to-high speed direction, takes none at all.
Keep it simple
So what's my point? Am I trying to put all our RTOS-vending advertisers out of business? Am I anti-Linux? No, not at all. I'm simply saying, let the punishment fit the crime; let the solution fit the problem.
I'm a big believer in the KISS principle (Keep It Simple, Sam). Albert Einstein said, "Things should be made as simple as possible, but not any simpler." Or, as the great race car designer Harry Miller put it, "Simplify, and add lightness."
Maybe that new embedded gimcrack you're building doesn't really need full-up Linux, or RTLinux. Maybe a good, solid RTOS will do. Maybe it doesn't need an RTOS at all; a cyclic exec will do. Maybe even a simple while-loop and some interrupt handlers will do.
Don't be in such a hurry to start that you forget to do your build/buy study. You get no points for building the world's most complex alarm clock.
Jack W. Crenshaw is a senior software engineer at Spectrum-Astro in Gilbert, AZ. He is also the author of Math Toolkit for Real-Time Programming, from CMP Books. He holds a PhD in physics from Auburn University. Jack enjoys contact and can be reached via e-mail at .