The embedded systems used in consumer, medical and industrial applications often require real-time response to provide an effective user experience.
Whether the task is a smartphone’s baseband radio communications, ultrasound image processing, or production-line video inspection, these and many similar systems must process inputs quickly and return information or an action to the user, whether that user is a human or another machine.
These systems run on low-power processors and often do all of their processing with relatively small amounts of memory—a combination of requirements that often leads developers to use a real-time operating system (RTOS). The RTOS manages application tasks or threads, handles interrupts, and provides a means of interthread communication and synchronization.
RTOSes come in all sizes and flavors, from the large, like Wind River’s VxWorks, to the super-compact, like Express Logic’s ThreadX. Large RTOSes offer many features adapted from desktop systems that are typically not available in compact RTOSes, because such features require more code, which enlarges the memory footprint and slows real-time response.
In contrast, the compact RTOS generally operates as a library of services, called by the application through direct function calls. Underlying these RTOS services is an infrastructure (Figure 1, below) of scheduling and communications facilities that support these functions.
Figure 1. The compact RTOS generally operates as a library of services, called by the application through direct function calls.
Most “small footprint” RTOSes employ an architecture in which the application code is directly linked with the RTOS services it uses, forming a single executable image (Figure 2, below).
The application explicitly references the services it needs, using function calls with an API defined by the RTOS. These service functions are linked with the application from the RTOS library. The result is a single executable image, usually in the .elf format.
For development, this image is then downloaded to target memory and run; in production, it is flashed into ROM and runs at device power-on.
This “monolithic” approach is efficient in both time and space, but it lacks flexibility. Any changes to the application or RTOS require re-linking and a new download/flash of the entire image. While this is routine during development, after production it can present some limitations.
Figure 2. Most “small footprint” RTOSes employ an architecture in which the application code is directly linked with the RTOS services it uses, forming a single executable image.
In contrast, desktop operating systems such as Windows and Linux, and larger RTOSes such as VxWorks and QNX, have a two-piece “OS/Application” architecture. In this architecture, there is a resident kernel containing all the OS services that applications or other kernel services need, all linked into a distinct executable.
This kernel executable boots the system and runs continuously, providing a foundation for applications which dynamically load and run. Usually, virtual memory provides demand paging to and from mass storage on desktop systems or multi-user separation in embedded systems.
This approach is used in mobile devices such as Apple’s iPhone or iPad, where new “Apps” can be downloaded over the wireless network. The OS runs in the device and handles the user interface, which enables selection of any of the downloaded “Apps.”
The selected App then runs along with the OS on the CPU. Similarly, large RTOS-based systems segregate applications from the RTOS kernel, usually in distinct memory spaces, within a virtual memory environment.
A nice feature of the large RTOSes, shared by their desktop OS cousins, is the ability to dynamically download applications onto a running system. In such partitioned architectures, the kernel runs as an entity and the applications run independently, but make use of OS services to access and use hardware resources.
Even within embedded systems, such downloading and the dynamic addition or change of applications is found where big RTOSes operate in telecommunications infrastructure and defense/aerospace applications. This capability enables a high degree of modularity and field update of running systems.
Tying Things Together
These applications are not monolithic and don’t link with the RTOS, so access to RTOS services takes place via a “trap” mechanism. A “trap” is a software interrupt that results from any one of several mechanisms that vary across different architectures (Figure 3, below).
It typically is used to catch errors before they propagate or to stop the system if it’s impossible to perform the requested operation. A divide by zero or an unaligned data load is an example of an illegal operation that activates an interrupt handler, just as an external interrupt would. In other cases, an intentional trap can be generated through an instruction such as a “software interrupt” or “swi” instruction.
Figure 3. A “trap” is a software interrupt that results from any one of several mechanisms that vary across different architectures
When the OS must handle an application’s request for an OS service, one of these mechanisms can be used to intentionally cause a trap. For example, when an application wishes to use a service that resides in the kernel and cannot be called directly from the application, the application can cause a trap instead.
The trap handler then examines a register (usually loaded by the processor with an identifiable value immediately prior to generating the trap) to determine which event caused the trap.
Once the handler sees that it’s a service call, it examines other registers (loaded by the requesting module) and determines which service is being requested. It then gathers the parameters for that service from the registers pre-loaded by the application. Finally, the trap handler calls the service (which is local to its executable) as a linked function, with the specified parameters.
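The dispatch steps described above can be sketched in C. This is a minimal simulation rather than any particular RTOS or processor: the trap_frame_t layout, the cause codes, the status codes, and the two stand-in services are all hypothetical, and a real handler would read saved CPU registers instead of a struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the registers saved on trap entry. */
typedef struct {
    uint32_t cause;      /* set by the processor: why the trap occurred  */
    uint32_t service_id; /* loaded by the application: requested service */
    uint32_t args[2];    /* loaded by the application: parameters        */
    uint32_t result;     /* written by the handler: service return value */
} trap_frame_t;

/* Hypothetical cause and status codes. */
enum { CAUSE_DIVIDE_BY_ZERO = 1, CAUSE_UNALIGNED = 2, CAUSE_SERVICE_CALL = 3 };
enum { TRAP_OK = 0, TRAP_FAULT = -1, TRAP_BAD_SERVICE = -2 };

typedef uint32_t (*service_fn)(uint32_t a, uint32_t b);

/* Stand-in kernel services, linked into the same image as the handler. */
static uint32_t svc_add(uint32_t a, uint32_t b) { return a + b; }
static uint32_t svc_max(uint32_t a, uint32_t b) { return a > b ? a : b; }

static const service_fn service_table[] = { svc_add, svc_max };
#define SERVICE_COUNT (sizeof service_table / sizeof service_table[0])

/* Decide why the trap fired, then route a service request to its function. */
int trap_handler(trap_frame_t *frame)
{
    assert(frame != NULL);

    /* Step 1: was this an intentional service call or an error trap? */
    if (frame->cause != CAUSE_SERVICE_CALL)
        return TRAP_FAULT;

    /* Step 2: which service is being requested? */
    if (frame->service_id >= SERVICE_COUNT)
        return TRAP_BAD_SERVICE;

    /* Steps 3 and 4: gather the parameters and call the linked service. */
    frame->result = service_table[frame->service_id](frame->args[0],
                                                     frame->args[1]);
    return TRAP_OK;
}
```

Even in this simplified form, every service request pays for the cause check, the table lookup, and an indirect call on top of the interrupt entry and exit that the hardware performs.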
This whole process requires interrupt servicing, some processing, a function call, and then the same in reverse. For a desktop or large RTOS, this overhead is insignificant compared to how much other code it continuously executes just to keep housekeeping in order.
The overhead is also relatively unimportant because these systems have little urgency to get things done. The larger OSes resemble a tractor trailer, with its sluggish acceleration and maneuverability.
That is not a problem, because most of the time the truck is cruising on the highway, unlike a car, which primarily needs to get going quickly and maneuver in traffic.
Within a larger system such as Linux, Windows or a larger RTOS, this service call overhead is insignificant because such systems have no requirement for fast, low-overhead maneuverability.
But for a small RTOS used in hard real-time applications, low-overhead response is expected, and a trap interface adds an undesirable amount of overhead relative to the otherwise highly efficient directly linked access method.
Getting the Dynamic Advantage to the Small Footprint RTOS
With a small footprint RTOS, the trick is getting dynamically downloadable applications to efficiently call for RTOS services embodied in a distinct, separately linked piece of code. What’s needed is an interface that delivers services efficiently to applications, while also offering the advantages of dynamic downloading.
To provide such an architecture for small RTOS systems, you need an “application module” structure (Figure 4, below). Application modules are collections of one or more application threads, not linked with the kernel, but instead built as a separate executable that is loaded into target memory when needed.
Figure 4. To get dynamically downloadable apps in a small RTOS system, you need an “application module” structure.
The modules use kernel services via an interface with the module manager, an agent within the kernel image that loads and initializes each module and fields all module requests for RTOS services.
Threads within modules make service calls exactly as they would make calls if the service function were directly linked with the application. In the module, however, these calls are handled by an interface component that communicates with the module manager. The trap mechanism is avoided, enabling a low overhead service call interface.
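One way to picture such an interface component is a set of thin wrappers in the module that forward each call through a function pointer supplied by the module manager. The sketch below assumes that design; the article does not specify the mechanism, and every name here (manager_dispatch, module_bind, mod_queue_send, mod_thread_sleep, the request codes) is hypothetical. Both sides share one file only so the example can be compiled and run; in a real system they would be separately linked executables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request codes shared by the module and the module manager. */
enum { REQ_QUEUE_SEND = 0, REQ_THREAD_SLEEP = 1 };
#define STATUS_OK          0u
#define STATUS_BAD_REQUEST 0xFFFFFFFEu
#define STATUS_NOT_READY   0xFFFFFFFFu

typedef uint32_t (*dispatch_fn)(uint32_t request, uint32_t p1, uint32_t p2);

/* ---- Kernel side: stand-in state and the manager's single entry point ---- */
static uint32_t last_message;      /* records the most recent queued message */
static uint32_t total_sleep_ticks; /* accumulates requested sleep time       */

uint32_t manager_dispatch(uint32_t request, uint32_t p1, uint32_t p2)
{
    switch (request) {
    case REQ_QUEUE_SEND:           /* p1 = queue id (unused here), p2 = msg */
        (void)p1;
        last_message = p2;
        return STATUS_OK;
    case REQ_THREAD_SLEEP:         /* p1 = ticks */
        (void)p2;
        total_sleep_ticks += p1;
        return STATUS_OK;
    default:
        return STATUS_BAD_REQUEST;
    }
}

/* ---- Module side: pointer received at load time, plus thin wrappers ---- */
static dispatch_fn module_dispatch;

void module_bind(dispatch_fn fn)
{
    assert(fn != NULL);
    module_dispatch = fn;
}

/* Module threads call these exactly as they would a directly linked service. */
uint32_t mod_queue_send(uint32_t queue, uint32_t msg)
{
    if (module_dispatch == NULL)
        return STATUS_NOT_READY;
    return module_dispatch(REQ_QUEUE_SEND, queue, msg);
}

uint32_t mod_thread_sleep(uint32_t ticks)
{
    if (module_dispatch == NULL)
        return STATUS_NOT_READY;
    return module_dispatch(REQ_THREAD_SLEEP, ticks, 0);
}
```

Under this assumption, a service request costs one indirect function call and a switch, with no interrupt entry or exit, which is the kind of saving the trap-free interface is after.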
Although there is only one copy of the module manager, there is no limit on the number of modules that can be loaded at the same time, and no limit on the number of threads in any one module. In this manner, the kernel resides in a distinct execution entity, running continuously to serve module requests.
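A single module manager tracking any number of loaded modules might look like the following sketch. It assumes each module image begins with a small descriptor that the manager validates before registering the module. The module_preamble_t layout, the MODULE_MAGIC value, and the function names are hypothetical, and the step of copying the image into target memory is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODULE_MAGIC 0x4D4F4455u /* hypothetical ID value: ASCII "MODU" */

/* Hypothetical descriptor placed at the start of every module image. */
typedef struct {
    uint32_t magic;      /* identifies a valid module image     */
    uint32_t code_size;  /* bytes of code in the module         */
    uint32_t data_size;  /* bytes of data the module needs      */
    void (*entry)(void); /* starts the module's initial thread  */
} module_preamble_t;

/* Per-module bookkeeping; the caller supplies the storage, so the
   manager itself imposes no fixed limit on the number of modules. */
typedef struct module_instance {
    const module_preamble_t *preamble;
    struct module_instance *next;
} module_instance_t;

typedef struct {
    module_instance_t *loaded; /* linked list of loaded modules */
    unsigned count;
} module_manager_t;

/* Validate a module image and add it to the manager's list. */
int manager_load(module_manager_t *mgr, module_instance_t *inst,
                 const module_preamble_t *image)
{
    assert(mgr != NULL && inst != NULL);
    if (image == NULL || image->magic != MODULE_MAGIC || image->entry == NULL)
        return -1; /* reject anything that is not a valid module */
    inst->preamble = image;
    inst->next = mgr->loaded;
    mgr->loaded = inst;
    mgr->count++;
    return 0;
}

/* Invoke the entry point of every loaded module. */
void manager_start_all(const module_manager_t *mgr)
{
    assert(mgr != NULL);
    for (const module_instance_t *m = mgr->loaded; m != NULL; m = m->next)
        m->preamble->entry();
}

/* Stand-in module entry point used to exercise the manager. */
unsigned demo_started;
void demo_entry(void) { demo_started++; }
```

Keeping the list intrusive means memory for tracking modules grows only as modules are actually loaded, which suits a small-footprint kernel.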
As shown in Figure 5 below, for maximum efficiency, application threads alternatively can be linked with the kernel and reside with the kernel in target memory as part of its executable image.
Figure 5. In the downloadable application module (DAM) architecture, application threads alternatively can be linked with the kernel and reside with the kernel in target memory as part of its executable image.
While this option avoids the need to reload the modules containing these threads and provides the most efficient interface, it increases the size of the resident kernel image, leaving less memory for use by modules, and forgoes the opportunity to dynamically replace these application threads.
Downloadable application modules enable the RTOS to dynamically load and run additional application threads beyond those linked with the kernel. Applications gain increased functionality without the cost of an increased footprint or additional memory, and while retaining an efficient service call interface.
This technique also provides on-demand reconfiguration and updates for deployed systems. Downloadable application module technology ideally suits situations where application code size exceeds available memory (Figure 6, below), when new modules need to be added after a product is deployed, or when partial firmware updates are required.
Another advantage of downloading separate application modules is that each module can be developed by its own team or individual programmer. The team can then focus on one aspect of a product’s functionality, without having to be concerned with the underlying details.
Figure 6. The DAM approach is best suited for situations where the app code size exceeds available memory, when new modules need to be added after a product is deployed, or when partial firmware updates are required.
A disadvantage of such an approach is the increased risk of malfunction when the module is inserted into the system and has to interact with the kernel and other modules.
Developers have increased concern about whether the external modules can be trusted to “play nicely” with the other “children.” Worse than a program bug that produces errors or causes the module to crash is a bug that causes contamination of another module or the RTOS kernel itself.
Despite testing, there is risk of accidental stack or memory corruption due to an erroneously calculated pointer or array limit, or stack overflow. These faults can be catastrophic and difficult to find, especially if only one portion of the development team is familiar with the offending module.
The challenge grows when the failure must be traced back to offending code that resides in a completely different module from the one that failed. The extreme difficulty of tracing such catastrophic failures to their source makes it all the more important to avoid such errors.
To protect systems from this type of accidental corruption, it is beneficial to prevent module threads (Figure 7, below) from accessing locations beyond their own module’s memory, and to provide RTOS services that perform message passing instead.
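Message passing of this kind can be illustrated with a small queue that copies each message into storage owned by the queue, so the sender and receiver never share a pointer. This is a minimal single-threaded sketch under assumed names and sizes (msg_queue_t, MSG_SIZE, QUEUE_DEPTH). A real RTOS service would also suspend and resume threads and guard the queue against concurrent access.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MSG_SIZE    16 /* hypothetical fixed message size in bytes */
#define QUEUE_DEPTH 4  /* hypothetical number of message slots     */

/* Queue storage belongs to the kernel, not to any module. */
typedef struct {
    unsigned char slots[QUEUE_DEPTH][MSG_SIZE];
    size_t head;  /* index of the oldest message     */
    size_t count; /* number of messages in the queue */
} msg_queue_t;

/* Copy up to MSG_SIZE bytes from the sender into the next free slot. */
int queue_send(msg_queue_t *q, const void *msg, size_t len)
{
    assert(q != NULL && msg != NULL);
    if (len > MSG_SIZE || q->count == QUEUE_DEPTH)
        return -1; /* message too large or queue full */
    size_t tail = (q->head + q->count) % QUEUE_DEPTH;
    memset(q->slots[tail], 0, MSG_SIZE); /* pad short messages with zeros */
    memcpy(q->slots[tail], msg, len);
    q->count++;
    return 0;
}

/* Copy the oldest message out; 'out' must hold MSG_SIZE bytes. */
int queue_receive(msg_queue_t *q, void *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1; /* queue empty */
    memcpy(out, q->slots[q->head], MSG_SIZE);
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return 0;
}
```

Because the data is copied in both directions, a sender that later overwrites its own buffer cannot alter a message already in the queue.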
Figure 7. To protect systems using DAMs it is necessary to prevent module threads from accessing locations beyond their own module memory.
This kind of protection helps prevent modules and the kernel from accidental corruption due to errant writes or jumps, or even errant reads if desired. Using the system’s MMU or MPU, memory boundary registers can be set to constrain each module’s code to accesses within its own memory.
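The boundary test that such hardware applies to each access can be modeled in software as follows. This is only an illustration of the check itself: mem_region_t and region_allows are hypothetical names, and programming an actual MMU or MPU is device-specific and not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the memory a module is allowed to touch. */
typedef struct {
    uintptr_t base; /* first address of the module's memory */
    size_t size;    /* length of that memory in bytes       */
} mem_region_t;

/* Return true only if the access [addr, addr + len) lies entirely
   inside the region. Written so that addr + len cannot overflow. */
bool region_allows(const mem_region_t *r, uintptr_t addr, size_t len)
{
    assert(r != NULL);
    if (len == 0 || addr < r->base)
        return false; /* empty access, or starts below the region */
    uintptr_t offset = addr - r->base;
    return offset < r->size && len <= r->size - offset;
}
```

An access that starts inside the region but runs past its end is rejected along with one that starts outside it, which covers the errant pointer and array limit cases described earlier.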
Downloadable Application Modules provide a breakthrough for small-footprint applications, which require the reliability and responsiveness of a small RTOS, but also benefit from this new functionality to achieve an even greater range of features, maintainability, and modularity in their designs.
Further, with memory protection, any desired level of granularity—from one thread to an unlimited number—can be protected and prevented from unintended access, eliminating a common cause of difficult-to-diagnose program crashes.
John A. Carbone, vice president of marketing for Express Logic, has 35 years of experience in real-time computer systems and software, ranging from embedded system developer and FAE to vice president of sales and marketing. Mr. Carbone has a BS degree in mathematics from Boston College.