The OSEK/VDX Standard: Operating System and Communication
Joseph Lemieux
As the great RTOS debate continues to unfold, another dimension has recently entered the discussion. In addition to deciding whether to write your own RTOS or purchase a commercial one, developers of embedded systems must now decide whether the RTOS they ultimately choose should be compatible with the OSEK/VDX standard. The OSEK/VDX standard actually comprises three substandards: an operating system standard (OS), a communication standard (COM), and a network manager standard (NM). In addition, an OSEK/VDX implementation language (OIL) has been defined. In this first installment of a two-part article, I will provide an introduction to the OSEK/VDX standard, its history, its operating system and communication substandards, and its target products. The second part of the article, to be published sometime later this year, will focus on the network manager and implementation language components. After reading the entire article, you should be able to determine whether compatibility with OSEK/VDX is an appropriate requirement for your next RTOS.
Background
The OSEK/VDX standard is a combination of standards that were originally developed by two separate consortia and later merged. OSEK, which draws its name from a German acronym that translates approximately to "Open systems and the Corresponding interfaces for Automotive electronics," was founded in 1993 as a joint development effort of the German companies BMW, Bosch, Daimler Benz (now DaimlerChrysler), Opel, Siemens, and Volkswagen and the University of Karlsruhe, Germany.
VDX, which is an acronym for Vehicle Distributed eXecutive, was originally defined as part of a joint effort by the French companies PSA and Renault. The VDX group merged with the OSEK group in 1994. Today, many other companies from different sectors of embedded systems development have joined the OSEK/VDX effort. The list of member companies includes such key players as Hewlett-Packard, Motorola, NEC, and Texas Instruments. For simplicity, I'll refer to the joint standard as OSEK for the rest of this article.
The increasing cost of software development motivated the creation of the standard. As the number of microcontrollers in automobiles and other complex systems increases rapidly, the need for software developers is increasing faster than colleges can turn out qualified graduates. The original members of the committee recognized that there were high recurring costs attributed to non-value-added software, including the operating system/kernel, the network management, and the I/O processing. The goal was to define a standard architecture, and a standard API, which could be used by any automotive OEM or supplier.
With a standard architecture, colleges and universities can train engineers on it, reducing the cost and risk to companies in the industry. The barriers to changing from one microcontroller to another are lowered because highly portable software can be written to the OSEK API rather than to a unique OS. (This includes traditional commercial RTOSes that may not be available for the new microcontroller.)
Originally, OSEK was targeted as a standard open architecture for automotive Electronic Control Units (ECUs) distributed throughout the vehicle. However, the resulting standard is generic and does not limit usage to an automotive environment. Consequently, this standard can be used in many stand-alone and networked devices, such as in a manufacturing environment, household appliances, intelligent transportation system devices, and so forth.
OSEK architecture
An application that uses the OSEK architecture can take on a few different forms. The two basic forms utilizing all components of the standard can be seen in Figures 1 and 2. I describe each component in detail later in the article. The difference between the two forms is how the application handles the interface to the hardware.
In the first form, the application addresses the I/O layer directly through an I/O API. This API isn't defined in the OSEK standard due to the varying requirements of different applications. This form has the advantage of rapid response to a request for I/O information from the application task. The drawback is that portability of application tasks may be limited.
As an example, consider a device that requires input of vehicle speed. In one version of the device, this input may be in the form of a pulse stream into the hardware, and the information is obtained via a call to the I/O layer. In another version, the vehicle speed is obtained from another microcontroller over a network such as CAN or J1850, and the information is obtained via a call to the OSEK COM module. Due to this small difference, the application is not 100% reusable. If vehicle speed is used in many application tasks, the effort to port the software from one version to the next may be daunting (not to mention the possibility of errors).
The second form treats the I/O layer as an OSEK task. In this form, every application task requests information from and sends information to the OSEK COM module. Consequently, a change to the source or destination of the information requires one change to the COM message, which is automatically cascaded to every application task. The drawback is that the processing of the message by COM may take longer than processing the information directly from the I/O layer.
Other forms can be derived that use only some of the components of the OSEK standard. Each component can be designed to operate independently of the other components. In particular, the COM component does not assume that it is operating in an OSEK OS environment.
Operating system
The first component of the OSEK standard is the operating system. A common misperception among engineers is that OSEK is simply an RTOS. Although the OS is a large portion of the standard, the power of OSEK comes from the integration of each component and the development of a standard architecture. The operating system is composed of a number of objects as shown in Figure 3.
The OS also provides error handling (used primarily during development) and hooks for user-defined functions to track changes in system state.
Tasks
In the OSEK OS, tasks can be basic or extended and preemptive or non-preemptive. The primary difference between a basic task and an extended task is whether the task can go into a waiting state (in which it is waiting for an event to occur). Only extended tasks can wait for an event. Basic tasks must run to completion unless preempted. Preemptive tasks can be preempted by a higher-priority task becoming ready to run or by an interrupt. Non-preemptive tasks can only be preempted by an interrupt (unless interrupts are disabled, as I shall describe later).
The concept of two types of tasks requires a new idea called a conformance class to describe the specific implementation of the OSEK OS and the system services that are available to the applications. Four conformance classes are defined: Basic Conformance Classes 1 and 2 (BCC1, BCC2) and Extended Conformance Classes 1 and 2 (ECC1, ECC2). As the names suggest, implementations that conform to the basic classes require only basic tasks, while implementations that conform to extended classes require extended tasks (in addition to basic tasks).
The numbers 1 and 2 in the conformance class names indicate the number of requests per task for basic tasks, and the number of tasks per priority for all tasks. BCC1 and ECC1 have only one task per priority and basic tasks can only be requested once. BCC2 and ECC2 allow multiple tasks per priority and multiple requesting of basic tasks.
Upgradability of the implementation is enabled by allowing tasks written for BCC implementations to run under ECC implementations, and tasks written for level 1 conformance to run under level 2 implementations. Table 1 shows the different conformance classes and the features supported.
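The upward-compatibility rule can be written down as a small check. The following is only a sketch: the enum and the function names are hypothetical and are not part of the OSEK API. It assumes a task is portable to an implementation when the implementation is at least as capable in both task type (basic or extended) and level (1 or 2).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding of the four conformance classes named in the text. */
typedef enum { BCC1, BCC2, ECC1, ECC2 } ConformanceClass;

/* True for the extended classes, which add extended (waiting) tasks. */
bool is_extended(ConformanceClass c)
{
    return c == ECC1 || c == ECC2;
}

/* The trailing digit of the class name: 1 or 2. */
int class_level(ConformanceClass c)
{
    return (c == BCC1 || c == ECC1) ? 1 : 2;
}

/* A task written for class `task` can run under class `impl` when the
   implementation supports the task type and at least the same level. */
bool can_run_on(ConformanceClass task, ConformanceClass impl)
{
    assert(task >= BCC1 && task <= ECC2);
    assert(impl >= BCC1 && impl <= ECC2);
    bool type_ok  = is_extended(impl) || !is_extended(task);
    bool level_ok = class_level(impl) >= class_level(task);
    return type_ok && level_ok;
}
```

Under these assumptions, can_run_on(BCC1, ECC2) is true, while can_run_on(ECC1, BCC2) is false because a basic implementation offers no waiting state.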
The existence of different conformance classes allows development of a wide range of applications across control units of varying complexity while still allowing reuse of code. For example, consider a cruise control application like the one shown in Figure 4. This application is simple and would consist of just four basic tasks running in a small control unit, probably on an 8-bit microcontroller. The input processing task would execute periodically (based on an alarm, which I'll describe later) and sample and filter all inputs. The algorithm calculation task would calculate the desired state and speed of the vehicle. It would execute continuously as a low-priority task that can be preempted. The output processing task would execute periodically and control the throttle. The speed control task would be the highest-priority task, run periodically, and would determine the desired throttle position to meet the set vehicle speed. All tasks could be preempted. The OS would conform to BCC1, run with a single stack, and the scheduler could be implemented very simply.
Now, suppose that after this cruise control system is in production for a few years, the auto manufacturer decides to offer cruise control on all vehicles as standard. After investigating all of the control units on the vehicle, the manufacturer finds that the anti-lock braking system (ABS) module has enough spare throughput, memory, and I/O, and already has many of the sensors required to run the cruise control software as well. However, the ABS module is much more complex and is running at conformance class ECC2.
With OSEK, the effort to integrate the cruise control functionality would be minimal. Specifically, it would consist of the following five steps:
- Modify the input processing of the ABS to recognize the new inputs of the cruise control (command switches from the driver and throttle position). Brake switch and vehicle speed are already available
- Modify the hardware interface of the output processing to control the speed control regulator
- Define the algorithm calculation and speed control tasks directly in the ABS OS definition. No change to the source code is required
- Compile and merge into the ABS application
- Test
Due to the upward compatibility of the tasks and conformance classes within OSEK, the integration should proceed rapidly and with minimal risk. Without a standard like OSEK in place, the algorithm calculation and speed control tasks would probably have had to be completely rewritten for the ABS module's specific software environment.
Each OSEK task must be in one of only four states: suspended, ready, running, or waiting. As I mentioned earlier, only extended tasks can enter the waiting state. The four task states are defined as follows:
- Suspended: The task is not in the ready queue and is therefore ineligible to run
- Ready: The task is ready to run and the scheduler may choose to run it (based on its priority and that of other ready tasks, as well as the preemption rules)
- Running: The task is currently running. Only one task will be running at any given instant
- Waiting: The task is waiting for an event to occur
Each task also has a priority, with higher numbers indicating higher priority. The OSEK standard does not define a maximum priority. Each implementation is free to define its own. Tasks can be moved into the ready state when one of the following events occurs:
- The task is commanded into the ready state by an explicit task activation command (ActivateTask() or ChainTask() system service)
- An alarm expires that activates the task
- A message is received that activates the task
- An event upon which the task is waiting occurs
Tasks in the ready state reside in the ready queue in priority order, and tasks of equal priority are executed on a first-in, first-out basis.
Tasks move to the suspended state upon termination (ChainTask() or TerminateTask() system service), and move to the waiting state when they wait for an event that has not yet occurred (WaitEvent() system service).
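The task states and the moves between them can be summarized as a small state machine. This is a host-side sketch only; the type and function names are hypothetical, and a real OSEK kernel performs these transitions internally in response to the system services named above.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SUSPENDED, READY, RUNNING, WAITING } TaskState;

/* Hypothetical names for the causes of a state change. */
typedef enum {
    TR_ACTIVATE,   /* ActivateTask()/ChainTask(), alarm, or message */
    TR_START,      /* scheduler selects the task                    */
    TR_PREEMPT,    /* a higher-priority task takes the processor    */
    TR_WAIT,       /* WaitEvent() on an event not yet set           */
    TR_RELEASE,    /* the awaited event is set                      */
    TR_TERMINATE   /* TerminateTask()/ChainTask()                   */
} Transition;

/* Applies the transition if it is legal from the current state.
   Returns true and updates *state on success, false otherwise. */
bool apply_transition(TaskState *state, Transition t, bool is_extended_task)
{
    assert(state != 0);
    switch (t) {
    case TR_ACTIVATE:
        if (*state == SUSPENDED) { *state = READY; return true; }
        break;
    case TR_START:
        if (*state == READY) { *state = RUNNING; return true; }
        break;
    case TR_PREEMPT:
        if (*state == RUNNING) { *state = READY; return true; }
        break;
    case TR_WAIT:
        /* Only extended tasks may enter the waiting state. */
        if (*state == RUNNING && is_extended_task) { *state = WAITING; return true; }
        break;
    case TR_RELEASE:
        if (*state == WAITING) { *state = READY; return true; }
        break;
    case TR_TERMINATE:
        if (*state == RUNNING) { *state = SUSPENDED; return true; }
        break;
    }
    return false;
}
```

A basic task that attempts TR_WAIT is rejected and stays in the running state, which mirrors the rule that basic tasks run to completion unless preempted.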
The task is moved from ready to running by the scheduler. The function of the scheduler varies based on whether the running task can be preempted. For non-preemptive tasks, the scheduler runs when one of the following occurs:
- A task is terminated (ChainTask() or TerminateTask() system service)
- The scheduler is called explicitly (Schedule() system service)
- An extended task transitions into the waiting state (WaitEvent() system service)
For preemptive tasks, the scheduler runs when one of the following occurs:
- A task is terminated (ChainTask() or TerminateTask() system service)
- An extended task transitions into the waiting state (WaitEvent() system service)
- A task is moved from suspended to ready (ActivateTask() system service)
- An event is set (SetEvent() system service)
- A message arrives that activates a task or sets an event
- An alarm expires that activates a task or sets an event
The scheduler is also considered a resource that can be locked, thereby inhibiting rescheduling during a critical section of the code.
Interrupts
OSEK defines three levels of interrupt service routines (ISRs). The difference between each level is whether OS system services are called. Level 1 ISRs run independently of the OS and execute the fastest. An example is transmitting a stream of serial data previously buffered, or driving a PWM output signal.
A level 2 ISR provides a frame in which an application function that contains an OS call is executed. An example of this level is the receipt of a pulse that must be processed immediately.
A level 3 ISR is a hybrid in which code that doesn't call an OS service coexists with code that calls a service. In this case, code that makes OS service calls is enclosed in two calls: EnterISR() and LeaveISR(). An example of this level of ISR is the receipt of a serial stream of data. The ISR knows how long the stream is, and buffers the stream until the end, at which time an OS service is called to send the message to an application. The only time that EnterISR() and LeaveISR() are called is after the last character of the stream is received.
After EnterISR() is called, the level 3 ISR can activate tasks, enable and disable interrupts, set events as having occurred, and start, reset, and stop alarms. However, rescheduling does not occur until LeaveISR() is called. The last statement in the ISR must always be LeaveISR().
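The serial-stream example can be sketched as follows. To keep the sketch compilable on a host, EnterISR(), LeaveISR(), and ActivateTask() are replaced here by simple stand-ins that only count calls; their signatures, the frame length, and the task identifier are assumptions made for illustration, not the definitions of any particular OSEK implementation.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_LEN 4u              /* hypothetical fixed stream length */
#define SERIAL_HANDLER_TASK 1     /* hypothetical task identifier     */

/* Host-side stand-ins for the OS services (assumed signatures). */
int isr_depth = 0;                /* >0 while inside an OS-aware ISR frame */
int activations = 0;              /* number of task activations requested  */

void EnterISR(void) { isr_depth++; }
void LeaveISR(void) { isr_depth--; }   /* a real OS may reschedule here */
void ActivateTask(int task_id)
{
    (void)task_id;
    assert(isr_depth > 0);        /* OS calls are only legal inside the frame */
    activations++;
}

unsigned char rx_buf[FRAME_LEN];
size_t rx_count = 0;

/* Level 3 style ISR: buffer bytes without touching the OS, and enter
   the OS frame only once the whole stream has arrived. */
void SerialRxISR(unsigned char byte)
{
    rx_buf[rx_count++] = byte;
    if (rx_count == FRAME_LEN) {
        rx_count = 0;
        EnterISR();
        ActivateTask(SERIAL_HANDLER_TASK);
        LeaveISR();               /* last statement of the OS-aware part */
    }
}
```

For the first three bytes the routine behaves like a level 1 ISR and makes no OS call; only the fourth byte pays the cost of the EnterISR()/LeaveISR() frame.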
Interrupts may be checked, disabled, and enabled. Unlike a generic enable and disable interrupt routine provided by a compiler, this interface allows different interrupts to be disabled or enabled based on a mask that is sent to the routine. The interrupt descriptor is specific to the implementation of OSEK (due to differences in microcontrollers). However, a mask can be created by the application that can be configured for each OSEK implementation. For example, a mask called TIMER_INTERRUPTS might be defined to inhibit interruption of the task by the timer module. In implementations where there are no timers, this would be defined as zero; in other implementations, in which there are only timer interrupts, it may be the global interrupt enable mask; and in still others, it may be a combination of specific interrupts.
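A minimal sketch of the mask idea follows. The bit assignments and the two routines are hypothetical; the real descriptor type and service names are defined by each OSEK implementation. Only the name TIMER_INTERRUPTS comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical interrupt source bits for one particular microcontroller. */
#define IRQ_TIMER0 (1u << 0)
#define IRQ_TIMER1 (1u << 1)
#define IRQ_UART   (1u << 2)

/* Application-level mask; this is the only line that would change when
   porting (0 on a part with no timers, the global mask on a part that
   has only timer interrupts, and so on). */
#define TIMER_INTERRUPTS (IRQ_TIMER0 | IRQ_TIMER1)

/* Simulated interrupt-enable register. */
uint32_t enabled_irqs = IRQ_TIMER0 | IRQ_TIMER1 | IRQ_UART;

/* Stand-ins for the mask-based OS services (assumed names). */
void disable_interrupts(uint32_t mask) { enabled_irqs &= ~mask; }
void enable_interrupts(uint32_t mask)  { enabled_irqs |= mask; }
```

Application code would bracket a timer-sensitive section with disable_interrupts(TIMER_INTERRUPTS) and enable_interrupts(TIMER_INTERRUPTS), leaving the UART interrupt untouched.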
Events
Events are used to synchronize different tasks. Each event is "owned" by an extended task. Any task, including basic tasks, can set an event. Only the owner task can clear the event or wait for the event.
Listing 1 provides an example code segment to write to EEPROM. Task A is a high-priority extended task and is activated by a COM message. It begins the write of the message to EEPROM and creates another message that is sent to low-priority basic task B, which monitors the actual write to EEPROM. It then waits for the WriteComplete event to occur. At this time, the low-priority task executes (if no higher-priority tasks are pending) and monitors the write operation. When the write completes, it sets the event. Task A then preempts Task B, clears the event, and terminates. Task B then resumes and terminates.
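The sequence just described can be mimicked on a host with a small simulation. This is not Listing 1: because a host program cannot block in WaitEvent(), Task A is split into a "start" half and a "resume" half, and every name below (the event bit, the helper functions, the poll counter) is a hypothetical stand-in rather than an OSEK service.

```c
#include <assert.h>
#include <stdbool.h>

#define WRITE_COMPLETE 0x01u      /* hypothetical event bit owned by Task A */

unsigned task_a_events = 0;       /* events currently set for Task A */
bool task_a_waiting = false;      /* true while Task A is in the waiting state */
int eeprom_busy_polls = 3;        /* simulated polls until the write finishes */

/* Stand-ins for SetEvent()/ClearEvent() acting on Task A's events. */
void set_event_for_task_a(unsigned mask)
{
    task_a_events |= mask;
    task_a_waiting = false;       /* the waiting task becomes ready again */
}
void clear_event_for_task_a(unsigned mask) { task_a_events &= ~mask; }

/* Task A, first half: start the EEPROM write, notify Task B, then wait. */
void task_a_start(void)
{
    /* (begin EEPROM write and send the message to Task B here) */
    task_a_waiting = true;        /* models WaitEvent(WRITE_COMPLETE) */
}

/* Task B: low-priority basic task that monitors the write. */
void task_b(void)
{
    while (eeprom_busy_polls > 0) {
        eeprom_busy_polls--;      /* poll the simulated EEPROM status */
    }
    set_event_for_task_a(WRITE_COMPLETE);
}

/* Task A, second half: runs once the event has been set. */
void task_a_resume(void)
{
    assert(task_a_events & WRITE_COMPLETE);
    clear_event_for_task_a(WRITE_COMPLETE);  /* only the owner clears it */
    /* (TerminateTask() would follow here) */
}
```

Calling task_a_start(), task_b(), and task_a_resume() in that order reproduces the hand-off: Task B sets the event, and Task A, as its owner, clears it.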
Resource management
Resource management controls access to shared resources such as memory, hardware, and the like. The scheduler is a special resource that can also be locked by tasks. To eliminate priority inversion and deadlock, OSEK employs a priority ceiling protocol. This protocol temporarily increases the priority of the task that has locked the resource so that no other tasks that access the resource can be running while the resource is locked. However, all tasks with a priority higher than the highest-priority task with access to the resource can still run.
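The core of the priority ceiling idea fits in a few lines. The sketch below uses hypothetical types and function names and assumes the ceiling is the highest base priority among the tasks that use the resource; it also ignores nested resources, which a real implementation must handle.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int base_priority;      /* statically assigned priority        */
    int current_priority;   /* may be raised while holding a lock  */
} Task;

/* The ceiling of a resource: the highest base priority among the
   tasks that are allowed to access it. */
int ceiling_priority(const Task *users, size_t n)
{
    assert(n > 0);
    int ceiling = users[0].base_priority;
    for (size_t i = 1; i < n; i++) {
        if (users[i].base_priority > ceiling) {
            ceiling = users[i].base_priority;
        }
    }
    return ceiling;
}

/* Locking raises the task to the ceiling so that no other user of the
   resource can preempt it. */
void lock_resource(Task *t, int ceiling)
{
    if (t->current_priority < ceiling) {
        t->current_priority = ceiling;
    }
}

/* Unlocking restores the base priority (no nesting in this sketch). */
void unlock_resource(Task *t)
{
    t->current_priority = t->base_priority;
}
```

With users at priorities 2, 5, and 3, the ceiling is 5. A priority-2 task holding the resource runs at 5, so a priority-3 user cannot preempt it, while an unrelated priority-6 task still can.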
Alarms and counters
Alarms and counters are tools used to synchronize task activation with recurring events. An alarm is statically assigned to one counter, one task, and one action. The action could be either to activate a task or set an event.
Counters are measured in ticks and can represent time, number of pulses received, and so on. One counter, the timer counter, is provided by each implementation. This counter can be used to schedule periodic events. Other counters are manipulated through an API that is specific to each implementation of the OSEK OS. Consequently, counter control code written for one vendor's OSEK OS would have to be rewritten if the software is ported to a different vendor's OSEK OS, even on the same microcontroller.
Two types of alarms are available: cyclic and single. Cyclic alarms can be used to schedule a task that must occur periodically. When an alarm is set, it can be set to a relative or absolute value of the counter. The value of the counter and the cycle can be dynamically allocated when the alarm is set. Consequently, a single alarm can be single, cyclic, set relative to the counter, and set absolute to the counter at different locations in the application.
An example of using alarms is in scheduling periodic tasks to activate. If there are four tasks (A, B, C, and D, all of the same priority) and each task needs to be executed every 40ms, four alarms could be set up and started during hook routine StartupHook(). Task A would be set to execute at a relative time of 0ms, Task B at a relative time of 10ms, Task C at a relative time of 20ms, and Task D at a relative time of 30ms. All tasks would cycle at 40ms. The effect of this is to limit the latency for each task from the time that it is activated until it runs. None of the tasks will have to wait on any other task, unless a task takes more than 10ms to complete.
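The staggering arithmetic can be checked with a short calculation. This sketch does not use any OSEK alarm service; the function and constant names are hypothetical, and it simply models four cyclic alarms with the offsets and cycle time given above.

```c
#include <assert.h>

#define NUM_TASKS 4
#define CYCLE_MS  40

/* Relative start offsets for Tasks A, B, C, and D. */
const int offset_ms[NUM_TASKS] = { 0, 10, 20, 30 };

/* Returns the index (0 = A ... 3 = D) of the task whose alarm expires
   at time t_ms, or -1 if no alarm expires at that instant. */
int task_due_at(int t_ms)
{
    assert(t_ms >= 0);
    for (int i = 0; i < NUM_TASKS; i++) {
        if (t_ms >= offset_ms[i] && (t_ms - offset_ms[i]) % CYCLE_MS == 0) {
            return i;
        }
    }
    return -1;
}
```

Because the offsets are distinct modulo the 40ms cycle, at most one alarm expires at any instant, so each task gets its own 10ms slot.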
Error handling, hooks, tracing, and debugging
OSEK provides minimal run-time error handling. However, during development, additional error handling is provided via extended return functionality. The reason is that after a product is released into production, most of the errors that could occur would have been detected during testing (such as "invalid task ID," "resource still occupied," "illegal call from interrupt level," and so forth). At run time, most system services return no error. However, some services, such as alarms that can be dynamically started and stopped, return an error if the alarm is already being used.
OSEK defines two types of errors: application errors and fatal errors. If an application error occurs, the ErrorHook() routine is called. Application errors are ones in which the internal data integrity is still valid, but the application tried to perform an illegal operation (for example, an attempt to activate a task that does not exist).
Fatal errors occur when the OS determines that the internal data integrity has been destroyed. They call the ShutdownOS() system service directly. The service then calls the ShutdownHook() routine with the error that occurred.
Hook routines are provided by the OS and are optional. If a hook routine doesn't exist, then the OS doesn't call it. Five hook routines are available and are described in Table 2. These hook routines are powerful tools for tracing and debugging the execution of the system.
The routine StartupHook() can be used to start alarms that schedule periodic tasks, activate tasks based on application mode, send initialization messages to other modules, and initialize application modules to a known state. The routine ShutdownHook() can be used to shut down application modules and clear out interrupts that you no longer need. The other routines are primarily useful for tracing and debugging purposes.
Communication
The communication specification (COM) provides an interface for multiple application modules to communicate via messages. In addition to providing for interprocess communication, COM also provides for communication between microcontrollers in a multiprocessor module as well as between controllers over a network. If implemented, the network may be one of CAN or J1850. (Theoretically, the network can be of any type, including Ethernet.) The application modules do not have knowledge of the physical location of the sender or recipient.
The COM model consists of a number of layers that correspond roughly to the ISO/OSI seven-layer model. The layers, as shown in Figure 5, are: application, interaction, network, data link, and physical. The interaction layer corresponds roughly to the presentation layer of the ISO/OSI model. The session and transport layers of the ISO/OSI model do not exist in the COM specification.
The interaction layer provides the application programming interface for COM. It consists of a small number of interfaces that are used to send and receive messages, check status, and lock and release the message resource. This simple interface encapsulates a powerful system that greatly increases the portability of application modules. Messages that are intended for local processes are handled totally by the interaction layer. Messages intended for transmission over a network or to another microcontroller in the same module are passed to the network layer.
The network layer provides services to the interaction layer to transfer messages over a network. It segments messages into frames if segmentation is supported by the chosen COM conformance class. If a message is unsegmented, it is passed directly to the data link layer. The data link layer handles the protocol of the message over the chosen network. Multiple data link layers may exist in a given application. I will not describe these layers in depth here because they are intended to be transparent to the application programmer.
COM operation
The communication model is asynchronous. Messages are sent and received in parallel with the application. Consequently, when a message is requested to be sent, the application doesn't yet know if the message was successfully transmitted. Since messages sent over the network will be in progress when COM returns control of the processor to the application, there must be a method to provide success or failure indications. COM accomplishes this indication by using specific OS functions: activating a task, setting an event, or through expiration of an alarm.
Since COM was developed independently of the OS subsystem, there is no requirement for an OSEK-compliant OS. However, the chosen OS must provide the previously mentioned functions.
Figure 6 provides an illustration of a typical message transaction for a sent message that transmitted successfully and activated a task.
Conformance classes
COM implementations are also defined by conformance classes. Table 3 outlines the different options available under each conformance class.
To ensure data consistency in a preemptive system, COM provides messages that are sent and received with or without copy. The with copy option will make a copy of the data inside the COM routine, thereby guaranteeing that the data does not change midway through the transmission of the message. For messages defined without copy, the COM service will only update the information and will start a transmission. The lower layers will obtain the information directly from the data location defined for the message. Unless you are operating in a non-preemptive system, you should always use the with copy option. Messages defined as without copy typically use global variables, which is not a good practice.
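The difference between the two options shows up as soon as the application changes its data after requesting a send. The following sketch uses a hypothetical message structure and function names, not the COM API, to contrast a snapshot taken at send time with a reference to the application's own storage.

```c
#include <assert.h>
#include <string.h>

#define MSG_SIZE 4

/* Hypothetical transmit-side message object. */
typedef struct {
    unsigned char copy[MSG_SIZE];   /* private snapshot (with copy)       */
    const unsigned char *source;    /* application storage (without copy) */
    int with_copy;
} TxMessage;

/* Request a transmission. With copy, the data is captured now; without
   copy, only the location of the data is remembered. */
void send_message(TxMessage *m, const unsigned char *data, int with_copy)
{
    m->with_copy = with_copy;
    m->source = data;
    if (with_copy) {
        memcpy(m->copy, data, MSG_SIZE);
    }
}

/* What the lower layers would read when the frame is actually built. */
const unsigned char *payload(const TxMessage *m)
{
    return m->with_copy ? m->copy : m->source;
}
```

If a preempting task overwrites the application buffer between the send request and the actual transmission, the with copy message still carries the original bytes, while the without copy message carries whatever is in the buffer at that moment.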
Two types of message objects are available: queued and unqueued. Queued objects are received into and dispatched from a FIFO buffer. After it has been received, the message data is no longer available in the buffer. If no messages have been received, an error is returned. Queued messages are always sent and received using with copy. Unqueued messages contain a single object in COM. When a message is received from an unqueued message object, the data still resides in the object.
For network messages, a number of characteristics are available. The first is whether the message is segmented or unsegmented. Segmented messages allow the message to be split up into multiple messages and sent over the network. This isn't applicable to messages local to a single ECU. Unsegmented messages are always the same size and are always sent in one network message. Message size is defined as static or dynamic. Dynamic messages can change size during run time and are always transferred using segmented messages, while static messages may be sent using either segmented or unsegmented messages. Addressing of the messages may also be static or dynamic. COM limits static messages to having static addressing, and dynamic messages to having dynamic addressing.
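The two receive behaviors can be contrasted with a small ring buffer and a single-slot object. The types, function names, and queue depth below are hypothetical and are not taken from the COM specification.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_DEPTH 4

/* Queued message object: a FIFO ring buffer of values. */
typedef struct {
    int slots[QUEUE_DEPTH];
    int head;    /* index of the oldest entry */
    int count;   /* number of stored entries  */
} QueuedMsg;

/* Unqueued message object: holds only the most recent value. */
typedef struct {
    int value;
} UnqueuedMsg;

/* Returns false if the queue is full. */
bool queued_send(QueuedMsg *q, int v)
{
    if (q->count == QUEUE_DEPTH) {
        return false;
    }
    q->slots[(q->head + q->count) % QUEUE_DEPTH] = v;
    q->count++;
    return true;
}

/* Removes the oldest entry; returns false (an error) if none is present. */
bool queued_receive(QueuedMsg *q, int *out)
{
    if (q->count == 0) {
        return false;
    }
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return true;
}

void unqueued_send(UnqueuedMsg *m, int v) { m->value = v; }

/* Reading does not consume the data; it can be read again. */
int unqueued_receive(const UnqueuedMsg *m) { return m->value; }
```

Receiving from the queued object consumes the entry, so a second receive on an empty queue reports an error, whereas the unqueued object returns the same value on every read until it is overwritten.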
Network messages also have different transmission modes. Direct transmission mode requires that the application request the sending of the message. Periodic transmission mode automatically sends the message at a predefined period set at compile time. Mixed transmission mode sends the message periodically, but also sends a message immediately on update that creates a relevant change, even if the period hasn't expired. Relevant changes are defined as one of the following:
- New value less than a constant
- New value greater than a constant
- New value equal to a constant
- Change in value less than a constant
- Change in value greater than a constant
- Change in value equal to a constant
- Always upon an update
- Never upon an update (same as periodic mode)
Intermediate message transmission doesn't reset the period timer.
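The list of relevant changes maps naturally onto a predicate that mixed mode could evaluate on each update. The enum and function below are hypothetical names, and the sketch assumes that "change in value" means the absolute difference between the old and new values; an actual COM implementation may define it differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names for the relevant-change rules listed above. */
typedef enum {
    NEW_LESS, NEW_GREATER, NEW_EQUAL,
    DELTA_LESS, DELTA_GREATER, DELTA_EQUAL,
    ALWAYS, NEVER
} RelevantChange;

/* Returns true if an update from old_value to new_value should trigger
   an immediate transmission under the given rule. */
bool is_relevant(RelevantChange rule, int old_value, int new_value, int constant)
{
    int delta = abs(new_value - old_value);   /* assumed: magnitude of change */
    switch (rule) {
    case NEW_LESS:      return new_value < constant;
    case NEW_GREATER:   return new_value > constant;
    case NEW_EQUAL:     return new_value == constant;
    case DELTA_LESS:    return delta < constant;
    case DELTA_GREATER: return delta > constant;
    case DELTA_EQUAL:   return delta == constant;
    case ALWAYS:        return true;
    case NEVER:         return false;  /* behaves like periodic mode */
    }
    return false;
}
```

For example, with the rule DELTA_GREATER and a constant of 5, an update from 10 to 4 triggers an immediate send, while an update from 10 to 8 waits for the next period.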
Finally, message transmission and reception can be monitored by COM. If a transmitted message is monitored, an alarm is started when the message transmission request is sent to the lower layers of COM. The alarm is canceled when the transmission confirmation is received. If the alarm expires, the alarm actions will be taken. Periodic message reception may also be monitored by an alarm. Each time a message is received, the alarm is restarted. If the alarm expires, the alarm actions are taken.
Implementation details
Definition of message objects requires two steps. The first step is to define the message object itself. The OSEK standard doesn't describe how this can be performed. A typical implementation of OSEK COM may require the message to be defined with the following parameters:
- Message size in bytes
- Queued or unqueued
- Network or local
For messages transmitted over the network, the following parameters may need to be defined:
- Transmission mode (direct, periodic, or mixed type)
- Periodic or mixed mode value
- Static or dynamic size/addressing
- Size/address for static messages
- Monitoring alarm
For messages received from the network, the following parameters may need to be defined:
- Static or dynamic size/addressing
- Size/address for static messages and address for dynamic messages
- Monitoring alarm
The second step is to define the usage for each task. The message may be sent or received, with or without copy, and activate a task, set an event, or start or clear an alarm for each task using the message. As an example, look at the system diagram in Figure 7. One message exists, A, that is sent by Task A.1 and is received by tasks A.2, B.1, and C.1. Only one task may be defined as sending a message. For Task A.1, the message is sent with copy and starts an alarm. Task A.2 is activated when the message is sent and receives the message with copy. Task B.1 receives the message from the network with copy and sets an event. Finally, Task C.1 receives the message from the network without copy and activates a task. Figure 8 shows the flow of information whenever a message is sent.
It would have been enough
If the OSEK/VDX standard offered only the operating system and communication standards I've described here, it would still be a powerful base on which to build a system architecture for small, dedicated embedded systems. OSEK/VDX also offers the network manager and the implementation language tools.
Joseph Lemieux is a senior applied specialist with EDS Embedded Solutions in Troy, MI. He holds an MSEE from the University of Michigan and has been writing software for embedded systems in the automotive industry for over 17 years. During this time, Joe has used both homegrown and commercial operating systems. He can be reached by e-mail at joe.lemieux@eds.com.