A SystemC-Based RTOS Model for Multiprocessor Systems-on-Chips: Part 1 - Basic Concepts and Terminology - Embedded.com

A SystemC-Based RTOS Model for Multiprocessor Systems-on-Chips: Part 1 – Basic Concepts and Terminology

With the increasing complexity of embedded systems and the capacity ofmodern silicon technology, there is a trend toward heterogeneousarchitectures consisting of several programmable as well as dedicatedprocessors, implemented on a single chip, known as systems-on-chips(SoCs).

As more applications are implemented in software that in turn isgrowing larger and more complex, dedicated operating systems will haveto be introduced as an interface layer between the application softwareand the hardware platform.

Global analysis of such heterogeneous systems is a big challenge.Typically, two aspects are of interest when one is considering globalanalysis: the system functionality, in general, and the system timingand resource sharing, in particular.

As many embedded applications are reactive in nature and havereal-time requirements, it is often not possible to analyze themstatically at compile time. Furthermore, for single-chip solutions, wemay need to use nonstandard real-time operating systems (RTOS) in orderto limit the code size and hence the memory requirements, or tointroduce special features interacting with the dedicated hardware,such as power management.

When implementing an RTOS, we may wish to experiment with differentscheduling strategies in order to tailor the RTOS to the application.For a multiprocessor platform, we may wish to study the system-leveleffects of selecting a particular RTOS implementation on one of theprocessors.

To study these effects at the system level, before anyimplementation has been done, we need a system-level model that is ableto capture the behavior of running a number of RTOS on a multiprocessorplatform.

We refer to such a system-level model as an abstract model. In thisfirst part on concepts and terminology and followed by succeeding partson uniprocessors (Part 2) and then multiprocessors (Part 3), wedescribe an approach to modeling embeddedapplications represented as multithreaded applications executed on amultiprocessor platform running a number of, possibly different, RTOS.

Figure 10-1 below illustrates a possible SoC design flow in whichthe application, modeled as a set of tasks, is mapped onto aheterogeneous multiprocessor platform.

Figure10-1

During the mapping, each task is assigned to an appropriateprocessor, and the scheduling policies of the RTOS are determined. Byexecuting the application, properties of the system such as latency andresource utilization can be analyzed.

Validation of multiprocessor RTOS is typically done afterimplementation through an interactive process of experimentation onprototypes, measurement of performance, tuning of various parameters,and so on, until the performance meets the system requirements. In amore desirable development process, an extra step of modeling andsimulation before (or during) implementation would shorten the timerequired for validation.

However,  there is a lackof complete methodologies and tools that cover all the aspectspertaining to the modeling of modern heterogeneous systems in asatisfactory manner. Several approaches have been devised that tend toprovide the designers with suitable modeling tools.

For example, one approach uses a methodology based on the principlesof composition for modeling real-time systems, although the challengesimplied by the modeling of real-time systems implemented onmultiprocessor platforms are not mentioned.

Another approach is based on the functional description ofmultiprocessor real-time systems giving modeling capabilities andautomatic generation of different system components, including theoperating system.

However, the focus of both of these methods is at an abstractionlevel lower than the one we propose and uses a high-level performancemodel for multithreaded multiprocessor systems. It is based on modelingthe layer of schedulers in an abstract manner, which resembles the aimof our approach.

Others have focused on providing RTOS modeling on top of existingsystem-level design languages (SLDLs), either for open languages suchas SpecC and the RTOS library of SystemC 3.0 or for proprietarylanguages such as SOCOS of OCAPI and TAXYS. All these approaches offerfunctional models of the RTOS, allowing its emulation, on top of whichfunctional models of software applications can be implemented.

In the early stages of design, even before a functionalimplementation exists, we can benefit from a methodology that canprovide us with design strategy guidelines, in our specific case,helping us to select the right combination of scheduling,synchronization, and allocation algorithms for the RTOS running onmultiprocessor platforms.

One methodology, based on abstract modeling, uses SpecC but it islimited to modeling preemptive schedulers for single-processor systems.

Our approach is based on an abstraction of the functionality of thesystem. We logically partition a multiprocessor system into itscomponents and devise a simple model for each component that willsimulate its functionality using a set of parameters. A framework tomodel abstract application software is them execited on amultiprocessor platform under the supervision of abstract RTOS.

The framework is based on SystemC, by the use of which we inheritall its properties, such as multiple abstraction levels, synchronousand asynchronous modeling facilities, predefined components, andprogressive refinement of models.

The rest of the series is organized as follows. In this first part,weintroduce the basic concepts and terminology of real-time systems, withspecial emphasis on scheduling as well as an overview of our abstractRTOS model.

In Part 2 in this series,there is a detailed description of the model based on theclassical uniprocessor system. In Part3 in this series , the uniprocessor model isextended to a more general multiprocessor model.

We will start the discussion in succeding pages with a presentationof the basic principles of scheduling, as well as, in particular, amodel for the tasks to be scheduled and a model of the platformarchitecture on which to schedule those tasks.

Platform Architecture
A platform architecture, Ak, can be modeled as a set of interconnectedcomponents. We distinguish among three types of components:

1) Processingelements (PEk ) which are active components capable ofexecutingone or more function(s). A processor may be a programmablegeneral-purpose processor, a programmable dedicated processor such as asignal processor, or a coprocessor designed for a specific dedicatedpurpose.

2) Devices (Dk ) which are fixed operation components. A device may be a memorycomponent or an input/output (I/O) component, for instance, a sensor, adisplay, or an actuator.

3)Communication (Ck ) which consists of communicationchannelsbetween components, i.e., processing elements and devices.Communication channels may contain nodes (N) and a single link (L), asin a simple bus, or they may contain a complex structure of nodes andlinks, as in a network, such as a double torus network based onwormhole routing.

A component can be considered as a resource that may be in use whenexecuting a task ti . Here we assume that aplatform architecture is characterized by the resources used by thetasks.

Figure10.2.

Tasks
A task,t i , models afunction (or parts of afunction) to be executed and is the basic schedulable entity. A taskmay be implemented in software executing on a PE or directly asdedicated hardware, e.g., a coprocessor.

In this series we assume that tasks are software tasks, although ourmodel is not limited to software tasks. A task is characterized by its:

Timing constraints (Fig. 10-2above ), in which:

rk , is the release time at which the k'thinstance oft i becomes active or available forexecution.

o i is the offset from thetime zero at which the first release of t i is initiated (this is alsocalled the phase).
 
s k is the start time at which the k'th instance of t i actually startsexecution

csw i is the context switching time for t i

wceti is theworst case execution time (WCET) for t i

bcet i is the best case execution time (BCET)  for t i

e i,k is theactual execution time of the k'th instance of t i .This may be calculated by a function returning a (random) value in theinterval [bcet i ;wcet i ] .

dk is thedeadline at which the k'thinstance of t i should be completed. The relativedeadline, Di , is constant andis calculated as dk rk .

T i ,is the period at which a periodic taskis repeated (only valid for periodic tasks).

Taskexecution frequency , whichmay be periodic, in which case, the period defines the time betweensuccessive task executions; aperiodic ,i.e., nonperiodic, typicallycorresponding to the tasks that handle the events from the environmentthat cannot happen at regular points in time; or sporadic, which isaperiodic but with a hard real-time constraint. From the point of viewof modeling, an aperiodic task may be modeled as a periodic task whoseperiod is a randomly generated value of the worst-case release orinterarrival time (WCRT).

Precedenceconstraints, which impose apartial order on the execution of tasks. Precedence relations may beused to model data transmissions between tasks, i.e., when one taskneeds the result(s) from another task in order to execute. It may alsobe used to express explicit synchronization. A precedence relation fromt i to t j isexpressed as t i > t j .

Resourceconstraints, which define theresources required by the task in order to execute. Hence, a task, t i ,which reads data from a sensor, Dsensor ,makes some computation, PE cpu ,and then stores the result in a memory device, Dmem, using the systembus, Dbus, will require all four resources to be available during itsexecution. We specify this resource requirement as r (t i) {PEcpu , Dsensor , Dmem ,Dbus } .

Task assignment, which defines themapping of a task t i   to aprocessing element PEa .This mapping isexpressed as t i,a in a short form.

Basics of Scheduling
From a system point of view, the major task of an RTOS is to determinethe execution order of the tasks, i.e., to perform task scheduling. Theprocess of mapping an application, represented as a set of tasks, ontoa platform architecture involves three processes, which are highlyinterdependent:

allocation , which determines the numberand type of processors and resources. Typically, this is static andgiven by the platform architecture. However, in case of areconfigurable platform, the process of allocation may be dynamic,i.e., performed during run time.

assignment/binding, which binds tasks toprocessors, i.e., spatial binding. This process may be static, in whichtasks are bound before running the application, or dynamic, in whichtasks are bound during run time. The latter allows for task migration,whereby tasks may be moved between processors between or duringexecutions.

scheduling, which determines theexecution order, i.e., temporal binding. This process may be static(off-line) or dynamic (on-line), as for the binding process.

In this series, we assume that allocationand binding have already been done as a part of the system designprocess, i.e., they are both static. Hence the problem we have to solveis that of scheduling.

Scheduling can be executed off-line, inwhich the schedule is obtained before the system is put into operation,i.e., at compile time, or on-line, in which the schedule is obtainedduring the execution of the system, i.e., at run time.

On-line schedulers are often list-basedin the sense that tasks are placed in a ready list as soon as they arereleased for execution. It is then the job of the scheduler to selectthe next task to execute among the ready tasks. The exact selectionstrategy is called the scheduling policy, and it is here that theschedulers differ.

An important classification ofpriority-based scheduling policies is:

competitive/preemptivescheduling , i.e.,the scheduler is allowed to stop the execution of a task, before it hasfinished execution, in order to execute another, possibly moreimportant, task and then resume the execution of the preempted task.

cooperative/non-preemptivescheduling , i.e., an executing task is allowed to run to completion regardless ofits importance.

Another classification is based on whenpriorities are assigned to tasks:

static priorityassignment, whereby taskpriorities are determined and fixed off-line. This is used in ratemonotonic scheduling (RMS), in which priorities are determinedbased on the periods of the tasks, i.e., short periods are givenhighest priority.

dynamicpriority assignment , whereby taskpriorities are determined and updated online. This is used in earliestdeadline first (EDF) scheduling, whereby the highest priority atany given time is given to the task with the closest deadline.

Where RMS is preemptive, EDF schedulingmay be both preemptive and non-preemptive.

Basic system model
At the system level, the application software may be modeled as a setof tasks which have to be executed on a number of programmableprocessors under the control of one or more RTOS.

Our system model is designed followingthe principle of composition and a proof rule for composing modularspecifications that consist of both safety and progress properties,based on a semantic model that is not tied to any particularspecification language or logic. It is a key tool for incrementalsystem construction.

A problem in specifying and verifyinglarge systems is how to modularize a system and then prove that thecomponents working together satisfy the overall system specification.This problem is compounded when a system is reactive and concurrent, asin a distributed operating system.

An advantage of composing modularspecifications is that each module can be specified and provenindependently, instead of having to prove the entire system as a singlelarge entity. Another advantage is that the system can be incrementallyverified by composing a new module with a previously verifiedcomposition of modules.

To realize these advantages, a systemmust be decomposable into well-defined, relatively simple modules. Theinterfaces between the modules must also be well defined so that theeffect of the behavior of one module on the behavior of another can becompletely and formally specified.

Figure10.3.

Our system model consists of three typesof basic components: tasks, RTOS services, and links, with the linksproviding communication between other system components. The RTOSservices are decomposed into independent modules that model differentbasic RTOS services, as illustrated in Figure10-3 above:

1) a scheduler models areal-timescheduling algorithm.

2) a synchronizer models thedependenciesamong tasks and hence both intra- and interprocessor communications.

3) an allocator models themechanism ofresource sharing among tasks.

In succeeding parts in this series, weassume that each processor can run just a single scheduling algorithm.In a multiprocessor platform, we will have a number of schedulersrepresenting the same number of processors, whereas synchronization andallocation may be represented by a single instance of each model.

One of the objectives of the frameworkis to provide the user with an easy way to describe an application as aset of tasks and the RTOS services, such as a scheduling algorithm. Wemodel a task such that it carries all the pertinent information, i.e.,computation time, period, deadline, and so on.

Figure10.4.

In Figure 10-4 above , t1 is declared as aperiodic task, perTask , with aperiod of 50 cycles, a BCET of 5 cycles,a WCET of 12 cycles, a deadline of 63 cycles, and an offset of 8cycles. t1 is then connectedto the clock and the two links: link1 andlink2 .

As a way to maintain composition, eachcomponent handles its relevant data independently of the others. Forexample, a task determines when it is ready to run and when it hasfinished.

In this way, the RTOS scheduler behavesin a reactive manner, scheduling tasks according to the data receivedfrom them. Thus, we can add as many tasks and schedulers as we desire.

The same is the case with thesynchronizer and the allocator models. They hold the informationregarding their services, i.e., which tasks depend on each other or,for the case of the allocator, what resources are needed by a giventask.

As illustrated in Figure 10-3 ,we use aglobal clock to measure time in terms of clock cycles, i.e., using anabstract time unit. This allows us to identify the time at which a taskis ready to be executed or when a task has finished its execution.

Part 2: UniprocessorSystems
Part 3: MultiprocessorSystems

This series of articles is based oncopyrighted material submitted by Jan Masden, Kashif Virk and MercuryJair Gonzalez to “MultiprocessorSystems-On-Chips,” edited by Wayne Wolf and Ahmed Amine Jerraya. Itis used with the permission of the publisher, Morgan Kaufmann, animprint of Elsevier. The book can be purchased on-line.

Ahmed Jerraya is researchdirector with CNRS and is currently managing research on multiprocessorsystem-on-chips at TIMA Laboratory in France. Wayne Wolf is currently the GeorgiaResearch Alliance Eminent Scholar holding the Rhesa “Ray” S. Farmer,Jr. Distinguished Chair in Embedded Computer Systems at Georgia Tech'sSchool of Electrical and Computer Engineering (ECE). Previously aprofessor of electrical engineering at Princeton University, he workedat AT&T Bell Laboratories.

JanMadsen is professor of computer basedsystems at the Department of Informatics and Mathematical Modeling atthe Technical University of Denmark where he heads the System-On-ChipDesign Group.

KashifVirk works for Jan Masden inpreparation for a doctorate. He has a B.Sc. in Electrical Engineeringfrom the University of Engineering and Technology, Lahore, Pakistan andan M.Sc. from the Technical University of Denmark.

Mercury JairGonzalez is a designengineer at the Semiconductor Technology Center of CINVESTAV in Mexico.His research interests are VLSI design methods, MP SoC design methods,and the timing analysis of real-time systems.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.