Modifying any part of the Linux kernel source code is usually a challenging task most software developers would prefer to avoid, composed as it is thousands of code lines divided by hundred of files.
In this article we describe some of the techniques we have used to simplify this job during the implementation of a new scheduling policy for Linux 2.6.24 kernel version, based on the well known real-time earliest deadline first(EDF) scheduling algorithm. (Source code for the scheduling policy implementation described in this article can be downloaded from sourceforge.net/
The introduction of scheduling classes in the Linux 2.6.23 kernel version has, of course, made the core scheduler quite extensible. The scheduling classes encapsulate scheduling policies and are implemented as modules . Then, the kernel consists of a core scheduler and various modules. These modules are hierarchically organized by priority and the scheduler dispatcher looks for a runnable task of each module in a decreasing priority order.
By in large, with the introduction of scheduling classes, implementing a new scheduling policy for Linux kernel is a much easier task. Nevertheless, there are problems in the small details, and in determining which parts of the kernel code are necessary to “touch” and which are not.
Our purpose here is to neither describe the Linux scheduler itself nor the Linux kernel. For that you must use these two references:  . Here our focus will be on the implementation details we employed.
Our SCHED_CASIO 1 Linux Scheduler (SCLS) is the platform we used to implement the EDF scheduling algorithm by modifying the general purpose Linux 2.6.24 kernel version. (The CASIO name comes from the name of a discipline called Conceitos Avancados de Sistemas Operativos (CASIO) of the Master course in Informatics Engineering of the Polytechnic Institute of Porto .
Scheduling in Real-Time Systems
Real-time systems are defined as those systems in which the correctness of the system depends not only on the logical result of computation, but also on the time at which the results are produced . That is, the computation results must be delivered within a time bound. This time bound is referred to as deadline. Typically, a real-time system consists of a controlling (computer) and a controlled (environment) systems.
The controlling system interacts with its environment based on information available about the environment. So, the controlling system receives periodically information about the environment through sensor devices and acts on the environment through the actuator devices.
A real-time application is normally composed of multiple tasks with different levels of criticality. Hard real-time tasks cannot miss any deadline, otherwise, undesirable or fatal results would occur, while soft real-time tasks can miss some deadlines and the system can still work correctly.
For example, heart pacemaker is a hard real-time system because a delayed signal may cause the death of person whereas a live audio-video system is, usually, categorized as soft real-time system because missing a deadline results in degraded quality to the user, but the system can continue operating.
Real-time Task Parameters
Before we get into the details of the scheduler implementation it is important to understand some of the more important real-time task parameters. A real-time task (τi ) is characterized by the following parameters: Release time (or ready time) (Ri ), Worst case execution time (Ci ), Deadline (DJ and Periodicity (Tii ).
Ri is the time at which the task is ready for processing. Ci is the processor time required for executing the task without interruption. Di is the time at which a task should be completed to avoid damage to the system. Ti is the time interval at which tasks are released in the system.
As a consequence of the periodicity, a real-time task Ti generates an infinite sequence of jobs (Jij ), where j is used to identified the jth instance of task τi . According to the periodicity, there are basically three kinds of tasks: periodic, sporadic and aperiodic.
Periodic tasks are those that are released regularly at fixed periods. Sporadic tasks are activated irregularly but its minimum inter-arrival period is known, which is, a minimum interval between two consecutive job releases. Aperiodic tasks are activated with an unknown periodicity.
Aperiodic tasks are behind the scope of this document and the sporadic tasks can be assumed as a subset of the periodic tasks. Since, the minimum inter-arrival period can be assumed as their period.
For a given set of jobs the main goal of a scheduling algorithm is to determine an execution order according to which the requirements of each job are satisfied .
In a real-time system the main purpose of the scheduling algorithm is to complete the execution of all jobs before their deadlines. There are several scheduling algorithms for real-time tasks. However, the most used and studied are: Rate Monotonic  and Earliest Deadline First .
Rate Monotonic (RM) assigns priorities statically according to the periodicity. The shorter period the higher is the task's priority. The Earliest Deadline First (EDF) is a dynamic scheduling algorithm that assigns the highest priority to the job with the earliest deadline.
This is an optimal scheduling algorithm on preemptive uniprocessors. The EDF algorithm can achieve an utilization of 100% if the task set presents periods equal to the deadlines (Ti = Di) for all tasks. The utilization of a system is computed as follows: U = ∑n i=1 x (Ci /Ti ).
Let us consider the following task set example (Table 1 below ), which utilization of the task set is 72.2%. Note that, in the last column of the Table 1 the offset (Oi ) of the first job of each task is specified. That is, the time instant at which the first job is released.
Table 1. Task set
Figure 1 below shows the timeline execution for the first job (only) of each task of the task set presented in Table l. The execution of the jobs is represented by gray rectangles and a black circle states the end of execution of a job.
As we can see, the priority assignment is done according to the absolute deadline (which is the sum of Ri + Di ) . That is, at a specific time instant the job that is executing is the one with the earliest deadline of all active jobs.
Figure l. Timeline execution according to the EDF algorithm
SCHED CASIO Linux Scheduler
The SCHED_CASIO Linux Scheduler (SCLS) implementation consist on a set of modifications to the Linux 2.6.24 kernel version to support real-time tasks scheduled according to the EDF scheduling algorithm.
To differentiate these tasks from other tasks present in the system, in this document, we refer to these tasks as CASIO tasks or CASIO jobs. Note that CASIO tasks are periodic tasks and are always present in the system. We also assume that, a CASIO tasks code structure must be similar to the algorithm presented in Listing 1 below .
Listing 1. CASIO task algorithm
SCHED CASIO_POLICY Configuration Option
The first step is to add a new configuration option entry that is used to wrap all the SCLS required code. Since the host system is based on x86 architecture, this configuration option entry must be added to the /kernel_source_code/arch/x86/kconfig file.
Listing 2 below shows the content of the new configuration option entry. A clarifying name (SCHED_CASIO_POLICY) defines what this option entry represents. However, this configuration option entry is referred in the code with CONFIG_SCHED_CASIO_POLICY. The CONFIG_ prefix is assumed but is not written.
The bool directive states that this option entry is a feature and can only be set using two values (y or n). The quoted text following the directive provides the name of this option in the various configuration utilities, like make menuconfig. The default value of this option is defined using default directive.
Listing 2. SCHEDCASIO POLICY configuration option entry
SCHED CASIO macro
The second step is to define a macro to identify the scheduling policy. For that, we have to change /kernel-source- code/include/linux/sched.h and /usr/include/bits/sched.h files in order to define this macro (Listing 3 below ).
Note that, the /usr/include/bits/sched.h file is outside of the kernel code, then the CONFIG_SCHED_CASIOPOLICY is unknown and consequently must be commented or removed.
Listing 3. Definition of the SCHED_CASIO macro
struct task_struct and struct rq are two central data structures in the system. In this section the changes made to these two data structures are explained in detail.
A Linux process is an instance of a program in execution . To manage processes, the kernel maintains information about each process in a process descriptor.
The information stored in each process descriptor (struct task_struct , defined in /kernel source-code/include/ linux/sched.h ) concerns with the run-state of a process, its address space, the list of open files, the process priority and its scheduling class, just to mention some.
All process descriptors are stored in a circular doubly-linked list. Note that, in the context of this document, the meaning of a process or a task is the same. In order to deal with CASIO tasks some fields must be added to this data structure.
Listing 4 below shows the fields added. Field casio_id is used to set the logical identifier of a CASIO task. The relative deadline (using nanosecond time unit) of each CASIO task is set on the deadline.
Listing 4. Fields added to the struct task_struct data structure
Each processor holds a run-queue of all runnable processes assigned to it. The scheduling policy uses this run-queue to select the “best” process to be executed. The information for these processes is stored in a per-processor data structure called struct rq, which is declared in the /kernel_source_code/kernel/sched.c .
The information about each CASIO task is stored using the struct casio task data structure (Listing 5 below ). Thus, task field is a pointer to the process descriptor entry. The absolute deadline is stored on the absolute-deadline field.
A data type struct rb node field is required for organizing CASIO tasks on a red-black tree (casio node). The Linux kernel has already implemented red-black tree (/kernel-source- code/include/linux/rbtree.h ).
Basically, red-black trees are balanced binary trees whose nodes are sorted by a key, consequently, most operations are done in O(log(n) ) time, thus a red-black tree is adequate for situations where nodes come and go frequently. In the SCLS implementation, the key for the red-black tree is the absolute deadline.
As we can see in the Listing 5 below, a struct list head casio_list_node field is defined. This field is required to organize all CASIO tasks present on the system in a double linked list.
The struct list-head data structure is defined in the /kernel_source_code/include/ linux/list.h file, which implements an easy-to use double linked list using C programming language.
All CASIO tasks assigned to one processor are managed using the struct casio_rq data structure. They are stored in a linked list, which list head is the casio list field.
The root of the red-black tree is the field casio task root. and nr_running field is used to specify the number of CASIO tasks on the run-queue. A data field of type struct casio_rq had to be added to the struct rq rq data structure.
Listing 5. Definition of CASIO specific data structures: struct casio task and struct casio_rq
The data structures of the scheduler and the run-queue are initialized in the sched_init function that is defined in /kernel_source_ code/kernel/sched.c file. Since new field was added to the struct rq rq , then there is the need to initialize this data field.
Listing 6. sched_init function
Listing 6 above shows the invocation of the init_casio_rq within the sched_init function, which initializes the casio_rq fields.
Next in Part 2: Building a new scheduling policy module.
1. Source code for the SCHED CASIO Linux Scheduler
1. A. Kumar. Multiprocessing with the completely fair scheduler. Technical report, IBM, 2008.
2. D. Bovet and M. Cesati. Understanding The Linux Kernel. O Reilly & Associates Inc, 2005.
3. Wolfgang Mauerer. ProfessionalLinux Kernel Architecture. Wiley Publishing, Inc., 2008.
4. C. L. Liu and J. W. Layland. Scheduling algorithms for multiprogramming in a hard-real-time environment. J. ACM, 20(1):46-61, 1973.
5. John A. Stankovic. Misconceptions about real-time computing: A serious problem for next-generation systems. Computer, 21(10):10-19, 1988.
6. Arezou Mohammadi and Selim G. Akl. Technical report no. 2005-499 scheduling algorithms for real-time systems. 2005.
7 . C. L. Liu. Scheduling algorithms for hard-real-time multiprogramming of a single processor. JPL Space Programs Summary, II(1):37-60, 1969.
Luis Lino Ferreira is a researcher on QoS Architectures and algorithms for mobile distributed systems at CISTER Research Group and Professor at the Polytechnic Institute of Porto.)