Effective use of Pthreads in embedded Linux designs: Part 1 – The multitasking paradigm

The heavyweight “process model”, historically used by Unix systems, including Linux, to split a large system into smaller, more tractable pieces, doesn’t always lend itself to embedded environments owing to its substantial computational overhead. POSIX threads, also known as Pthreads, is a multithreading API that looks more like what embedded programmers are used to but runs in a Unix/Linux environment.

This technique has been employed successfully for at least the past quarter century to build highly responsive, robust computer systems that do everything from flying space shuttles to decoding satellite TV programs. While multitasking operating systems have been common in the world of embedded computing, it is only fairly recently—within the past decade—that multitasking has made its way into the Unix/Linux world in the form of “threads” – the standard thread API known as POSIX 1003.1c, Posix Threads, or Pthreads for short.

From the perspective of an embedded systems developer familiar with off-the-shelf real-time operating systems, Linux appears to be unnecessarily complex. Much of this complexity derives from the protected memory environment in which Unix evolved. So we should start by reviewing the concept of Linux “processes”. Then we’ll see how threads differ, but at the same time, how the threads approach to multitasking is influenced by its Unix heritage.

As shown in Figure 1 , the basic structural element in Linux is a process consisting of executable code and a collection of resources like data, file descriptors and so on. These resources are fully protected such that one process can’t directly access the resources of another. In order for two processes to communicate with each other, they must use the inter-process communication mechanisms defined by Linux such as shared memory regions or pipes.

Figure 1: Processes vs. Threads

This is all well and good as it establishes a high degree of protection in the system. An errant process will most likely be detected by the operating system and thrown out before it can do any damage. But there’s a price to be paid in terms of excessive overhead in creating processes and using the inter-process communication mechanisms.

Protected memory systems are divided into User Space and Kernel Space. Normal applications execute as processes in fully protected User Space. The operating system kernel executes in Kernel Space. This means that every time a kernel service is called, read() or write() for example, the system must jump through some hoops to switch from User Space to Kernel Space and back again. Among other things, data buffers must be copied between the two spaces.

A thread on the other hand is code only. Well, ok it’s code and a context, which is for all practical purposes a stack that can store the state of the thread when it isn’t executing. Threads only exist within the context of a process and all threads in one process share its resources. Thus all threads have equal access to data memory and file descriptors. This model is sometimes called lightweight multitasking to distinguish it from the Unix process model.

The advantage of lightweight tasking is that intertask communication is more efficient. The drawback of course is that any task can clobber any other task’s data.

Historically, most off-the-shelf multitasking real time operating systems, such as VRTX and VxWorks, have used the lightweight multitasking model. Recently, as the cost of processors with memory protection hardware has plummeted, and the need for reliability has increased, many vendors have introduced protected mode versions of their operating systems.

The Interrupt. Let us digress for a moment to consider the essence of the asynchronous programming paradigm. In real life, “events” often occur asynchronously while you’re engaged in some other activity. The alarm goes off while you’re sleeping. The boss rushes into your office with some emergency while you’re deep in thought on a coding problem. A telemarketer calls to sell you insurance while you’re eating dinner.

In all these cases you are “interrupted” from what you were doing and are forced to respond. How you respond depends on the nature and source of the interrupt. You chew out the telemarketer, slam the phone down and go back to eating dinner. The interruption is short if nonetheless irritating. You stop your coding to listen to the boss’s perceived emergency. You may have to drop what you’re doing and go do something else. When the alarm goes off, there’s no question you’re going to do something else. It’s time to get up.

In computer terms, an interrupt is the processor’s response to the occurrence of an event. The event “interrupts” the current flow of instruction execution, invoking another stream of instructions that services the event. When servicing is complete, control normally returns to where the original instruction stream was interrupted. But under supervision of the operating system, control may switch to some other activity.

Interrupts are the basis of high performance, highly responsive computer systems. Perhaps not surprisingly they are also the cause of most of the problems in real-time programming.

Consider a pair of threads acting in a simple producer/consumer paradigm (Figure 2 ). Thread 1 produces data that it writes to a global data structure. Thread 2 consumes the data from the same structure. Thread 1 has higher priority than Thread 2 and we’ll assume that this is a preemptive system. (Set aside for the moment that I haven’t defined what preemptive is. It should become clear.)

Figure 2: Interrupts cause problems

Thread 1 produces data in response to some event, i.e. data is available from the source, an A/D converter perhaps. The event is signaled by an interrupt. The interrupt may be from the A/D converter saying that it has finished a conversion, or it may simply be the timer tick saying that the specified time interval has elapsed. This leads to some problems.

The problem should be relatively obvious. The event signaling “data ready” may occur while Thread 2 is in the middle of reading the data structure. Thread 1 preempts Thread 2 and writes new data into the structure. When Thread 2 resumes, it finds inconsistent data.
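
To make the hazard concrete, here is a minimal sketch of an unprotected producer/consumer pair along the lines of Figure 2. The data structure, field names and loop counts are hypothetical, and the Pthreads calls it uses are explained later in this article; the point is only that a preemption between the consumer’s two reads lets the producer change the structure underneath it.

   /* A sketch of the unprotected producer/consumer of Figure 2.
    * The data structure, field names and loop counts are hypothetical. */
   #include <pthread.h>
   #include <stdio.h>

   struct sample {                 /* global data structure shared by both threads */
       int raw;
       int scaled;                 /* invariant: scaled should equal raw * 10 */
   } shared;

   void *producer(void *arg)       /* "Thread 1": each iteration stands in for one event */
   {
       for (int i = 0; i < 1000000; i++) {
           shared.raw = i;
           shared.scaled = i * 10;
       }
       return NULL;
   }

   void *consumer(void *arg)       /* "Thread 2": reads both fields */
   {
       for (int i = 0; i < 1000000; i++) {
           int raw = shared.raw;         /* a preemption between these two reads    */
           int scaled = shared.scaled;   /* lets the producer rewrite the structure */
           if (scaled != raw * 10)
               printf("inconsistent: raw=%d scaled=%d\n", raw, scaled);
       }
       return NULL;
   }

   int main(void)
   {
       pthread_t t1, t2;
       pthread_create(&t1, NULL, producer, NULL);
       pthread_create(&t2, NULL, consumer, NULL);
       pthread_join(t1, NULL);
       pthread_join(t2, NULL);
       return 0;
   }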

The Multitasking paradigm. This, then, is the essence of the real-time programming problem: managing asynchronous events such that they can be serviced only when it is safe to do so.

Fundamentally, multitasking is a paradigm for safely and reliably handling asynchronous events. Beyond that it is also a useful way to break a large problem down into a set of much smaller, more tractable problems that may be treated independently. Each part of the problem is implemented as a thread. Each thread does one thing to keep it simple. Then we pretend that all the threads run in parallel.

To reiterate, threads are the way in which multitasking is implemented in Unix-like systems. With this background we can now begin exploring the world of Posix threads.

The Threads API
Creating a Thread. The mechanism for creating and managing a thread is analogous to creating and managing a process, as follows:

   int pthread_create (pthread_t *thread, pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);
   void pthread_exit (void *retval);
   int pthread_join (pthread_t thread, void **thread_return);
   pthread_t pthread_self (void);
   int sched_yield (void);

The pthread_create() function is like fork() except that the new thread doesn’t return from pthread_create() but rather begins execution at start_routine() , which takes one void * argument and returns void * as its value. The arguments to pthread_create() are:

  • pthread_t – A thread object that represents or identifies the thread. pthread_create() initializes this as necessary.
  • Pointer to a thread attribute object. More on that later.
  • Pointer to the start routine.
  • Argument to be passed to the start routine when it is called.

A thread may terminate by calling pthread_exit() . The argument to pthread_exit() is the start routine’s return value.

In much the same way that a parent process can wait for a child to complete by calling waitpid() , a thread can wait for another thread to complete by calling pthread_join() . The arguments to pthread_join() are the ID of the thread to wait on and a place to store the thread’s return value. The calling thread is blocked until the target thread terminates.

A thread can determine its own ID by calling pthread_self() . Finally, a thread can voluntarily yield the processor by calling sched_yield() .

Note that most of the functions above return an int value. This reflects the threads approach to error handling. Rather than reporting errors in the global variable errno , threads functions report errors through their return value. This is because errno is global and visible to all threads. This makes it susceptible to the same kind of preemption problem we saw earlier in discussing interrupts.
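
As a hedged illustration of that convention, a caller can check the return value directly and pass it to strerror(); the worker() function and start_worker() wrapper here are hypothetical placeholders.

   /* Sketch: checking a Pthreads return value instead of errno. */
   #include <pthread.h>
   #include <stdio.h>
   #include <string.h>

   static void *worker(void *arg) { return arg; }   /* hypothetical do-nothing thread */

   int start_worker(pthread_t *tid)
   {
       int err = pthread_create(tid, NULL, worker, NULL);
       if (err != 0) {               /* the nonzero return value is the error code itself */
           fprintf(stderr, "pthread_create: %s\n", strerror(err));
           return -1;
       }
       return 0;
   }

   int main(void)
   {
       pthread_t tid;
       if (start_worker(&tid) == 0)
           pthread_join(tid, NULL);
       return 0;
   }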

Figure 3 is a simple example of creating a thread. To make it simpler, error return status has been ignored. This is a thread version of the traditional Hello World program.

Figure 3: Sample Thread Program

All thread programs must include the header file pthread.h . This program creates a thread, passing as the argument the first element of argv passed to main . After telling us it created a thread, the main function waits for the thread to terminate and then outputs the returned value.

The thread simply prints its argument as a string and returns the argument as its value. Note that a thread may terminate by simply returning rather than calling pthread_exit() .
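
Figure 3 itself is not reproduced here, but based on the description above it likely looks something like the following sketch (error returns ignored, as in the original):

   /* Sketch of the "thread Hello World" described as Figure 3. */
   #include <pthread.h>
   #include <stdio.h>

   void *hello(void *arg)              /* the thread's start routine */
   {
       printf("%s\n", (char *)arg);    /* print the argument as a string */
       return arg;                     /* returning is equivalent to pthread_exit(arg) */
   }

   int main(int argc, char *argv[])
   {
       pthread_t tid;
       void *ret;

       pthread_create(&tid, NULL, hello, argv[0]);   /* pass the first element of argv */
       printf("Created a thread\n");
       pthread_join(tid, &ret);                      /* wait for the thread to terminate */
       printf("Thread returned \"%s\"\n", (char *)ret);
       return 0;
   }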

Figure 4 illustrates the life cycle of a thread as represented by a state machine. A thread is “born” by the pthread_create() function, which places it in the ready state. A thread runs when it is scheduled. The running thread may be blocked because it must wait for some resource, or it may be preempted either because a higher priority thread is ready to run or its timeslice has expired.

Figure 4: Thread State Machine

When the resource becomes available a blocked thread transitions to the Ready state and will eventually be scheduled to run again. Finally, a thread is terminated when it is done or another thread requests its cancellation.

Thread Termination. A thread may be terminated either voluntarily or involuntarily. A thread terminates itself either by simply returning or by calling pthread_exit() . In the latter case, all cleanup handlers that the thread registered by calls to pthread_cleanup_push() are called prior to termination.

A thread may be involuntarily terminated if another thread cancels it. The cleanup handlers are also called in this case. We’ll return to the notion of cleanup handlers and thread cancellation later.
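
As a preview, here is a hedged sketch of how a cleanup handler might be registered; the buffer and handler names are hypothetical, and cancellation itself is covered in more detail later.

   /* Sketch: a cleanup handler that releases a (hypothetical) buffer
    * if the thread exits or is cancelled while still holding it. */
   #include <pthread.h>
   #include <stdlib.h>

   static void release_buffer(void *p)              /* the registered cleanup handler */
   {
       free(p);
   }

   static void *worker(void *arg)
   {
       char *buf = malloc(256);
       pthread_cleanup_push(release_buffer, buf);   /* runs on pthread_exit() or cancellation */

       /* ... work that might call pthread_exit() or be cancelled ... */

       pthread_cleanup_pop(1);                      /* 1 = also run the handler now */
       return NULL;
   }

   int main(void)
   {
       pthread_t t;
       pthread_create(&t, NULL, worker, NULL);
       pthread_join(t, NULL);
       return 0;
   }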

Threads have an attribute called detach state. The detach state determines whether or not a thread can be joined when it terminates. The default detach state is PTHREAD_CREATE_JOINABLE , meaning that the thread can be joined on termination. The alternative is PTHREAD_CREATE_DETACHED , which means the thread can’t be joined.

Joining is useful for two reasons: either you need the thread’s return value, or you need to be sure the thread has terminated before proceeding. Otherwise it’s better to create the thread detached. The resources of a joinable thread can’t be recovered until another thread joins it, whereas a detached thread’s resources can be recovered as soon as it terminates. Most multitasking kernels have no concept of a joinable task. All tasks are detached.

Attribute Objects. POSIX provides an open-ended mechanism for extending the API through the use of attribute objects. For each pthread object there is a corresponding attribute object. This attribute object is effectively an extended argument list to the related object create function. A pointer to an attribute object is always the second argument to a create function. If this argument is NULL the create function uses appropriate default values.

An important philosophical point is that all pthread objects are considered to be “opaque”. This means that you never directly access members of the object itself. All access is through API functions that get and set the member fields of the object.

This allows new arguments to be added to a pthread object by simply defining a corresponding pair of get and set functions for the API.

Here is part of the attribute API for thread objects:

   int pthread_attr_init (pthread_attr_t *attr);
   int pthread_attr_destroy (pthread_attr_t *attr);
   int pthread_attr_getdetachstate (pthread_attr_t *attr, int *detachstate);
   int pthread_attr_setdetachstate (pthread_attr_t *attr, int detachstate);

Before it can be used, an attribute object must be initialized. Then any of the attributes defined for that object may be set or retrieved with the appropriate functions. This must be done before the attribute object is used in a call to pthread_create() . If necessary, an attribute object can also be “destroyed”. Note that a single attribute object can be used in the creation of multiple threads.

For threads there is only one required attribute, the detach state that we met earlier.
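
For instance, a minimal sketch of creating a detached thread via an attribute object might look like this; the logger thread is a hypothetical fire-and-forget example.

   /* Sketch: using an attribute object to create a detached thread. */
   #include <pthread.h>
   #include <stdio.h>

   static void *logger(void *arg)       /* hypothetical fire-and-forget thread */
   {
       printf("logging in the background\n");
       return NULL;                     /* detached: its resources are reclaimed immediately */
   }

   int main(void)
   {
       pthread_attr_t attr;
       pthread_t tid;

       pthread_attr_init(&attr);                                     /* initialize first */
       pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);  /* set the attribute */
       pthread_create(&tid, &attr, logger, NULL);                    /* attr is the 2nd argument */
       pthread_attr_destroy(&attr);             /* safe once the thread has been created */

       pthread_exit(NULL);   /* end main without terminating the process while logger runs */
   }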

Thread Scheduling Policies
There are several optional thread attributes that have to do with scheduling policy, that is, how threads are scheduled relative to one another. These attributes and the strategies they represent are only present if the constant _POSIX_THREAD_PRIORITY_SCHEDULING is defined. Without this constant, Pthreads is not real-time and simply falls back on the default Linux scheduling policy.

A threads implementation that defines _POSIX_THREAD_PRIORITY_SCHEDULING must also define a structure, sched_param , with at least one member, sched_priority , an integer. Interestingly, here is one place where a structure member is made explicitly visible. There is not a pair of functions for setting and getting sched_priority. You simply read and write the structure member directly. Note that sched_param is declared with a struct and not as a typedef.

Here is the API for thread scheduling attributes:

   int pthread_attr_setschedparam (pthread_attr_t *attr, const struct sched_param *param);
   int pthread_attr_getschedparam (const pthread_attr_t *attr, struct sched_param *param);
   int pthread_attr_setschedpolicy (pthread_attr_t *attr, int policy);
   int pthread_attr_getschedpolicy (const pthread_attr_t *attr, int *policy);
   int pthread_attr_setinheritsched (pthread_attr_t *attr, int inherit);
   int pthread_attr_getinheritsched (const pthread_attr_t *attr, int *inherit);

   int pthread_setschedparam (pthread_t pthread, int policy, const struct sched_param *param);
   int pthread_getschedparam (pthread_t pthread, int *policy, struct sched_param *param);
   int sched_get_priority_max (int policy);
   int sched_get_priority_min (int policy);

Pthreads defines two real-time scheduling policies that can be applied on a per-thread basis:

  • SCHED_FIFO – A thread runs until another thread of higher priority becomes ready or until it voluntarily blocks. When a thread with SCHED_FIFO scheduling policy becomes ready, it runs immediately if its priority is higher than that of the running thread.
  • SCHED_RR – Much like SCHED_FIFO with the addition that a SCHED_RR thread can be preempted by another SCHED_FIFO or SCHED_RR thread of the same or higher priority after a specified time interval or timeslice.

In Linux, the real-time scheduling policies may only be set by processes with superuser privileges. When you wish to explicitly set the scheduling policy or parameters of a thread, you must also set the inheritsched attribute. By default inheritsched is set to PTHREAD_INHERIT_SCHED , meaning that newly created threads will inherit the scheduling policies of the thread that created them. To be able to change scheduling policies you set inheritsched to PTHREAD_EXPLICIT_SCHED.

When threads with SCHED_FIFO or SCHED_RR scheduling policies block waiting for a resource, they wait in priority order. That is, if several threads are waiting on a resource, the one with the highest priority is scheduled when the resource becomes available. If several threads of the same priority are waiting, the one that has been waiting longest will be scheduled (FIFO order).

You can get the minimum and maximum priority values for each of the real-time scheduling policies through the functions sched_get_priority_min() and sched_get_priority_max() . You can also get and set the scheduling policies of a running thread through pthread_getschedparam() and pthread_setschedparam() .
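
Putting the scheduling attributes together, a hedged sketch of creating a SCHED_FIFO thread with an explicit priority might look like the following; the worker and the priority offset are hypothetical, and on Linux this requires superuser privileges.

   /* Sketch: creating a SCHED_FIFO thread with an explicit priority. */
   #include <pthread.h>
   #include <sched.h>
   #include <stdio.h>

   static void *rt_worker(void *arg)   /* hypothetical real-time thread */
   {
       /* ... time-critical work ... */
       return NULL;
   }

   int main(void)
   {
       pthread_attr_t attr;
       struct sched_param param;
       pthread_t tid;

       pthread_attr_init(&attr);
       /* Without PTHREAD_EXPLICIT_SCHED the policy and priority below are ignored. */
       pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
       pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
       param.sched_priority = sched_get_priority_min(SCHED_FIFO) + 10;  /* arbitrary choice */
       pthread_attr_setschedparam(&attr, &param);

       if (pthread_create(&tid, &attr, rt_worker, NULL) != 0)
           fprintf(stderr, "pthread_create failed; superuser privileges may be needed\n");
       else
           pthread_join(tid, NULL);

       pthread_attr_destroy(&attr);
       return 0;
   }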

Finally, Pthreads defines another scheduling policy, SCHED_OTHER , which is “not defined.” In practice, this becomes the default Linux scheduling policy. In most cases it is a “fairness” algorithm that gradually raises the priority of low priority waiting threads until they run.

Read Part 2: Sharing resources

Doug Abbott is the principal of Intellimetrix , a consulting firm specializing in hardware and software for industrial and scientific data acquisition and embedded product applications. He’s also a popular instructor and seminar leader, teaching classes on Linux and real-time programming in general. Doug has taught the techniques of embedded programming and multitasking operating systems to hundreds of professional engineers. This article is based on a paper he presented as part of a class at the Embedded Systems Conference, “Introduction to Pthreads: Asynchronous Programming in the Unix/Linux Environment” (ESC-308).
