Synchronization internals — the semaphore

This two-part series addresses the use and misuse of two of the most essential synchronization primitives in modern embedded systems, the mutex  (Part 1) and the semaphore (this article). Mutexes and semaphores are provided by most current RTOSs, although they can be implemented on bare-metal systems as well. Their use is considered a best practice among experienced firmware engineers, because they can make code more intuitive, leading to better maintainability, improved logic flow and higher productivity. If you’ve been using flags, variables and super loops to react to changes and control events in your code, skillful use of mutexes, semaphores and threads can improve your work.

Semaphores are analogous to the classic, but outdated, example of phone books. Suppose that you are in an office with just 10 phone books (and no Internet). If you want to place a call, you need a phone book to look up the number. The stack of phone books, not the phone books themselves, is the semaphore. If there are phone books available in the stack, you can “take” one. When you’re done with it, you “give” it back. If there are no phone books available, you must wait until someone returns one of them to the stack. This is how semaphores work, too. I like this example better than the usual example of bathrooms and keys, because the issue of which key works in which bathroom confuses the example.

Now that we’ve established at least one use-case for semaphores, recall that, unlike mutexes, semaphores incorporate a built-in counter (a signed integer) that is typically initialized to some value N, where N is usually ≥ 1. This is useful for representing a specific number of resources that are to be controlled/protected.

So, semaphores have two basic operations:

  1. take() – Adjusts the resource counter downward and, if necessary, waits until a resource becomes available. This operation has the potential to block the calling code if no resources are available.
  2. give() – Adjusts the resource counter upward and, if necessary, signals a waiting task that a resource has become available. This operation is typically non-blocking.

As with the mutex lock() and unlock() operations, the semaphore take() and give() operations should always happen in matched pairs (but they can be in either order – more on that later). Semantically speaking, take() could just as well be called wait or acquire or lock , and give() could be called signal or release or unlock .

The idea is that as long as a semaphore counter is non-negative (i.e., >0), there are still resources remaining to be take()-en. A resource could be anything that is shared – even a worker thread. Once the counter goes negative, however, tasks have to wait until at least one of the resources becomes available again.

Semaphores are generally used in two classes of problems:

  1. Coordinating access to multiple shared resources, across one or more processes.
  2. Signaling or triggering events between one or more processes.

The phone book example from above is a classic case of problem class #1. Class #2 is slightly less intuitive, but very, very useful for building event-driven systems, and it is our next topic.

Semaphores Are Also For Signaling
As described in Part 1, mutexes can be used to protect a single resource from two or more threads of execution. Semaphores differ in that they aren’t owned by a single process, and they feature a built-in counter so they’re ideal for protecting things that be counted, such as any kind of limited resource in a computer program. Semaphores, though, can also be used for signaling events! What kind of signals can a semaphore communicate? The kind that awaken other tasks, which turns out to be extremely useful.

Next page >

In the previous section we showed how a counting semaphore can beused to protect multiple resources. Now, let’s consider how a semaphorecan be used in another interesting way by simply reversing how we useits inherent properties of incrementing and decrementing a counter.Instead of initializing the semaphore with a positive, non-zero value,consider what happens if you initialize it with zero (0).

Bear in mind that the give() operation increments the counter andselects a waiting task to be resumed, and that the take() operationdecrements the counter and, if it’s negative, suspends the active task.By using these two operations in the reverse manner from the many resources solution described earlier, we find ourselves with a perfect method of notifying suspended tasks to start working at our command. This is ideal for reacting to triggered events, especially from within ISRs!

Instead of initializing a semaphore to a positive value, initializeit to zero. Any code that uses the take() operation will cause thesemaphore’s counter to go negative, which in turn causes the code toimmediately suspend itself. It won’t resume execution until anotherprocess give()s back the semaphore, which will cause the counter toreturn to zero and signal the scheduler to awaken the first process!This is a perfect technique for ping-ponging execution between two ormore tasks.

It’s even better than that, though, because the process that give()sthe semaphore never blocks. It can continue running as much as it likes,continually issuing signals to the receiving process as events happen.Using take(), a worker thread continually waits on the semaphore as asignal to do something. This is a classic producer-consumer situation! As one tasks produces in response to events, the consumertask awakens to perform work. The producer task could even be an ISRthat responds to an interrupt and places data in a queue before exiting.The consumer task would be blocked 99% of the time until data arrives,at which point it would gracefully awaken to process the queue, beforereturning to a suspended state. This is very CPU-efficient!

How a Semaphore Works
As with mutexes, eachsemaphore maintains a wait list to keep track of processes that areblocked on it in a suspended state (i.e., those that are waitingpatiently for their take() calls to return).

As a program consumes resources via the take() operation, thesemaphore counter is decremented until it reaches zero (0), meaning thatthere are no more resources to share. At that point, further calls totake() the semaphore will block, causing the active task to immediatelysuspend and add itself to the semaphore’s wait list. The process willremain suspended until some other task uses the give() operation toindicate that a resource is available once again (see below). The waitlist is usually a FIFO (first in, first out), but ifthe tasks have different priorities, the queue could be sorted by taskpriority or something else.

By contrast, the give() operation is used to indicate that a resourcehas become newly available, so the semaphore counter is incremented. Ifthe wait list contains at least one blocked task, it is removed fromthe wait list and marked by the scheduler to be eligible to resumeexecution, eventually causing that task’s take() call to return controlto the code.

By initializing a semaphore’s counter to a positive, non-zero valuethat represents the number of resources being protected, a semaphore canbe used to safely track the number of available resources. Tasks canuse the semaphore take() operation to decrease the number of remainingresources available, and, if none are available, automatically waituntil a new one becomes ready. As other actively running tasks finishtheir work with the shared resources, one-by-one their calls to thesemaphore give() operation cause suspended tasks to automatically resumeexecution. This is a solution to the classic many resources problem.

Producer-consumer problems are another class of problemsthat semaphores are widely used to address. Initializing a semaphore tozero allows take() to be used in a consumer thread that efficientlywaits on a producer thread that uses give() to signal events. This isalso the foundation of the immensely useful bounded queue software pattern.

As you can see, the initialization value of a counting semaphorecounter is critically important. It’s the key to coordinating all ofthis activity and, as such, is fundamental to a whole range of problemsin computer programming. Learning how to correctly use it adds importantcapabilities to your programming toolbox.

Mutual Exclusion Semaphore (aka “Binary” Semaphore)
Considerwhat happens when the counter in a counting semaphore is restricted tothe value 0 and 1. Obviously, because there are only two possible valuesin the range, you can see why it’s called a binary semaphore. But does abinary semaphore differ functionally, not just semantically, from amutex? The answer is yes.

Recall that a mutex has the property of being owned by a process,whereas a semaphore does not. A mutex lock can only be released by theprocess that locked it.

However, any process can unlock a binary semaphore acting as a mutex.This can lead to bugs that are difficult to detect if all of thelock/unlock sequences aren’t precisely handled. If a lock/unlockmismatch occurs with a mutex, the offending task will encounter an errorcode. However, because it’s legal for any task to “unlock” a binarysemaphore, such mismatches may not be detected.

In code, mutex operations always take the form of lock, criticalsection, unlock. Only one task at a time can execute the code in thecritical section. All other tasks will block if any other task isalready executing it. That’s completely different from a semaphore,which has the ability to awaken other tasks, and which can be signaled by any process that is aware of it.

A binary semaphore does not protect a resource from being accessed.The give() operation is totally decoupled from the take() operation, andcode that uses give() will not block. However, the take() operation is dependent on give(). In the case of a binary semaphore, that dependenceis particularly strong, because any task that uses it could be blockedif the semaphore’s value is zero. Similarly, any process can give() asemaphore and thus use it to signal whatever process is waiting on it,because there’s no owner. These properties make semaphores well-suitedto synchronization problems like producer-consumer.

System-Specific Implementation Concerns
Inpractice, mutex implementations are typically more lightweight thansemaphore implementations. At the lowest level, both are built on someform of test and set mechanism. Ideally, this useshardware features to atomically examine and assign a memory location in asingle CPU instruction. Typically, mutexes are used to synchronizeaccess to a resource for very short periods of time. Semaphores, aswe’ve shown, are usually more involved with the host RTOS because theycan be used to signal other processes.

On many systems, mutexes are equipped to deal with multithreading issues such as priority inversion , in which race conditions can permit a lower-priority task holding a resource lock to perpetuallyblock a higher-priority task. In such cases, the lower-priority tasknever gets a chance to execute, so the resource that it holds is neverreleased, which in turn causes the higher priority task to wait foreveron the resource to be released. It’s a no-win situation. To prevent suchthings from happening, the RTOS can temporarily raise the priority ofthe lower-priority task, so that it has a chance to run and release thecontested resource. However, on some systems, the semaphoreimplementation may or may not include such features. It’s important toread your system’s documentation to know about these issues, but thebest solution is to avoid the problem entirely by using good designhabits.

Another issue to consider is scope. On some systems, such as embeddedLinux, the scope of a mutex is contained within the process space,whereas semaphores can be used across process address spaces for interprocess synchronization .Most RTOS-equipped embedded systems use a single address space for allprocesses, though. In such systems, the distinction is not important.

Some systems may perform automatic clean-up in certain settings, too,such as releasing all mutexes when a task or process ends. However,best practice is to not rely on such features and to write code thatcleans up anything it allocates.

Conclusion
In this two-part series, we’ve coveredthe mutex and semaphore synchronization primitives, along with thebasic operation and theory of each.

We’ve seen that mutexes are used to restrict access to regions ofcode called critical sections and follow, operationally-speaking, theprocedure of lock, perform critical section, unlock. They are oftenefficiently implemented, and we’ve learned that they are owned by theprocesses that create them.

We’ve also seen that semaphores feature an integer counter and comein two major varieties, the binary semaphore where N=1, and the countingsemaphore where is N>1. Semaphores are not owned by any singleprocess and are used to address two major problem classes: (1)coordinating access to multiple shared resources, and (2) signalingevents between processes.

Semaphores support two major operations, take and give, which use thesemaphore’s counter to coordinate access to a defined set of multipleresources, as in the stack of phone books example. Such resources canalso include other tasks, such as a pool of worker threads. In addition,semaphores can be used to signal events to one or more tasks.

We learned that both mutexes and semaphores use wait lists to keeptrack of processes that are suspended in the RTOS. These lists are usedto awaken thread(s) that have invoked the mutex lock() operation or thesemaphore take() operation, and are pending use of the unlock() orgive() operations, respectively.

Because of their unique ability to awaken other suspended tasks,semaphores are useful in a variety of producer-consumer problems, whereproducer threads can signal events to worker threads. Both mutexes andsemaphores are tightly coupled with the host RTOS to efficiently suspendand resume processes with the CPU, so that busy-loops and unnecessarywait loops can be eliminated.

We saw that binary semaphores are similar to mutexes, but that thereare key differences. Notable among them being that any process can“unlock” a binary semaphore, not just the process that “locked” it.

Finally, I hope that you have learned how useful mutexes andsemaphores can be to your programs, both in firmware and in software,and on nearly any platform or RTOS. If you’ve not yet made the jump tousing a quality RTOS in your embedded applications, I encourage you toconsider how these tools can dramatically impact the productivity andreliability of your designs.

David Cecil holds degrees in electrical engineering and computer engineering from Georgia Tech, and is Director of Engineering at USA Firmwarein Cleveland, OH. He is an advocate for using C++, software patternsand other object-oriented techniques in embedded systems, and can bereached by email at David.Cecil@outlook.com.

1 thought on “Synchronization internals — the semaphore

  1. “Hi. Thanks for these articles. I would like to point out that the increment/decrement operations attributed to the semaphore give and take functions in page 1 of part 1 (take=increment, give=decrement) do not match what is stated in part 2 (take=decremen

    Log in to Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.