Multicore’s “fourth semaphore”: multiple reader – writer lock”
Users of real-time operating systems (RTOS) are familiar with the various kinds of semaphores they offer. Depending on which RTOS you choose, these may include counting semaphores, binary semaphores and mutex semaphores.
As the embedded software world transitions into the era of multi-core processing, these 3 “traditional” kinds of semaphores are being joined in a number of multi-core oriented operating systems by a fourth kind of semaphore: the multiple reader – writer lock. In this article, we will see how these new semaphores work, and which multi-core application problems are best solved using this special kind of semaphore.
The multiple readers-writers problem
A software design problem that is encountered more and more often when using multi-core processors, is the so-called “Multiple Readers – Writers” problem. This is a situation in which a large data area must be shared among multiple tasks. Clearly, when one task is writing into the data area, no other task should be writing into it at the same time.
And no other task should be reading from it at that time. This discipline could also be enforced by a “traditional” semaphore. But in the “Multiple Readers – Writers” problem, we would like to allow many tasks to read from the data area at the same time – at those times when no task is writing into it.
Traditional semaphores do not offer this flexibility, as they instead limit access to the data area to only one single reader task when there is no task writing. Thus, traditional semaphores are a less-than-optimal solution for the multiple readers – writer problem.
An application example is the management of an airline reservation seat map data base. Many different air travel customers may be interested in certain seats on a certain flight, and may ask to see the seat map of nearby seats at the same time. For instance, a customer in San Francisco, and another in Berlin, and another in Tel-Aviv may all at once inquire about the seats near seat 18D in Figure 1 below.
Figure 1: Airline reservation seat map example of “Multiple Readers – Writers” problem.
When all of those customers ask about the seat map, we would like to provide them with this information as quickly as possible – all at the same time, rather than one-after-another.
But if one of those customers actually decides to reserve a seat or two, we want to allow only that one customer to select that seat or two. We want to prevent different customers from attempting to reserve the same seat(s) at the same time. And we want to prevent reading of the seat map while the seat reservations are in the midst of being updated.
A “Multiple Reader – Writer” lock can do this with optimal efficiency, by allowing multiple simultaneous readers when there is no active writer, and no simultaneous access at all during active writing.
This “Multiple Reader – Writer” lock is our fourth kind of semaphore. It is becoming available in a number of multi-core oriented embedded operating systems. In particular, it can be found in Symmetric Multi-Processing (“SMP”) operating systems with origins either in the world of traditional single-CPU real-time operating systems (RTOS) or origins in the world of SMP Linux.
Using these operating systems, tasks (sometimes called ‘threads’ or ‘processes’) may dynamically move from core to core. These tasks may at the same time be interested in reading from shared data tables or data bases. Or a number of tasks may have a Producer-Consumer relationship for some shared information, with multiple consumers often interested in receiving information at the same time.
On a multi-core chip, different tasks may be running on different cores in true parallelism (beyond the pseudo-parallelism of multitasking within a single CPU). So the ability to provide them with the data they need in true parallel fashion can speed up performance -- when compared with the sequentialization of data reading that would be provided by a “traditional” semaphore.
Three kinds of “traditional” semaphores
Single-CPU oriented RTOSs typically offer the following 3 kinds of “traditional” semaphores: counting semaphores, binary semaphores and mutex semaphores. Each of them is targeted for a different purpose in our multitasking software designs.
The most sophisticated of the three is the counting semaphore. It can be thought of like a plastic cup containing a number of identical ‘tokens’. Tasks can put individual tokens into the cup, or take individual tokens out. Counting semaphores have often been used for regulating the access of tasks to multiple equivalent serially-shareable resources.
For instance, 17 tasks may wish to share 4 identical printers. In this case, a counting semaphore can be created and initialized with 4 tokens, as illustrated below. Tasks are then programmed to take a token before printing, and return the token after printing is done.
But such tasks run into a problem: After one or more tokens has been taken, the task will not be able to determine which printer(s) are still available. Thus, counting semaphores are only a very partial solution to the sharing of multiple equivalent resources. [1.]
A better way to use counting semaphores is for signaling from one task to another. For example, if 2 tasks of different priorities need to execute the same total number of times over the long run: A counting semaphore can be created with an initial count of zero (no ‘tokens’ in it).
Every time the higher priority task runs, it creates a token and puts it into the semaphore, thus incrementing the semaphore’s count. The lower priority task of the pair waits at the semaphore for tokens to appear, and runs once for each new token, thus consuming the token and decrementing the semaphore’s count. If the high priority task runs with moderate bursts, the lower priority task will eventually ‘catch up’ to the same total number of executions.
Classic binary semaphores can be thought of simply as counting semaphores that are limited to a maximum count of one (1 single token). They can also be used for signaling from task to task, in situations where signals (counts, tokens) will not accumulate or need not be counted.
Classic binary semaphores should be avoided for purposes of mutual exclusion. For example, they should not be used to regulate sharing of a single printer, or a shared data structure among multiple tasks. The reason is a famous (or should I say ‘infamous’) bug called an “unbounded priority inversion” that is intrinsic in classic binary semaphores. [2.]
Mutex semaphores can be thought of as binary semaphores that have had the “unbounded priority inversion” bug repaired. Hence, a mutex is the mechanism of choice for purposes of mutual exclusion. It’s the way to regulate sharing of a printer, or a shared data structure among multiple tasks.
Most mutexes solve the “unbounded priority inversion” bug by manipulating the priority of a task as it holds the mutex, using techniques such as ‘priority inheritance’ protocol or ‘priority ceiling’ protocol.
The Fourth Semaphore
All three kinds of “traditional” semaphores treat all tasks the same way, when they are used to regulate access to a single shared resource. They do not distinguish between tasks that wish to read from a shared resource, as against tasks that wish to write to the shared resource.
Only the fourth kind of semaphore, the “Multiple Reader – Writer” Lock, differentiates between reader tasks and writer tasks. And by doing so, it allows simultaneous parallel access to the shared resource by multiple reader tasks only.
Multiple Reader – Writer Locks are inherently complex. Inside of the operating systems that support them, each lock has associated with it a pair of task-waiting queues, as shown in Figure 2 below.
Figure 2: “Multiple Reader – Writer” Lock and associated Task Queues.
One task queue, shown to the right in Figure 2, is a queue in which writer tasks can be held by the operating system, to wait until reader tasks have finished reading.
A second task queue, shown to the left in Figure 2, is a queue in which reader tasks can be held by the operating system, to wait until a writer task (or perhaps several) have finished writing.
The management and control of these tasks queues is complex, often involving additional internal semaphores within the operating system logic itself. This complexity can sometimes impact performance, so these Locks should be used with care.
Multiple Reader – Writer Locks typically provide higher performance than other semaphores or locks when there are lots of concurrent reader tasks, but writing occurs infrequently. This is quite common in multi-core software designs.
Multiple Reader – Writer Locks typically do not provide a significant performance benefit when there is a high rate of writing by writer tasks. This is because writer tasks must run one-at-a-time, and only when no reader tasks are actively reading.
In this situation, “traditional” semaphores may well provide higher performance. In a single-CPU environment, an alternative to consider may be a mutex. In a multi-core environment, am alternative to consider might be a “spinlock”.
A generic fourth semaphore API
There is no standardization of the application programmer’s interface (API) to Multiple Reader – Writer Locks as implemented in the various operating systems that support them. So I’d like to show a “generic” API that illustrates the typical services offered, but is not identical to the API of any actual operating system:
* RWLockCreate() Create a New Multiple Reader – Writer Lock.
* RWLockDelete() Destroy an existing Multiple Reader – Writer Lock.
* RWLockRead() Lock the Lock for Reading. [Shareable.]
* RWLockEndRead() Un-lock, after the end of Reading.
* RWLockWrite() Lock the Lock for Writing. [Non-Shareable.]
* RWLockEndWrite() Un-lock, after the end of Writing.
Once such a lock has been created to protect a shareable data resource, tasks that wish to read from it must first call RWLockRead() . Later, those tasks must call RWLockEndRead() when they are finished reading. More than 1 task will be permitted to read at the same time.
Tasks that wish to write must first call RWLockWrite() . Later, those tasks must call RWLockEndWrite() when they are finished writing. Only 1 task at a time will be permitted to write, and this will only be when no task has permission to read.
Variations Of multiple reader – writer locks
Not all Multiple Reader – Writer Locks behave in precisely the same way. Different operating systems that offer them, implement them with differing internal logic that can express different application software design preferences. We will take a quick look at 3 possible differing approaches:
1. “Readers Preference” Lock. This is a Multiple Reader – Writer Lock that tries to maximize potential concurrency by giving clear preference to reader tasks. Whenever some tasks are already reading, any new readers will be allowed to join them.
A new reader task will only be held back (in the Readers Task Queue on the left side of Figure 2 above and Figure 3 below), if a writer task is already holding permission to write.
Figure 3: “Multiple Reader – Writer” Lock with Readers Preference.
Writer tasks will be held back (in the Writers Task Queue on the right side of Figures 2 and 3), so long as any reader task is still reading. Thus, writer tasks might be delayed for a long time. And in an extreme case, writer tasks might suffer starvation in the Writers Task Queue.
“Writers Preference” Lock. This is a Multiple Reader – Writer Lock that strives to allow writer tasks to update the protected data as soon as possible. But it still allows reader tasks, including possibly multiple parallel readers, to finish the reading they’ve already started, before a writer task is given access to the data.
Figure 4:: “Multiple Reader – Writer” Lock with Writers Preference.
New reader tasks will be held back (in the Readers Task Queue on the left side of Figures 2 and 4), as soon as a writer task has asked for permission to write. Thus, new reader tasks might be delayed for a long time.
In an extreme case, reader tasks might suffer starvation. This extreme case will be exceedingly rare if, as is typical when using Multiple Reader – Writer Locks, reading is quite frequent but writing is quite infrequent.
2. “Fair” Lock. This is a Multiple Reader – Writer Lock that will not allow either a reader task or a writer task to starve. It strives to be ‘fair’ to both readers and writers, but in doing so it might force some reader tasks or writer tasks to wait for longer times than the absolute minimum possible waiting time.
One way to achieve ‘fairness’ is for the Lock to service tasks on a “First Come, First Served” basis within the constraints of a Multiple Reader – Writer Lock. In other words, new reader tasks will be held back if there’s a writer task already waiting. And new writer tasks will be held back if there’s a reader task already waiting.
Of these 3 variants of Multiple Reader – Writer locks, the “Writers Preference” variant is probably the most common implementation in operating systems. It works well when writer tasks lock a Lock infrequently, and only for short periods of time.
Developers of multi-tasking software with an RTOS are familiar with 3 kinds of traditional semaphores: counting semaphores, binary semaphores and mutex semaphores. As embedded software enters the era of multi-core processing, these 3 are being joined in a number of multi-core oriented operating systems by a fourth kind of semaphore: the multiple reader – writer lock.
We have seen how these new semaphore locks ensure mutual exclusion when data writer tasks are involved, while allowing multiple reader tasks to access data at the same time.
These new semaphores often offer better performance than traditional kinds of semaphores when multiple tasks desire truly parallel read-access to shareable data. Such performance acceleration is of benefit in many producer-consumer applications, data base interactions, and multi-core software designs.
[1.] Counting semaphores. “Mutexes and Semaphores Demystified” by Michael Barr.
[2.] Unbounded priority inversions. “Mutexes Battle Priority Inversions” by David Kalinsky.
David Kalinsky of Kalinsky Associates is a teacher of intensive short courses on embedded systems and software development for professional engineers. One of his popular courses is “Design of Distributed and Multicore Systems & Software”. His courses are presented regularly in ‘open class’ format at technical training providers in international locations such as Munich, Singapore, Stockholm and Tel-Aviv, as well as in his ‘home market’ of the USA. http://www.kalinskyassociates.com/TrainingCourses.html