By Shameem Akhter and Jason Roberts, Intel Corp.
Proper use of lock to avoid race conditions can invite performance
problems if the lock becomes highly contended. The lock becomes like a
tollgate on a highway. If cars arrive at the tollgate faster than the
toll taker can process them, the cars will queue up in a traffic jam
behind the tollgate.
Similarly, if threads try to acquire a lock faster than the rate at
which a thread can execute the corresponding critical section, then
program performance will suffer as threads will form a "convoy" waiting
to acquire the lock. Indeed, this behavior is sometimes referred to as
convoying.
As mentioned in the discussion of time-slicing woes, convoying
becomes even worse for fair locks, because if a thread falls asleep,
all threads behind it have to wait for it to wake up. Imagine that
software threads are cars and hardware threads are the drivers in those
cars. This might seem like a backwards analogy, but from a car's
perspective, people exist solely to move cars between parking places.
If the cars form a convoy, and a driver leaves his or her car, everyone
else behind is stuck.
Priority Inversion
Some threading implementations allow threads to have priorities. When
there are not enough hardware threads to run all software threads, the
higher priority software threads get preference.
For example, foreground tasks might be running with higher
priorities than background tasks. Priorities can be useful, but
paradoxically, can lead to situations where a low-priority thread
blocks a high-priority thread from running.
Figure 7.7 below illustrates
priority inversion. Continuing our analogy with software threads as
cars and hardware threads as drivers, three cars are shown, but there
is only a single driver. A low-priority car has acquired a lock so it
can cross a single-lane "critical section" bridge.
Behind it waits a high-priority car. But because the high-priority
car is blocked, the driver is attending the highest-priority runnable
car, which is the medium-priority one. As contrived as this sounds, it
actually happened on the NASA Mars Pathfinder mission.
 |
| Figure
7.7. Priority Inversion Scenario, Where High Priority Gets
Blocked and Medium Priority Gets the Cycles |
In real life, the problem in Figure
7.7 would be solved by bumping up the priority of the blocking
car until it is out of the way. With locks, this is called priority
inheritance.
When a high-priority thread needs to acquire a lock held
by a low-priority thread, the scheduler bumps up the priority of the
blocking thread until the lock is released. Indeed, the Mars Pathfinder
problem was solved by turning on priority inheritance.
An alternative is priority ceilings in which a priority, called the
ceiling, is assigned to the mutex. The ceiling is the highest priority
of any thread that is expected to hold the mutex.
When a thread
acquires the mutex, its priority is immediately bumped up to the
ceiling value for the duration that it holds the mutex. The priority
ceilings scheme is eager to bump up a thread's priority. In contrast,
the priority inheritance scheme is lazy by not bumping up a thread's
priority unless necessary.
Windows mutexes support priority inheritance by default. Pthreads
mutexes support neither the priority inheritance nor priority ceiling
protocols. Both protocols are optional in the pthreads standard. If
they exist in a particular implementation, they can be set for a mutex
via the function pthread_mutexattr_setprotocol
and inspected with the function pthread_mutexattr_getprotocol.
Programmers "rolling their own" locks or busy waits may encounter
priority inversion if threads with different priorities are allowed to
acquire the same lock. Hand-coded spin locks are a common example. If
neither priority inheritance nor priority ceilings can be built into
the lock or busy wait, then it is probably best to restrict the lock's
contenders to threads with the same priority.