Threading and Parallel Programming Constructs used in multicore systems development: Part 1
By Shameem Akhter and Jason Roberts, Intel Corp.
The theory and practice of the principal parallel programming
constructs that focus on threading is important in multicore software
development and begins with the fundamental concepts of
synchronization, critical section, and deadlock.
In this series of three articles we will deal with implementation
details and related issues. In this first part we will intoduce
synchronization and critical sections. In Part 2, we will deal with
synchronization primitives
including semaphores, locks and condition
variables. In Part3 we discuss messages,
flow control-based concepts
such as fence and barrier and implementation-dependent threading
features.
Synchronization
Synchronization is an enforcing mechanism used to impose constraints on
the order of execution of threads. The synchronization controls the
relative order of thread execution and resolves any conflict among
threads that might produce unwanted behavior. In simple terms,
synchronization is used to coordinate thread execution and manage
shared data.
In an environment where messages are used for communicating between
a sender and a receiver, synchronization is implicit, as a message must
be sent before the message can be received.
On the other hand, for a shared-memory based environment, threads
have no implicit interdependency unless some constraints are imposed.
Two types of synchronization operations are widely used: mutual
exclusion and condition synchronization.
In the case of mutual exclusion,
one thread blocks a critical
section - a section of code that contains shared data - and one or more
threads wait to get their turn to enter into the section.
This helps when two or more threads share the same memory space and
run simultaneously. The mutual exclusion is controlled by a scheduler
and depends on the granularity of the scheduler.
Condition synchronization, on the other hand, blocks a thread until
the system state specifies some specific conditions. The condition
synchronization allows a thread to wait until a specific condition is
reached. Figure 4.1 below
shows the generic representation of synchronization.
 |
| Figure
4.1. Generic Representation of Synchronization Block inside Source Code |
While a number of techniques are available for synchronization, only
a few methods are used by developers on a regular basis. The techniques
used are also to some extent determined by the programming environment.
The scope of synchronization is broad. Proper synchronization orders
the updates to data and provides an expected outcome. As shown in Figure 4.2 below, shared data d can get access by threads Ti and Tj at time ti, tj, tk, tl, where

and a proper synchronization maintains the order to update d at
these instances and considers the state of d as a synchronization function of
time.
This synchronization function, s,
represents the behavior of a synchronized construct with respect to the
execution time of a thread.
 |
| Figure
4.2. Shared Data Synchronization, Where Data d Is Protected by a
Synchronization Operation |
Figure 4.3 below represents
how synchronization operations are performed in an actual
multi-threaded implementation in a generic form, and demonstrates the
flow of threads.
When m >=1, the creation
timing for initial threads T1Tm might not be the same. After block Bi as well as Bj, the number of threads
could be different, which means m is not necessarily equal to n and n
is not necessarily equal to p. For all operational environments, the
values of m, n, and p are at least 1.
 |
| Figure
4.3. Operational Flow of Threads for an Application |
Critical Sections
A section of a code block called a critical section is where shared
dependency variables reside and those shared variables have dependency
among multiple threads. Different synchronization primitives are used
to keep critical sections safe.
With the use of proper synchronization techniques, only one thread
is allowed access to a critical section at any one instance. The major
challenge of threaded programming is to implement critical sections in
such a way that multiple threads perform mutually exclusive operations
for critical sections and do not use critical sections simultaneously.
Critical sections can also be referred to as synchronization blocks.
Depending upon the way critical sections are being used, the size of a
critical section is important. Minimize the size of critical sections
when practical.
Larger critical sections-based code blocks should split into
multiple code blocks. This is especially important in code that is
likely to experience significant thread contention. Each critical
section has an entry and an exit point. A critical section can be
represented as shown in Figure 4.4
below.
 |
| Figure
4.4. Implement Critical Section in Source Code |
Deadlock
Deadlock occurs whenever a thread is blocked waiting on a resource of
another thread that will never become available. According to the
circumstances, different deadlocks can occur: self-deadlock, recursive
deadlock, and lock-ordering deadlock. In most instances, deadlock means
lock-ordering deadlock.
The self-deadlock is the instance or condition when a thread, Ti, wants to acquire a
lock that is already owned by thread Ti.
In Figure 4.5 (a) below, at
time ta thread Ti
owns lock li, where li is going to get
released at tc. However, there is a call at tb from Ti, which requires li. The release time of li is td, where td can be either before
or after tc. In this
scenario, thread Ti is
in self-deadlock condition at tb.
When the wakeup path of thread Ti,
resides in another thread, Tj,
that condition is referred to as recursive deadlock, as shown in Figure 4.5 (b) below. Figure 4.5 (c) below illustrates a
lock-ordering thread, where thread Ti
locks resource rj
and waits for resource ri,
which is being locked by thread Tj.
Also, thread Tj
locks resource ri and waits
for resource rj,
which is being locked by thread Ti.
Here, both threads Ti
and Tj are in
deadlock at td, and w is the wait-function for a lock.
 |
| Figure
4.5. Deadlock Scenarios |
To represent the transition model of deadlock of an environment,
consider representing atomic states by si and each thread of the system
by Ti. Each thread
can transition from one state to another by requesting a resource,
acquiring a resource, or freeing the current resource.
So, the transition can be represented as shown in Figure 4.6 below, where,

 |
| Figure
4.6. Deadlock Scenario in a State Transition for a Thread |
For any thread Ti,
if the state transition of Ti
becomes sd for all
possible scenarios and remains blocked at sd, thread Ti would not have any way
to transition from sd
to any other state. That is why state sd
is called the deadlock state for thread Ti.
Avoiding deadlock is one of the challenges of multi-threaded
programming. There must not be any possibility of deadlock in an
application. A lock-holding prevention mechanism or the creation of
lock hierarchy can remove a deadlock scenario. One recommendation is to
use only the appropriate number of locks
when implementing synchronization. .
Next in Part 2: Synchronization
Primitives
This
series of three articles was excerpted from Multi-Core
Programming by
Shameem Akhter and Jason Roberts, published by Intel Press. Copyright
© 2006 Intel Corporation. All rights reserved. This book can be
purchased on line.
Shameem Akhter is a platform
architect at Intel Corp., focusing on single
socket multicore architecture and performance analysis. Jason Roberts
is a senior software engineer at Intel, working on a number of
different multithreaded software products ranging from desktop,
handheld and embedded DSP platforms.