Threading and Parallel Programming Constructs used in multicore systems development: Part 1 - Embedded.com

Threading and Parallel Programming Constructs used in multicore systems development: Part 1

The theory and practice of the principal parallel programmingconstructs that focus on threading is important in multicore softwaredevelopment and begins with the fundamental concepts ofsynchronization, critical section, and deadlock.

In this series of three articles we will deal with implementationdetails and related issues. In this first part we will intoducesynchronization and critical sections. In Part 2, we will deal withsynchronization primitives including semaphores, locks and conditionvariables. In Part3 we discuss messages ,flow control -based conceptssuch as fence and barrier and implementation-dependent threadingfeatures.

Synchronization
Synchronization is an enforcing mechanism used to impose constraints onthe order of execution of threads. The synchronization controls therelative order of thread execution and resolves any conflict amongthreads that might produce unwanted behavior. In simple terms,synchronization is used to coordinate thread execution and manageshared data.

In an environment where messages are used for communicating betweena sender and a receiver, synchronization is implicit, as a message mustbe sent before the message can be received.

On the other hand, for a shared-memory based environment, threadshave no implicit interdependency unless some constraints are imposed.Two types of synchronization operations are widely used: mutualexclusion and condition synchronization.

In the case of mutual exclusion ,one thread blocks a criticalsection – a section of code that contains shared data – and one or morethreads wait to get their turn to enter into the section.

This helps when two or more threads share the same memory space andrun simultaneously. The mutual exclusion is controlled by a schedulerand depends on the granularity of the scheduler.

Condition synchronization, on the other hand, blocks a thread untilthe system state specifies some specific conditions. The conditionsynchronization allows a thread to wait until a specific condition isreached. Figure 4.1 below shows the generic representation of synchronization.

Figure4.1. Generic Representation of Synchronization Block inside Source Code

While a number of techniques are available for synchronization, onlya few methods are used by developers on a regular basis. The techniquesused are also to some extent determined by the programming environment.

The scope of synchronization is broad. Proper synchronization ordersthe updates to data and provides an expected outcome. As shown in Figure 4.2 below, shared data d can get access by threads Ti and Tj at time ti , tj , tk , tl , where

and a proper synchronization maintains the order to update d atthese instances and considers the state of d as a synchronization function oftime.

This synchronization function, s ,represents the behavior of a synchronized construct with respect to theexecution time of a thread.

Figure4.2. Shared Data Synchronization, Where Data d Is Protected by aSynchronization Operation

Figure 4.3 below representshow synchronization operations are performed in an actualmulti-threaded implementation in a generic form, and demonstrates theflow of threads.

When m >=1 , the creationtiming for initial threads T 1 T m might not be the same. After block Bi as well as Bj , the number of threadscould be different, which means m is not necessarily equal to n and nis not necessarily equal to p. For all operational environments, thevalues of m, n, and p are at least 1.

Figure4.3. Operational Flow of Threads for an Application

Critical Sections
A section of a code block called a critical section is where shareddependency variables reside and those shared variables have dependencyamong multiple threads. Different synchronization primitives are usedto keep critical sections safe.

With the use of proper synchronization techniques, only one threadis allowed access to a critical section at any one instance. The majorchallenge of threaded programming is to implement critical sections insuch a way that multiple threads perform mutually exclusive operationsfor critical sections and do not use critical sections simultaneously.

Critical sections can also be referred to as synchronization blocks.Depending upon the way critical sections are being used, the size of acritical section is important. Minimize the size of critical sectionswhen practical.

Larger critical sections-based code blocks should split intomultiple code blocks. This is especially important in code that islikely to experience significant thread contention. Each criticalsection has an entry and an exit point. A critical section can berepresented as shown in Figure 4.4below .

Figure4.4. Implement Critical Section in Source Code

Deadlock
Deadlock occurs whenever a thread is blocked waiting on a resource ofanother thread that will never become available. According to thecircumstances, different deadlocks can occur: self-deadlock, recursivedeadlock, and lock-ordering deadlock. In most instances, deadlock meanslock-ordering deadlock.

The self-deadlock is the instance or condition when a thread, Ti , wants to acquire alock that is already owned by thread Ti .

In Figure 4.5 (a) below, attime ta thread Ti owns lock li , where li is going to getreleased at tc. However, there is a call at tb from Ti , which requires li . The release time of li is td , where td can be either beforeor after tc . In thisscenario, thread Ti isin self-deadlock condition at tb .

When the wakeup path of thread Ti ,resides in another thread, Tj ,that condition is referred to as recursive deadlock, as shown in Figure 4.5 (b) below . Figure 4.5 (c) below illustrates alock-ordering thread, where thread Ti locks resource rj and waits for resource ri ,which is being locked by thread Tj .

Also, thread Tj locks resource ri and waitsfor resource rj ,which is being locked by thread Ti .Here, both threads Ti and Tj are indeadlock at td , and w is the wait-function for a lock.

Figure4.5. Deadlock Scenarios

To represent the transition model of deadlock of an environment,consider representing atomic states by si and each thread of the systemby Ti . Each threadcan transition from one state to another by requesting a resource,acquiring a resource, or freeing the current resource.

So, the transition can be represented as shown in Figure 4.6 below , where,

Figure4.6. Deadlock Scenario in a State Transition for a Thread

For any thread Ti ,if the state transition of Ti becomes sd for allpossible scenarios and remains blocked at sd , thread Ti would not have any wayto transition from sd to any other state. That is why state sd is called the deadlock state for thread Ti .

Avoiding deadlock is one of the challenges of multi-threadedprogramming. There must not be any possibility of deadlock in anapplication. A lock-holding prevention mechanism or the creation oflock hierarchy can remove a deadlock scenario. One recommendation is touse only the appropriate number of lockswhen implementing synchronization. .

Next in Part 2: SynchronizationPrimitives

Thisseries of three articles was excerpted from Multi-CoreProgramming byShameem Akhter and Jason Roberts, published by Intel Press. Copyright© 2006 Intel Corporation. All rights reserved. This book can bepurchased on line.

Shameem Akhter is a platformarchitect at Intel Corp., focusing on singlesocket multicore architecture and performance analysis. Jason Robertsis a senior software engineer at Intel, working on a number ofdifferent multithreaded software products ranging from desktop,handheld and embedded DSP platforms.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.