Common multicore programming problems: Part 1 - Threads, data races, deadlocks and livelocks

Shameem Akhter and Jason Roberts, Intel Corp.

October 26, 2007

Parallel programming has been around for decades, though before the advent of multicore processors it was an esoteric discipline. Numerous programmers have tripped over the common stumbling blocks by now; by recognizing these problems in advance, you can avoid them.

Furthermore, it is important to understand the common problems before designing a parallel program, because many of the problems arise from the overall decomposition of the program, and cannot be easily patched later. This series of articles looks at some of these common problems, their symptoms, and ways to circumvent them.

Too Many Threads
It may seem that if a little threading is good, then a lot must be better. In fact, having too many threads can seriously degrade program performance. The impact comes in two ways. First, partitioning a fixed amount of work among too many threads gives each thread too little work, so that the overhead of starting and terminating threads swamps the useful work. Second, having too many concurrent software threads incurs overhead from having to share fixed hardware resources.
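The first effect can be made concrete with a small sketch. The function below (an illustration, not the authors' example) partitions a fixed summing job among `nthreads` software threads; Python threads are real OS threads, though the interpreter lock serializes CPU work, so this illustrates the partitioning and thread start/join overhead rather than parallel speedup. As `nthreads` grows well past the hardware thread count, each thread's slice of work shrinks while the per-thread creation cost stays constant:

```python
import threading

def parallel_sum(data, nthreads):
    """Partition summing `data` across `nthreads` software threads.

    With nthreads far above the hardware thread count, each thread gets
    too little work, and the cost of starting and joining the threads
    swamps the useful work.
    """
    partial = [0] * nthreads
    chunk = -(-len(data) // nthreads)  # ceiling division

    def worker(t):
        # Each thread sums one contiguous chunk of the input.
        partial[t] = sum(data[t * chunk:(t + 1) * chunk])

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(nthreads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sum(partial)
```

The answer is the same for any thread count; only the overhead-to-work ratio changes, which is exactly the point.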

When there are more software threads than hardware threads, the operating system typically resorts to round-robin scheduling. The scheduler gives each software thread a short turn, called a time slice, to run on one of the hardware threads.

When a software thread's time slice runs out, the scheduler preemptively suspends the thread in order to run another software thread on the same hardware thread. The software thread freezes in time until it gets another time slice.

Time slicing ensures that all software threads make some progress. Otherwise, some software threads might hog all the hardware threads and starve other software threads. However, this equitable distribution of hardware threads incurs overhead. When there are too many software threads, the overhead can severely degrade performance. There are several kinds of overhead, and it helps to know the culprits so you can spot them when they appear.
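The progress guarantee is easy to observe. The sketch below (an assumption-laden illustration, not from the original text) deliberately oversubscribes the machine with four times as many software threads as hardware threads; the OS time-slices them, so every thread still completes, but only by paying context-switch overhead:

```python
import os
import threading

# Logical hardware threads, as reported by the OS (assumed available
# via os.cpu_count(); fall back to 1 if it cannot tell).
hw_threads = os.cpu_count() or 1

done = []
lock = threading.Lock()

def busy_work(tid):
    # A short burst of CPU work, then record completion.
    total = sum(range(10_000))
    with lock:
        done.append(tid)

# Oversubscribe: 4x more software threads than hardware threads.
# Round-robin time slicing ensures none of them starves.
threads = [threading.Thread(target=busy_work, args=(t,))
           for t in range(4 * hw_threads)]
for th in threads:
    th.start()
for th in threads:
    th.join()

assert len(done) == 4 * hw_threads  # every thread got time slices and finished
```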

Thread register state. The most obvious overhead is the process of saving and restoring a thread's register state. Suspending a software thread requires saving the register values of the hardware thread, so the values can be restored later, when the software thread resumes on its next time slice. Typically, thread schedulers allocate big enough time slices so that the save/restore overheads for registers are insignificant, so this obvious overhead is in fact not much of a concern.

Thread cache state. A more subtle overhead of time slicing is saving and restoring a thread's cache state. Modern processors rely heavily on cache memory, which can be about 10 to 100 times faster than main memory. Accesses that hit in cache are not only much faster; they also consume no bandwidth of the memory bus. Caches are fast, but finite.

When the cache is full, a processor must evict data from the cache to make room for new data. Typically, the choice for eviction is the least recently used data, which more often than not is data from an earlier time slice. Thus threads tend to evict each other's data. The net effect is that too many threads hurt performance by fighting each other for cache.
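The standard remedy for both the scheduling and cache-fighting overheads is to cap the number of software threads at roughly the hardware thread count. A minimal sketch using a thread pool (our illustration; the `pooled_sum` and `sum_chunk` names are hypothetical):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def sum_chunk(chunk):
    return sum(chunk)

def pooled_sum(data):
    # Cap workers at the hardware thread count so software threads are
    # not time-sliced against each other and do not evict each other's
    # cached data.
    nworkers = os.cpu_count() or 1
    chunk = -(-len(data) // nworkers)  # ceiling division
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        return sum(pool.map(sum_chunk, chunks))
```

A pool also amortizes thread creation across many tasks, addressing the start/terminate overhead described earlier.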
