Threading through the parallel code optimization maze: Part 1
Accomplishing tasks in parallel comes in many guises. At the highest level, multiple computers can divide a task and work on it collectively. An example of this type of parallelism is the SETI@home [1] project, where thousands of computers work collectively to analyze radio signals for patterns indicative of intelligence. At the lowest level, modern microprocessors execute instructions of their machine language in parallel.
For example, the Intel Pentium processor has a U execution pipe and a V execution pipe capable of executing different instructions at the same time. The motivation for enabling parallel execution at these different levels of computation is performance.
Furthermore, the use of one type of parallel execution does not negate the benefit of employing the others. Threads are yet another means of enabling parallelism, one that complements the types already mentioned: threading is of smaller granularity than the process-level concurrency of the SETI@home example, but of larger granularity than the instruction-level parallelism of the Pentium processor.
Threads offer the ability to have multiple sections of one task in flight at the same time. In technical-speak: threads offer the ability to concurrently execute multiple contexts of the same process.
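As a minimal illustration, assuming POSIX threads (the `worker` function and the thread count of four are arbitrary choices for this sketch), the following program starts several threads that execute concurrently within a single process, all sharing that process's address space:

```c
#include <pthread.h>
#include <stdio.h>

/* Per-thread work for this sketch: each thread simply announces its ID.
 * All of the threads run within, and share the memory of, one process. */
static void *worker(void *arg)
{
    int id = *(int *)arg;
    printf("thread %d running in the same process\n", id);
    return NULL;
}

int main(void)
{
    pthread_t threads[4];
    int ids[4];

    for (int i = 0; i < 4; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);   /* wait for each context to finish */
    return 0;
}
```

On a POSIX system this would typically be compiled with `cc -pthread`; the output ordering varies from run to run, which itself hints at the scheduling and coordination issues discussed next.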
A real-life analogy helps illustrate how threading relates to the other two types of parallelism. Consider the task of harvesting a crop from a farm, where each worker is one tractor. The farm is split into multiple fields, and each tractor works on only one field.
In this example, harvesting the crop is the task (i.e., the work that needs completion). If there is only one tractor per field, there is no need to explicitly coordinate the work of the tractors. Each tractor has its preassigned space to harvest and can complete the task independently of the other tractors.
Now, assume a second tractor is added to work one field. In this case, each tractor needs to be concerned with what the other is doing. There are basic issues of feasibility: how do the tractors keep from running into each other?
There are issues of optimizing performance: how is the work split between the tractors to harvest as quickly as possible? This real-life example is similar to threading. The threads are akin to multiple tractors tending one field, and similar issues exist. Threads must coordinate their work properly and efficiently or risk crashes and poor performance.
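To make the coordination concrete, here is a minimal sketch using POSIX threads (the names `tractor` and `harvested` are illustrative, not from the article): two threads update one shared counter, and a mutex plays the role of keeping the tractors from running into each other. Without the lock, the concurrent increments could race and lose updates.

```c
#include <pthread.h>
#include <stdio.h>

/* The shared "field": a single counter updated by both threads. */
static long harvested = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *tractor(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* coordinate: one tractor at a time */
        harvested++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, tractor, NULL);
    pthread_create(&t2, NULL, tractor, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("harvested = %ld\n", harvested);  /* 200000 with the lock in place */
    return 0;
}
```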
In the previous example, as long as there was only one tractor on each field, the tractors were safe. This is similar to the operating system (OS) concept of a process. Two processes do not share the memory that the OS assigns to them, so the chances of them interfering with each other are minimized. (Processes physically share the memory on the system, but the OS makes each process's assigned memory appear to be a disjoint region. Modern OSes will raise an exception if a process attempts to access memory outside of its assigned region.)
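This isolation can be seen in a small sketch, assuming POSIX `fork` (the variable name `crop` is hypothetical): the child process receives its own copy of the parent's memory, so modifying the variable in the child leaves the parent's copy untouched.

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int crop = 100;

    if (fork() == 0) {                         /* child: own copy of memory */
        crop = 0;                              /* changes only the child's copy */
        printf("child:  crop = %d\n", crop);   /* prints 0 */
        return 0;
    }
    wait(NULL);                                /* parent waits for the child */
    printf("parent: crop = %d\n", crop);       /* still prints 100 */
    return 0;
}
```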
In terms of scheduling the processes, the OS determines where and when each process will execute. Threads, by contrast, must be coordinated by software: either code implemented by the programmer or a library that abstracts some of this low-level work. In either case, designing for threads requires a disciplined process for best effect.
To take advantage of threading, the software development process detailed in this two-part series is recommended. The Threading Development Cycle (TDC) focuses on the specific needs and potential challenges introduced by threads. It comprises the following steps:
1. Analyze the application
2. Design and implement the threading
3. Debug the code
4. Performance tune the code
The first two parts of this series are specific to cases where the OS provides software threads and shared memory, although the steps detailed in the TDC can be applied in the more general case as well. Before taking a deeper look at the TDC, some basic concepts of parallel programming are explained.

