By Rishiyur S. Nikhil (Bluespec) and Arvind (MIT)
Parallel Haskell (pH) is based on over two decades of research with
several versions of the functional language Id
[10][23] and experience
with their implementation on dataflow and conventional architectures
[11] [49] as well as
alternatives such as
Multithreaded Id
using P-RISC
Graphs and the
MIT Tagged-Token
Dataflow Architecture.
Some years ago, we recognized the semantic similarity between the
functional core of Id and the standard nonstrict functional language
Haskell. We, therefore, decided to adopt Haskell for the functional
core, extending it as necessary with I-structure and M-structure
constructs as shown in Figure 1.11
below.
The Haskell basis should make pH familiar to a
much larger audience and should make exploitation of parallelism an
easy step for programmers already familiar with Haskell.
 |
| Figure
1.11: Layers of pH. |
The Functional Layer
A large subset of pH is its functional core, whose syntax and nonstrict
semantics are exactly those of the pure functional language Haskell.
The major programming style in pH is functional, and because there is
no notion of state, it is trivial to write parallel programs without
races in the functional core. In fact, it is impossible to write a
program with races; functional programs are guaranteed to be
deterministic.
A central feature of the functional core is the higher-order
function. The key idea is that functions themselves can be treated as
values. Higher-order functions increase the parametric power of a
language and, consequently, allow us to write shorter, more abstract
programs.
For example, a function to sort a list can take, as one of its
arguments, the function used to decide the relative order of any pair
of elements in the list. Then, the same sorting function can be used to
sort a list of integers both in ascending and descending orders, simply
by passing in the "<" or ">" functions, respectively.
Or, the same sorting function can be used to sort a list of strings
in lexicographic order, by passing in the appropriate string comparison
function. The experienced functional programmer uses higher order
functions like oxygen - they are an essential part of one's toolkit.
We have already mentioned the notion of nonstrictness, another
central feature of pH. Functions can be executed before arguments are
available (unless restricted by data
dependencies) and can even return
results before the body has completed computing.
Similarly, any component of a data structure may be empty for an
indefinite period, allowing producers and consumers to compute
concurrently. We motivated this idea from the point of view of
producer-consumer parallelism, but this feature also fundamentally adds
expressive power to the language, allowing us to write certain
recursive definitions of data structures that would otherwise be
awkward to express.
Studies indicate that the practical advantages of nonstrictness
(producer -consumer parallelism,
definition of cyclic data structures)
may accrue even with strict functions, as long as data structures are
nonstrict. This could have significant performance advantages-not only
can strict function calls be implemented more efficiently, but they
also permit better compiler analysis of function bodies, resulting in
better code therein. This is still an open area of research and is a
possible future for pH.
pH has types and static type checking. Pervading the entire language
is the notion of types and static type checking, although this is
orthogonal to parallelism. Every phrase in pH (variable, expression,
function, and so on) must have a well-defined data type, and
this
property is checked statically (i.e.,
by the compiler) before running
the program.
Static types facilitate the construction of robust programs because
many simple programming errors are caught at compile time, and because
they give a convenient framework in which to organize the program.
Static types have sometimes earned a bad reputation for increasing
the "bureaucracy" that the programmer has to deal with; however, there
are three aspects of pH's type system that make it easy to use compared
to, say, the static type system of Pascal or C.
The first is polymorphism: for example, we can write a single
function that computes the length of a list, no matter what the list
contains (integers, strings, ...),
whereas in Pascal we would have to
write several (nearly identical)
functions for this purpose, one for
integer lists, one for string lists, and so on.
The second convenient feature is type inference: the programmer is
relieved of the tedium of having to predeclare the type of every
variable used in the program. Instead, the compiler infers these types
automatically and, in fact, it is perfectly safe for the programmer to
omit all type declarations (although
it is highly recommended that the
programmer declare the types of at least the top-level functions).
The third convenience is overloading whereby, for example, we can
write a single function to multiply two integer or floating-point
matrices, whereas in Pascal we would have to write two (nearly
identical) functions for this purpose. Overall, the static type system
is invaluable in structuring the way we think about and design
programs.