Another major code generation problem is the creation of procedures.
Generating code for procedures is relatively straightforward once we
know the procedure linkage appropriate for the CPU. At the procedure
definition, we generate the code to handle the procedure call and
return. At each call of the procedure, we set up the procedure
parameters and make the call.
The CPU's subroutine call
mechanism is usually not
sufficient to directly support procedures in modern programming
languages. The procedure linkage mechanism provides a way for the
program to
pass parameters into the program and for the procedure to return a
value.
It also provides help in restoring the values of
registers that
the procedure has modified. All procedures in a given programming
language use the same linkage mechanism (although different languages
may use different linkages). The mechanism can also be used to call
handwritten assembly language routines from compiled code.
Procedure stacks are typically built to grow down from
high addresses. A stack pointer (sp) defines the end
of the current frame, while a frame pointer (fp)
defines the end of the last frame. (The fp is technically necessary
only if the stack frame can be grown by the procedure during
execution.)
The procedure can refer to an element in the frame by
addressing relative to sp. When a new procedure is called, the sp and
fp are modified to push another frame onto the stack.
The ARM Procedure Call Standard (APCS) is a good
illustration of a typical procedure linkage mechanism. Although the
stack frames are in main memory, understanding how registers are used
is key to understanding the mechanism, as explained below.
- r0"r3 are used to pass parameters into the
procedure. r0 is also used to hold the return value. If more than four
parameters are required, they are put on the stack frame.
- r4"r7 hold register variables.
- r11 is the frame pointer and r13 is the stack
pointer.
- r10 holds the limiting address on stack size,
which is used to check for stack overflows.
Other registers have additional uses in the protocol.
 |
| Figure
5-14. Layout of a one-dimensional array in memory |
Data Structures
The compiler must also translate references to data structures into
references to raw memories. In general, this requires address
computations. Some of these computations can be done at compile time
while others must be done at run time. arrays Arrays are interesting
because the address of an array element must in general be computed at
run time, since the array index may change. Let us first consider
one-dimensional arrays:
a[i]
The layout of the array in memory is shown in Figure
5-14 above. The zeroth element is stored as the first element of
the array,
the first element directly below, and so on. We can create a pointer
for the array that points to the array's head, namely, a[0]. If we call
that pointer aptr for convenience, then we can rewrite the reading
of a
[i] as
*(aptr + i)
Two-dimensional arrays are more challenging. There are
multiple possible ways to lay out a two-dimensional array in memory, as
shown in Figure 5-15 below. In
this form, which is known as row major,
the inner variable of the array (j in a[i,j]) varies most quickly.
(Fortran uses a different organization known as column major.)
Two-dimensional arrays also require more sophisticated
addressing - in
particular, we must know the size of the array. Let us consider the
row-major form. If the a[] array is of size N × M, then we can
turn the two-dimensional array access into a one-dimensional array
access. Thus,
a[i,j]
becomes
a[i*M + j]
where the maximum value for j is M ' 1.
 |
| Figure
5-15. Memory layout for two-dimensional arrays |
A C struct is easier to address. As shown in Figure
5-16 below, a structure is implemented as a contiguous block of
memory.
Fields in the structure can be accessed using constant offsets to the
base address of the structure. In this example, if field1 is four bytes
long, then field2 can be accessed as
*(aptr + 4)
This addition can usually be done at compile time,
requiring only the indirection itself to fetch the memory location
during execution.
Expression
Simplification
Expression simplification is a useful area for machine-independent
transformations. We can use the laws of algebra to simplify
expressions. Consider the following expression:
a*b + a*c
We can use the distributive law to rewrite the
expression as
a*(b + c)
 |
| Figure
5-16. C structure layout and access. |
Since the new expression has only two operations rather
than three for the original form, it is almost certainly cheaper,
because it is both faster and smaller. Such transformations make some
broad assumptions about the relative cost of operations.
In some cases,
simple generalizations about the cost of operations may be misleading.
For example, a CPU with a multiply-andaccumulate instruction may be
able to do a multiply and addition as cheaply as it can do an addition.
However, such situations can often be taken care of in code generation.
We can also use the laws of arithmetic to further
simplify expressions on constants. Consider the following C statement:
for (i = 0; i < 8 + 1; i++)
We can simplify 8 + 1 to 9 at compile time - there is
no
need to perform that arithmetic while the program is executing. Why
would a program ever contain expressions that evaluate to constants?
Using named constants rather than numbers is good programming practice
and often leads to constant expression. The original form of the for
statement could have been
for (i = 0; i < NOPS + 1; i++)
where, for example, the added 1 takes care of a
trailing null character.