3.
Tiny Machine Runtime Environment
The
following description assumes a knowledge of the Tiny Machine as given in
Section
8.7 and an understanding of stack-based runtime environments from Chapter
7. Since C- (unlike TINY) has recursive procedures, the runtime
environment must be stack based. The environment consists of a global area
at the top of dMem, and the stack just below it, growing downward toward
0. Since C- contains no pointers or dynamic allocation, there is no need
for a heap. The basic organization of each activation record (or stack
frame) in C- is:
ofp
ret
|
…
params
…
|
…
local
vars
…
|
…
temps
…
|
<-
fp points here
ofpFO = 0
retFO = -1
initFO = -2
Here,
fp is the current frame pointer, which is kept in a register for
easy access. The ofp (old frame pointer) is the control link as
discussed in Chapter 7 of the text. The constants to the right ending in
FO (for frame offset) are the offsets at which each of the indicated
quantities are stored. The value initFO is the offset at which the params
and local vars begin their storage areas in an activation record. Since
the Tiny Machine contains no stack pointer, all references to fields
within an activation record will use the fp, with negative frame offsets.
For
example, if we have the following C- function declaration,
int
f(int x, int y)
{
int
z;
.
. .
}
then
x, y, and z must be allocated in the current frame,
and the frame offset at the beginning of the generation of code for the
body of f will be -5 (one location each for x, y, z
and two locations for the bookkeeping information of the activation
record). The offsets of x, y, and z are -2, -3, and
-4, respectively.
Global
references can be found at absolute locations in memory. Nevertheless, as
with
TINY, we prefer also to reference these variables by offset from a
register. We do this by keeping a fixed register, which we call the gp,
and which always points at the maximum address. Since the TM simulator
stores this address into location 0 before execution begins, the gp can be
loaded from location 0 on start-up, and the following standard prelude
initializes the runtime environment:
O:
LD gp,
0(ac) * load gp with maxaddress
1:
LDA fp, 0(gp)
* copy gp to fp
2:
ST ac, 0(ac) * clear
location 0
Function
calls also require that the beginning code location for their bodies be
used in a calling sequence. We also prefer to call functions by performing
a relative jump using the current value of the pc rather than an absolute
jump. (This makes the code potentially relocatable.) The utility procedure
emitRAbs in code.h/code.c can be used for this purpose. (It takes an
absolute code location and relativizes it by using the current code
generation location.)
For
example, suppose we want to call a function f whose code begins at
location 27, and we are currently at location 42. Then instead of
generating the absolute jump
42:
LDC pc, 27(*)
we
would generate
42:
LDA pc, -16(pc)
since
27 - (42 + 1) = -16.
The
Calling Sequence
A reasonable division of labor between caller and callee is to have the
caller store the values of the arguments in the new frame and create the
new frame, except for the storing of the return pointer in the retFO
position. Instead of storing the return pointer itself, the caller leaves
it in the ac register, and the callee stores it into the new frame. Thus,
every function body must begin with code to store that value in the (now
current) frame:
ST
ac, retFO(fp)
This
saves one instruction at each call site. On return, each function then
loads the pc with this return address by executing the instruction
LD
pc, retFO(fp)
Correspondingly,
the caller computes the arguments one by one, and pushes each onto the
stack in its appropriate position before pushing the new frame. The caller
must also save the current fp into the frame at ofpFO before pushing the
new frame. After a return from the callee, the caller then discards the
new frame by loading the fp with the old fp. Thus, a call to a function of
two parameters will cause the generation of the following code:
<code
to compute first arg>
ST
ac, frameoffset+initFO (fp)
<code
to compute second arg>
ST
ac, frameoffset+initFO-1 (fp)
ST
fp, frameoffset+ofpFO (fp) * store current fp
LDA
fp frameOffset(fp) * push new frame
LDA
ac,l(pc) * save return in ac
LDA
pc, . . .(pc) * relative jump to function entry
LD
fp, ofpFO(fp) * pop current frame
Address
Computations
Since both variables and subscripted arrays are allowed on the lefthand
side of assignment, we must distinguish between addresses and values
during compilation. For example, in the statement
a[i]
:= a[i+l];
the
expression a[i] refers to the address of a[i], while the
expression a[i+l]
refers
to the value of a at the i+1 location. This distinction can
be achieved by using an isAddress parameter to the cGen
procedure. When this parameter is true, cGen generates code to
compute the address of a variable rather than its value. In the case of a
simple variable, this means adding the offset to either the gp (in the
case of a global variable) or the fp (in the case of a local variable) and
loading the result into the ac:
LDA
ac, off set (fp) * put address of local var in ac
In
the case of an array variable, this means adding the value of the index to
the base address of the array and loading the result into the ac, as
described below.
Arrays
Arrays are allocated on the stack beginning at the current frame offset
and extending downward in memory in order of increasing subscript, as
follows:
…
|
A
[0]
|
A
[1]
|
A
[2]
|
etc.
…
|
<- base address of array A
allocated space for elements
of
array A in the stack
Note
that array locations are computed by subtracting the index value from the
base address.
When
an array is passed to a function, only the base address is passed. The
allocation of the area for the base elements is done only once and remains
fixed for the life of the array. Function arguments do not include the
actual elements of an array, only the address. Thus, array parameters
become reference parameters. This causes an anomaly when array parameters
are referenced inside a function, since they must be treated as having
their base addresses rather than their values stored in memory. Thus, an
array parameter has its base address computed using an LD operation
instead of an LDA.
|