CSE5317 Project #7 (Code Generation)
This is the final stage of your project in which you are asked to translate PCAT into MIPS code.
You need to download and study
http://www-cse.uta.edu/~fegaras/cse5317/generate.cc
Your task to to add code to the evaluate function
to handle all the cases of PCAT expressions and statements.
You also need to modify your parser.y file to generate code by calling
the evaluate function. For this project, the real goal is to generate correct
MIPS code that runs and computes the correct results of a PCAT program.
Do not attempt to do any optimizations. Do not do any register allocation;
all temporary results must be held in temporary variables in the run-time stack.
Constructing Values
Every data value occupies 4 bytes regardless of its type.
Therefore all offsets must be multiples of 4. More specifically,
this is how you implement each PCAT data type in MIPS:
- An integer occupies 4 bytes (32 bits).
- A real is a single precision floating point (4 bytes). In a real computer, you need
to use the floating point unit to handle these values. But for your
project you will ignore floating points completely (no arithmetic, coercions
etc is necessary).
- A boolean value is represented by an integer (4 bytes) and is false if the integer
is equal to 0, true otherwise.
-
An array is implemented as a pointer to a heap object
(in the dynamic segment, also called the heap).
This pointer can be NIL (i.e. equal to 0) if the array is not initialized.
Arrays are based 0 (i.e. the first element of an array A of n elements
is A[0] and the last is A[n-1]).
The heap object of an array of n elements is a (n+1)*4 bytes
long: the first integer (4 bytes) holds the number of elements of the array
and each of the other integers holds one array element (starting from A[0]).
Suppose that you want to allocate a vector A of 10 elements and set the local
variable with offset -60 to this vector. You need to use the global pointer $gp,
which points to the next available heap byte:
sw $gp, -60($fp) # local variable points to the first free byte
lw $t0, 10
sw $t0, ($gp) # first integer holds the array size
addiu $gp, $gp, 44 # allocate (10+1)*4 bytes on heap
To access A[I], where I is a local variable with offset -48, you do the following:
lw $t0, -60($fp) # $t0 points to the vector
lw $t1, ($t0) # $t1 holds the size of A
lw $t2, -48($fp) # $t2 holds I
ble $t1, $t2, EXIT # if size(A)<=I, goto EXIT (abort program)
addu $t0, $t0, 4 # $t0 points to A[0] now
multu $t1, $t2, 4 # $t1 is I*4
addu $t0, $t0, $t1 # $t0 points to A[I] now
lw $t3, ($t0) # store the value A[I] in $t3
- A record is also implemented as a pointer to a heap object.
This pointer can be NIL (i.e. equal to 0) if the record is not initialized.
If a record has n attributes, then the heap object will occupy
n consecutive integers (n*4 bytes), since every attribute needs 4 bytes).
Accessing record attributes (e.g. in A.X, where A is a record
and X is an attribute of A) is very similar to that
for a vector. The only differences are that you do not store the size of
a record and you do not check for out-of-bounds errors.
All variables are stored in the run-time stack,
including all global variables (you store global variables
in the activation record of the main program).
Lexical Scoping
Each activation record (i.e. frame) for a function or procedure has a fixed size
of 500 bytes. The beginning of an activation record contains the dynamic link (4 bytes),
the return address (the value of register $ra after jal), the
static link (4 bytes), and space for the local variables (4 bytes for each variable).
The rest of the activation record is used to store temporary values.
Consider for example the following PCAT program:
procedure P ( c: integer )
x: integer;
procedure Q ( a, b: integer )
i, j: integer;
begin
x := x+a+j;
end;
begin
Q(x,c);
end;
The activation record for P (as P sees it) is shown in the first figure below:
The activation record for Q (as Q sees it) is shown in the second figure above.
The third figure shows the structure of the run-time stack at the point where
x := x+a+j is executed. This statement uses x, which is defined
in P. We cannot assume that Q called P, so we should not use
the dynamic link to retrieve x; instead, we need to use the static link,
which points to the most recent activation record of P.
Thus, the value of variable x is computed by:
lw $t0, -8($fp) # follow the static link of Q
lw $t1, -12($t0) # x has offset=-12 inside P
Function/procedure arguments should be pushed in the stack before the function call.
If this is a function, then an empty placeholder (4 bytes) should be pushed in the stack
before the function call; this will hold the result of the function.
Procedure/Function Calling Convention
Each procedure/function should begin with the following code (see the factorial.s example):
sw $fp, ($sp) # push old frame pointer (dynamic link)
move $fp, $sp # frame pointer now points to the top of stack
subu $sp, $sp, 500 # allocate 500 bytes in the stack
sw $ra, -4($fp) # save return address in frame
sw $v0, -8($fp) # save static link in frame
and should end with:
lw $ra, -4($fp) # restore return address
move $sp, $fp # pop frame
lw $fp, ($fp) # restore old frame pointer (follow dynamic link)
jr $ra # return
For each procedure call, you need to push the arguments into the stack
and set $v0 to be the right static link (very often it is equal
to the static link of the current procedure; otherwise,
you need to follow the static link a number of times).
For example, the call Q(x,c) in P is translated into:
lw $t0, -12($fp)
sw $t0, ($sp) # push x
subu $sp, $sp, 4
lw $t0, 4($fp)
sw $t0, ($sp) # push c
subu $sp, $sp, 4
move $v0, $fp # load static link in $v0
jal Q # call procedure Q
addu $sp, $sp, 8 # pop stack
Note that there are two different cases for setting the static link before a procedure call.
When the callee is lexically inside the caller's body, we have:
move $v0, $fp
otherwise, we follow the static link of the caller d times, where d is the difference
between the lexical level of the caller from that of the callee. For d=1 we have
lw $v0, -8($fp)
For d=3 we have
lw $t1, -8($fp)
lw $t1, -8($t1)
lw $v0, -8($t1)
Note also that, if you have a call to a function, you need to allocate 4 more bytes in the stack
to hold the result.
What do you Need to Handin
You need to complete the function evaluate in generate.cc
to capture all different ASTs for PCAT expressions and statements (except
floating points). I have already provided code for addition and if-then-else
statement to make it easy for you.
Your info data structure used in your symbol table
should be augmented to hold an integer offset for each variable,
which indicates the offset of this variable in the local activation record.
The address class in generate.cc is used to capture the different address modes
used in MIPS. The generate function is used to generate MIPS assembly code.
Function get_temporary allocates a new temporary variable in the activation
record, which is never reused (you have about 100 available temporary variables
for each procedure, which I think are enough for compiling the test programs;
if they are not enough, just change the constant activation_record_size).
Note that every time you compile a new procedure/function, you need
to set temporary_offset to 0, because it is used for allocating temporary
variables in the activation record.
After you finish evaluate, you change your parser.y to call this
function and to generate code for each procedure/function.
Important: You need to test your parser against all the test files test*.pcat in the
directory tst by testing the generated code.s MIPS file
from each test file (using the SPIM simulator).
After ensuring that your program compiles and executes correctly
on omega/gamma and your MIPS code computes correct values, follow these steps:
- VERY IMPORTANT: Cleanup your parser directory
(remove the parser executable, ALL binaries, and ALL *.o files).
- Change directory one level up using:
cd ..
- Archive and compress your parser directory using:
tar cvf parser.tar parser; gzip parser.tar
- Run the program:
- /public/cse/5317-501/handin-project on gamma, or
- /public/cse/4305-501/handin-project on omega
to electronically hand in your parser.tar.gz file.
Be sure you are in the directory containing this file
when you run the handin program. The handin program will output
the date and time when your program was handed in.
Note that you may run the handin program as many times as you like,
but only the most recently handed in file will be retained.
Last modified: 4/15/98 by Leonidas Fegaras