CSE5317 Project #7 (Code Generation)

This is the final stage of your project in which you are asked to translate PCAT into MIPS code. You need to download and study

http://www-cse.uta.edu/~fegaras/cse5317/generate.cc Your task to to add code to the evaluate function to handle all the cases of PCAT expressions and statements. You also need to modify your parser.y file to generate code by calling the evaluate function. For this project, the real goal is to generate correct MIPS code that runs and computes the correct results of a PCAT program. Do not attempt to do any optimizations. Do not do any register allocation; all temporary results must be held in temporary variables in the run-time stack.

Constructing Values

Every data value occupies 4 bytes regardless of its type. Therefore all offsets must be multiples of 4. More specifically, this is how you implement each PCAT data type in MIPS:

An integer occupies 4 bytes (32 bits).
A real is a single precision floating point (4 bytes). In a real computer, you need to use the floating point unit to handle these values. But for your project you will ignore floating points completely (no arithmetic, coercions etc is necessary).
A boolean value is represented by an integer (4 bytes) and is false if the integer is equal to 0, true otherwise.
An array is implemented as a pointer to a heap object (in the dynamic segment, also called the heap). This pointer can be NIL (i.e. equal to 0) if the array is not initialized. Arrays are based 0 (i.e. the first element of an array A of n elements is A[0] and the last is A[n-1]). The heap object of an array of n elements is a (n+1)*4 bytes long: the first integer (4 bytes) holds the number of elements of the array and each of the other integers holds one array element (starting from A[0]). Suppose that you want to allocate a vector A of 10 elements and set the local variable with offset -60 to this vector. You need to use the global pointer $gp, which points to the next available heap byte:
```
	sw	$gp, -60($fp)	# local variable points to the first free byte
	lw	$t0, 10
	sw	$t0, ($gp)	# first integer holds the array size
	addiu	$gp, $gp, 44	# allocate (10+1)*4 bytes on heap
```
To access A[I], where I is a local variable with offset -48, you do the following:
```
	lw	$t0, -60($fp)	# $t0 points to the vector
	lw	$t1, ($t0)	# $t1 holds the size of A
	lw	$t2, -48($fp)	# $t2 holds I
	ble	$t1, $t2, EXIT	# if size(A)<=I, goto EXIT (abort program)
	addu	$t0, $t0, 4	# $t0 points to A[0] now
	multu	$t1, $t2, 4	# $t1 is I*4
	addu	$t0, $t0, $t1	# $t0 points to A[I] now
	lw 	$t3, ($t0)	# store the value A[I] in $t3
```
A record is also implemented as a pointer to a heap object. This pointer can be NIL (i.e. equal to 0) if the record is not initialized. If a record has n attributes, then the heap object will occupy n consecutive integers (n*4 bytes), since every attribute needs 4 bytes). Accessing record attributes (e.g. in A.X, where A is a record and X is an attribute of A) is very similar to that for a vector. The only differences are that you do not store the size of a record and you do not check for out-of-bounds errors.

All variables are stored in the run-time stack, including all global variables (you store global variables in the activation record of the main program).

Lexical Scoping

Each activation record (i.e. frame) for a function or procedure has a fixed size of 500 bytes. The beginning of an activation record contains the dynamic link (4 bytes), the return address (the value of register $ra after jal), the static link (4 bytes), and space for the local variables (4 bytes for each variable). The rest of the activation record is used to store temporary values. Consider for example the following PCAT program:

procedure P ( c: integer )
  x: integer;

  procedure Q ( a, b: integer )
  i, j: integer;
  begin
    x := x+a+j;
  end;

begin
  Q(x,c);
end;

The activation record for P (as P sees it) is shown in the first figure below:

The activation record for Q (as Q sees it) is shown in the second figure above. The third figure shows the structure of the run-time stack at the point where x := x+a+j is executed. This statement uses x, which is defined in P. We cannot assume that Q called P, so we should not use the dynamic link to retrieve x; instead, we need to use the static link, which points to the most recent activation record of P. Thus, the value of variable x is computed by:

	lw	$t0, -8($fp)		# follow the static link of Q
	lw	$t1, -12($t0)		# x has offset=-12 inside P

Function/procedure arguments should be pushed in the stack before the function call. If this is a function, then an empty placeholder (4 bytes) should be pushed in the stack before the function call; this will hold the result of the function.

Procedure/Function Calling Convention

Each procedure/function should begin with the following code (see the factorial.s example):

        sw      $fp, ($sp)      # push old frame pointer (dynamic link)
        move    $fp, $sp        # frame pointer now points to the top of stack
        subu    $sp, $sp, 500   # allocate 500 bytes in the stack
        sw      $ra, -4($fp)	# save return address in frame
        sw      $v0, -8($fp)	# save static link in frame

and should end with:

        lw      $ra, -4($fp)	# restore return address
        move    $sp, $fp	# pop frame
        lw      $fp, ($fp)      # restore old frame pointer (follow dynamic link)
        jr      $ra		# return

For each procedure call, you need to push the arguments into the stack and set $v0 to be the right static link (very often it is equal to the static link of the current procedure; otherwise, you need to follow the static link a number of times). For example, the call Q(x,c) in P is translated into:

        lw      $t0, -12($fp)
        sw      $t0, ($sp)	# push x
        subu    $sp, $sp, 4
        lw      $t0, 4($fp)
        sw      $t0, ($sp)	# push c
        subu    $sp, $sp, 4
        move    $v0, $fp	# load static link in $v0
        jal     Q		# call procedure Q
	addu	$sp, $sp, 8	# pop stack

Note that there are two different cases for setting the static link before a procedure call. When the callee is lexically inside the caller's body, we have:

        move    $v0, $fp

otherwise, we follow the static link of the caller d times, where d is the difference between the lexical level of the caller from that of the callee. For d=1 we have

	lw	$v0, -8($fp)

For d=3 we have

	lw	$t1, -8($fp)
	lw	$t1, -8($t1)
	lw	$v0, -8($t1)

Note also that, if you have a call to a function, you need to allocate 4 more bytes in the stack to hold the result.

What do you Need to Handin

You need to complete the function evaluate in generate.cc to capture all different ASTs for PCAT expressions and statements (except floating points). I have already provided code for addition and if-then-else statement to make it easy for you. Your info data structure used in your symbol table should be augmented to hold an integer offset for each variable, which indicates the offset of this variable in the local activation record. The address class in generate.cc is used to capture the different address modes used in MIPS. The generate function is used to generate MIPS assembly code. Function get_temporary allocates a new temporary variable in the activation record, which is never reused (you have about 100 available temporary variables for each procedure, which I think are enough for compiling the test programs; if they are not enough, just change the constant activation_record_size). Note that every time you compile a new procedure/function, you need to set temporary_offset to 0, because it is used for allocating temporary variables in the activation record. After you finish evaluate, you change your parser.y to call this function and to generate code for each procedure/function.

Important: You need to test your parser against all the test files test*.pcat in the directory tst by testing the generated code.s MIPS file from each test file (using the SPIM simulator).

After ensuring that your program compiles and executes correctly on omega/gamma and your MIPS code computes correct values, follow these steps:

VERY IMPORTANT: Cleanup your parser directory (remove the parser executable, ALL binaries, and ALL *.o files).
Change directory one level up using:
```
              cd ..
```

Archive and compress your parser directory using:

              tar cvf parser.tar parser; gzip parser.tar

Run the program:
- /public/cse/5317-501/handin-project on gamma, or
- /public/cse/4305-501/handin-project on omega
to electronically hand in your parser.tar.gz file. Be sure you are in the directory containing this file when you run the handin program. The handin program will output the date and time when your program was handed in. Note that you may run the handin program as many times as you like, but only the most recently handed in file will be retained.

Last modified: 4/15/98 by Leonidas Fegaras