Step 7: Liveness Analysis and Register Allocation - Due Date: Monday Dec. 7th, 11:59 pm

In this step, we will implement liveness analysis and apply it to register allocation algorithm to avoid unnecessary spilling of variables if the variable is dead.

You may find this set of hints from a couple of years ago useful.

Liveness Analysis

Liveness analysis is a data flow analysis that finds what variables are live after any statement. It is an any-path, backward flow analysis. We will perform intra-procedural liveness analysis (i.e., we will compute liveness at the function granularity, but not across functions) at the IR Node level (i.e., we will compute liveness based on IR instructions, not Tiny instructions).

Control flow graphs

The first step in computing liveness is to build a control flow graph for each function in your program. To represent your control flow graph, each IR Node should know its successors (IR instructions that could possibly execute immediately after it) and predecessors (IR instructions that could possible execute immediately before it). Conditional jumps have two successors: the explicit target of the jump, and the implicit (fall-through) target of the jump. Unconditional jumps only have one successor. Function calls should be treated as straight-line IR nodes (i.e., they are not treated as branches; their successor is the instruction immediately after the call). Return nodes do not have any successors.

Liveness across basic blocks

For each IR node in a function, you should define two sets: GEN and KILL. GEN represents all the temporaries and variables that are used in an instruction, and KILL represents all the temporaries and variables that are defined in an instruction. For most instructions, this should be pretty straightforward. A few tricky cases:

PUSH instructions use the variable/temporary being pushed
POP instructions define the variable/temporary being popped
WRITE instructions use their variables.
READ instructions define their variables.
CALL instructions require special care. Because we do not analyze liveness across functions, we must make conservative assumptions about what happens function calls. In particular, we GEN any variables that may be used, and KILL any variables that must be used. The GEN set for any CALL instruction therefore contains all global variables, while the KILL set is empty.

Once you know the GEN and KILL sets for each IR node, you can compute liveness. To do this, define IN (live-in) and OUT (live-out) sets for each IR Node. Initialize the OUT sets for RETURN IR nodes to all global variables (because global variables may be used after the function returns), and initialize all other sets to empty. Then compute the live-in and live-out sets for each IR node as follows:

The set of variables that are live out of a node is the union of all the variables that are live in to the node's successors.
The set of variables that are live in to a node is the set of variables that are live out for the node, minus any variables that are killed by the node, plus any variables that are gen-ed by the node.

Note that in these definitions are recursive: the live-out set of a node is defined in terms of the live-in sets of its successors, which are in turn defined in terms of the live-in sets of their successors, and so on. If there is a loop in the code, then the definition seems circular.

The trick to computing liveness (and we will discuss this in more detail in class) is to compute a fixpoint: assignments to each of the live-in and live-out sets so that if you try to compute any node's live-in or live-out set again, you'll get the same result you already have. To do this, we will use a worklist algorithm:

Put all of the IR nodes on the worklist
Pull an IR node off the worklist, and compute its live-out and live-in sets according to the definitions above.
If the live-in set of the node gets updated by the previous step, put all of the node's predecessors on the worklist (because they may need to update their live-out sets).
Repeat steps 2 and 3 until the worklist is empty.
The live-out sets of each node now represent a fixpoint.

In class, we will discuss why this algorithm works.

Register Allocation Algorithm

Use the bottom-up register allocation algorithm discussed in class. For each statement, you must ensure that the source operands are in registers, and that there is a register for the destination operand. Use the liveness information you computed (i.e., the live-out set for the instruction) to determine when it is safe to free registers, and when a dirty register needs to be stored back to memory (only when the variable in the register is live).

Bottom-up register allocation works at the basic-block level: any register allocation decisions you make apply for the current basic block only. This means that when you get to the end of a basic block, you must reset your register allocation. Any register that (a) hold local/global variables and (b) is dirty should be written back to the stack/global variable.

Note also that because a CALL instruction jumps into another method, any global variables that are in registers when the CALL is performed should be freed immediately prior to the CALL instruction, ensuring that the correct value for the global is in memory. This is different from saving the registers on the stack prior to a function call. The latter is done so that the caller method doesn't get its registers overwritten; the values of the registers are stored where only the caller can see them. The former is done so that the callee method sees the right values for global variables; the values need to be stored back to globals so that everyone can see them, and freed from the registers so that the caller will reload them after the callee returns.

Testcases

We will test your compiler against all of the test cases from steps 4, 5, and 6. The functional behavior of the programs will remain the same, but your generated Tiny should only use 4 registers. Sample inputs and outputs are available.

Reference compiler

A 4-register version of the reference compiler can be found here.

Tiny Simulator

Please use the version of the tiny simulator supporting only 4 registers [C++ source code] for this step.

Grading

As before, we will grade your compiler by running your outputs through the Tiny simulator and comparing the results to the reference solution. You do not have to match code or IR exactly. Your generated code only has to produce the same results as the outputs.

We will assign bonus points based on code performance in this step as well. Unlike in prior steps, where we counted total execution cycles, in this step, we will assign points based on performing the fewest memory operations. In other words, compilers that do a better job of register allocation will receive bonus points.