# COMPLETS

Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis Hall



© 2021 Carl Alphonce - Reproduction

#### 8.6 A Simple Code Generator [p. 542]

algorithm focuses on generation of code for a single basic block

- generates code for each three address code instruction
- manages register allocations/ assignment to avoid redundant loads/stores

### Principal uses of registers

- o operator operands
- o temporaries needed within block
- o variables that span multiple blocks
- o stack pointer
- o function arguments

"We [...] assume that for each operator, there is exactly one machine instruction that takes the necessary operands in registers and performs that operation, leaving the result in a register. The machine instructions are of the form:

| 0 | LD reg, mem                | movl | MEM, | REG |
|---|----------------------------|------|------|-----|
| Ø | ST mem, reg                | movl | REG, | MEM |
| ð | 0P reg, reg, reg" [p. 543] | addl | REG, | REG |

x86 assembly resources (will add more as we go along)
https://en.wikipedia.org/wiki/X86\_assembly\_language
https://gcc-renesas.com/pdf/manuals/Assembler.pdf
man as <--- at OS prompt</pre>

8.6.1 Register and Address Descriptors A three-address instruction of the form:

v=aopb

we generate: LD Rx, a LD Ry, b OP Rx, Rx, Ry ST Rx, v

8.6.1 Register and Address Descriptors A three-address instruction of the form: where a, b, and v are int v=aopb v = a + bin x80 we generate: LD Rx, a movl -4(%rbp), %edx LD Ry, b -8(%rbp), %eax movl OP RX, RX, RY addl %edx, %eax %eax, -12(%rbp) ST RX, V movl

an int is 32 bits wide

where a, b, and v are int v = a + b

> asm v8(

#### the 'l' in instructions indicate 32 bits

movl -4(%rbp), %edx movl -8(%rbp), %eax addl %edx, %eax movl %eax, -12(%rbp)

these offsets are
stored in symbol table

you can use easier register names, then print them with proper names

 This results in many redundant loads and stores and may not make effective use of available registers.

 To better manage register use, employ two data structures:

- register descriptor

- address descriptor

## register descriptor

"For each available register, a register descriptor keeps track of the variable names whose current value is in that register." [p. 543]

# address descriptor

"For each program variable, an address descriptor keeps track of the location or locations where the current value of that variable can be found." [p. 543]

### gelkes function

"...getReg(I)...selects registers for each memory location associated with the three-address instruction I." [p. 544]

> Note that I is an instruction, not a variable!

### (paraphrased from 8.6.2, page 544)

- A three-address instruction of the form:  $v = a \ op b$
- Use getReg(v = a op b) to select registers for v, a and b: Rv, Ra, and Rb respectively
- 2. If a is not already in Ra, generate LD Ra, a' (where a' is one of the possibly many current locations of a)
- 3. Similarly for b.
- 4. Generate OP RV, Ra, Rb

#### copy instructions x = y

"We assume getReg will always choose the same register for both x and y. If y is not already in that register Ry, then generate the machine instruction LD Ry, y. If y was already in Ry, we do nothing. It is only necessary that we adjust the register descriptor for Ry so that it includes x as one of the values found there." [p. 544]

Writing back to memory at end of block

At the end of a basic block we must ensure that live variables are stored back into memory.

"...for each variable x whose address descriptor does not say that its value is located in the memory location for x, we must generate the instruction ST x, R, where R is a register in which x's value exists at the end of the block." [p. 545]

Updating register descriptors (RD) and address descriptors (AD)

1. LD R, x

(a) set RD of R to only x

(b) Add R to AD of x

(c) Remove R from the AD of any variable other than x 2. ST x, R

(a) Add &x to AD of x

3. OP RX, RY, RZ for 
$$x = y \circ p z$$

(a) set RD of Rx to only x

(b) set AD of x to only Rx (&x not in AD of x !)

(c) Remove Rx from the AD of any variable other than x 4. "When we process a copy statement x = y, after generating the Load for y into register Ry, if needed, and after managing descriptors as for all Load statement (per rule 1):" [p. 545]

(a) Add x to the RD of Ry (b) set AD of x to only Ry



what does liveness and next use info looking like here?

t = a - b u = a - c v = t + u a = dd = v + u

Algorithm 8.7 [p. 528] Determining the liveness and next-use information for each statement in a basic block.

INPUT: A basic block B of three address instructions. Assume the symbol table initially shows all non-temporary variables in B as being live on exit. Not this instruction specifically, but instructions of the form

 $x = y \circ p z, x = \circ p y, or x = y.$ 

OUTPUT: At each statement i: x = y + z in B, we attach to i the liveness and next-use information for x, y, and z.

METHOD: We start at the last statement in B and scan backwards to the beginning of B. At each statement i: x = y + z in B do the following:

1) attach to statement i the information currently found in the symbol table regarding the next-use and liveness of x, y, and Z.

2) In the symbol table, set x to "not live" and "no next use".

3) In the symbol table, set y and z to "live" and the next uses of y and z to instruction i.

Next uses of y and z to instruction i.  $\odot$  2021 Carl Alphonce – Reproduction of this material is prohibited without the author's consent

# Example [p. 546]



INPUT: A basic block B of three address instructions. Assume the symbol table initially shows all non-temporary variables in B as being live on exit.

| a | b | С | d | Ŀ | u | V |
|---|---|---|---|---|---|---|
| L | L | L | L |   |   |   |
|   |   |   |   |   |   |   |