Intermediate code generation
Translating the source program into an "intermediate language" that is:
Simple
CPU independent
Benefits
1. Retargeting is facilitated.
2. Machine-independent code optimization can be applied.
Types of intermediate languages
Many different languages can serve as the intermediate language; the compiler designer decides which one to use. Common choices:
syntax trees
postfix notation
three-address code
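1. Syntax Trees
Example expressions: a+a*(b-c)+(b-c)*d and a:=b*-c+b*-c
[Figure: syntax tree and DAG for a+a*(b-c)+(b-c)*d]
[Figure: syntax tree and DAG for a:=b*-c+b*-c. In the syntax tree the subtree for b*-c (a * node over b and uminus c) appears twice under +; in the DAG both operands of + point to one shared b*-c node.]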
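The sharing that separates a DAG from a syntax tree can be sketched in Python. This is only an illustration under stated assumptions: expressions are written as nested tuples, and the helper names `build_dag` and `count_tree_nodes` are hypothetical, not part of the slides. A node is reused whenever the same operator has already been seen with the same children.

```python
# Expressions are nested tuples such as ("+", left, right) or ("uminus", operand);
# a plain string is a leaf (identifier). This encoding is an assumption of the sketch.

def build_dag(expr, nodes=None):
    """Return (node_id, nodes). `nodes` maps each distinct node key to its id."""
    if nodes is None:
        nodes = {}
    if isinstance(expr, str):
        key = ("leaf", expr)
    else:
        op, *operands = expr
        child_ids = tuple(build_dag(child, nodes)[0] for child in operands)
        key = (op, *child_ids)
    if key not in nodes:          # create a node only if no identical one exists
        nodes[key] = len(nodes)
    return nodes[key], nodes


def count_tree_nodes(expr):
    """Number of nodes in the ordinary syntax tree (no sharing)."""
    if isinstance(expr, str):
        return 1
    return 1 + sum(count_tree_nodes(child) for child in expr[1:])


# a := b * -c + b * -c
b_times_minus_c = ("*", "b", ("uminus", "c"))
assignment = ("assign", "a", ("+", b_times_minus_c, b_times_minus_c))

root, dag_nodes = build_dag(assignment)
print(count_tree_nodes(assignment), len(dag_nodes))  # 11 7
```

The tree needs 11 nodes, while the DAG needs 7 because b, c, uminus c and b*-c are each stored once.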
2. Postfix Notation
Formation rules:
1. If E is a variable or constant, the postfix notation (PN) of E is E itself.
2. If E is an expression of the form E1 op E2, the PN of E is E1' E2' op, where E1' and E2' are the PN of E1 and E2, respectively.
3. If E is a parenthesized expression of the form (E1), the PN of E is the same as the PN of E1.
Example
(a+b)/(c-d)
Postfix: ab+cd-/
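The three rules map directly onto a recursive function. The sketch below assumes a nested-tuple encoding of expressions, with a hypothetical ("paren", E1) node standing for a parenthesized expression; the function name `to_postfix` is also just illustrative.

```python
def to_postfix(expr):
    """Return the postfix notation of a nested-tuple expression."""
    if isinstance(expr, str):        # rule 1: variable or constant
        return expr
    if expr[0] == "paren":           # rule 3: (E1) has the same PN as E1
        return to_postfix(expr[1])
    op, left, right = expr           # rule 2: E1 op E2 becomes E1' E2' op
    return to_postfix(left) + to_postfix(right) + op


# (a+b)/(c-d)
example = ("/", ("paren", ("+", "a", "b")), ("paren", ("-", "c", "d")))
print(to_postfix(example))  # ab+cd-/
```

Parentheses leave no trace in the output: the operand order and the operator positions already fix the evaluation order.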
3. Three-address code
Statements have the general form x:=y op z, with at most one operator on the right-hand side.
No built-up arithmetic expressions are allowed.
As a result, x:=y + z * w
should be represented as
t1:=z * w
t2:=y + t1
x:=t2
Example
a:=b*-c+b*-c
t1:=- c
t2:=b * t1
t3:=- c
t4:=b * t3
t5:=t2 + t4
a:=t5
Types of Three-Address Statements
Assignment Statement (binary operator): x:=y op z
Assignment Statement (unary operator): x:=op y
Copy Statement: x:=z
Unconditional Jump: goto L
Conditional Jump: if x relop y goto L
Stack Operations: Push/pop
More Advanced:
Procedure call:
param x1
param x2
…
param xn
call p,n
Index Assignments:
x:=y[i]
x[i]:=y
Address and Pointer Assignments:
x:=&y
x:=*y
*x:=y
Implementations of 3-address statements
Quadruples
Triples
Indirect triples
Quadruples
a:=b*-c+b*-c

Three-address code      op      arg1  arg2  result
t1:=- c            (0)  uminus  c           t1
t2:=b * t1         (1)  *       b     t1    t2
t3:=- c            (2)  uminus  c           t3
t4:=b * t3         (3)  *       b     t3    t4
t5:=t2 + t4        (4)  +       t2    t4    t5
a:=t5              (5)  :=      t5          a

Temporary names must be entered into the symbol table as they are created.
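A quadruple is a record with four fields, so a table of quadruples can be sketched as a list of such records. The following is a minimal sketch, not a definitive implementation: the nested-tuple expression encoding, the names `Quad`, `gen_quads` and `gen_assignment`, and the plain list standing in for the symbol table are all assumptions.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]   # None for unary operators and copies
    result: str


def gen_quads(expr, quads, temps):
    """Append quadruples for expr; return the name that holds its value."""
    if isinstance(expr, str):
        return expr
    op, *operands = expr
    args = [gen_quads(operand, quads, temps) for operand in operands]
    temp = f"t{len(temps) + 1}"
    temps.append(temp)    # stand-in for entering the temporary into the symbol table
    quads.append(Quad(op, args[0], args[1] if len(args) > 1 else None, temp))
    return temp


def gen_assignment(target, expr):
    quads, temps = [], []
    value = gen_quads(expr, quads, temps)
    quads.append(Quad(":=", value, None, target))
    return quads, temps


# a := b * -c + b * -c
rhs = ("+", ("*", "b", ("uminus", "c")), ("*", "b", ("uminus", "c")))
quads, temps = gen_assignment("a", rhs)
for index, quad in enumerate(quads):
    print(f"({index})", quad.op, quad.arg1, quad.arg2 or "", quad.result)
```

Every intermediate value gets an explicit name in the result field, which is why each temporary has to be recorded as it is created.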
Triples
a:=b*-c+b*-c

Three-address code      op      arg1  arg2
t1:=- c            (0)  uminus  c
t2:=b * t1         (1)  *       b     (0)
t3:=- c            (2)  uminus  c
t4:=b * t3         (3)  *       b     (2)
t5:=t2 + t4        (4)  +       (1)   (3)
a:=t5              (5)  assign  a     (4)

Temporary names are not entered into the symbol table; a result is referred to by the position of the triple that computes it.
Other types of 3-address statements
Ternary operations such as x[i]:=y and x:=y[i] require two or more entries:

x[i]:=y
     op      arg1  arg2
(0)  []=     x     i
(1)  assign  (0)   y

x:=y[i]
     op      arg1  arg2
(0)  =[]     y     i
(1)  assign  x     (0)
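Indirect Triples
a:=b*-c+b*-c
A separate statement list holds pointers to the triples, which here are stored at positions (14) to (19).

Three-address code      statement        op      arg1  arg2
t1:=- c            (0)  (14)       (14)  uminus  c
t2:=b * t1         (1)  (15)       (15)  *       b     (14)
t3:=- c            (2)  (16)       (16)  uminus  c
t4:=b * t3         (3)  (17)       (17)  *       b     (16)
t5:=t2 + t4        (4)  (18)       (18)  +       (15)  (17)
a:=t5              (5)  (19)       (19)  assign  a     (18)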
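Triples and indirect triples can be sketched with two Python lists. The encoding is an assumption of this sketch: an integer operand means "the value of the triple at that position", a string is a name, and positions start at 0 rather than 14. The helper names `gen_triples` and `show` are hypothetical.

```python
def gen_triples(expr, triples):
    """Append triples for expr; return a name or the position of the result triple."""
    if isinstance(expr, str):
        return expr
    op, *operands = expr
    args = [gen_triples(operand, triples) for operand in operands]
    triples.append((op, args[0], args[1] if len(args) > 1 else None))
    return len(triples) - 1      # the position stands in for a temporary name


def show(operand):
    """Format an operand: positions in parentheses, names as-is, None as blank."""
    if isinstance(operand, int):
        return f"({operand})"
    return operand or ""


# a := b * -c + b * -c
rhs = ("+", ("*", "b", ("uminus", "c")), ("*", "b", ("uminus", "c")))
triples = []
value = gen_triples(rhs, triples)
triples.append(("assign", "a", value))

# Indirect triples: the execution order lives in a separate list of positions.
statements = list(range(len(triples)))

# Reordering two independent statements only permutes the statement list;
# the triples, and the positions they refer to, stay untouched.
statements[1], statements[2] = statements[2], statements[1]

for position in statements:
    op, arg1, arg2 = triples[position]
    print(f"({position})", op, show(arg1), show(arg2))
```

With plain triples the same reordering would move the triples themselves, so every operand that refers to a moved position would have to be renumbered.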
Assignment Statements
S -> id := E    { ptr := lookup(id.name);
                  if ptr <> nil then
                      emit(ptr ':=' E.place)
                  else error }
E -> E1 + E2    { E.place := newtemp;
                  emit(E.place ':=' E1.place '+' E2.place) }
E -> E1 * E2    { E.place := newtemp;
                  emit(E.place ':=' E1.place '*' E2.place) }
E -> - E1       { E.place := newtemp;
                  emit(E.place ':=' 'uminus' E1.place) }
E -> ( E1 )     { E.place := E1.place }
E -> id         { ptr := lookup(id.name);
                  if ptr <> nil then
                      E.place := ptr
                  else error }
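The semantic actions above can be tried out with a small recursive-descent translator. This is a sketch under several assumptions: the grammar is ambiguous as written, so the code gives * precedence over + and makes unary minus bind tightest; the symbol table is a plain set of declared names; and the class name `Translator` and its helper methods other than lookup, newtemp and emit are hypothetical.

```python
import re


class Translator:
    def __init__(self, symbols):
        self.symbols = set(symbols)   # stand-in for the symbol table
        self.code = []
        self.temp_count = 0

    def lookup(self, name):
        return name if name in self.symbols else None

    def newtemp(self):
        self.temp_count += 1
        return f"t{self.temp_count}"

    def emit(self, text):
        self.code.append(text)

    def translate(self, source):
        self.tokens = re.findall(r"[A-Za-z_]\w*|:=|[-+*()]", source)
        self.pos = 0
        self.statement()
        return self.code

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, token):
        if self.advance() != token:
            raise SyntaxError(f"expected {token!r}")

    def statement(self):                  # S -> id := E
        name = self.advance()
        self.expect(":=")
        place = self.expr()
        ptr = self.lookup(name)
        if ptr is None:
            raise NameError(name)
        self.emit(f"{ptr} := {place}")

    def binary(self, operator, operand):  # shared action for E -> E1 op E2
        place = operand()
        while self.peek() == operator:
            self.advance()
            right = operand()
            temp = self.newtemp()
            self.emit(f"{temp} := {place} {operator} {right}")
            place = temp
        return place

    def expr(self):                       # E -> E1 + E2
        return self.binary("+", self.term)

    def term(self):                       # E -> E1 * E2
        return self.binary("*", self.factor)

    def factor(self):
        token = self.advance()
        if token == "-":                  # E -> - E1
            operand = self.factor()
            temp = self.newtemp()
            self.emit(f"{temp} := uminus {operand}")
            return temp
        if token == "(":                  # E -> ( E1 ): pass E1.place up unchanged
            place = self.expr()
            self.expect(")")
            return place
        ptr = self.lookup(token)          # E -> id
        if ptr is None:
            raise NameError(token)
        return ptr


for line in Translator({"a", "b", "c"}).translate("a := b * -c + b * -c"):
    print(line)
```

Each method returns the place of the expression it recognized, which plays the role of the E.place attribute in the translation scheme.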
Reusing temporaries
A simple algorithm:
Say we have a counter c, initialized to zero.
Whenever a temporary name is used as an operand, decrement c by 1.
Whenever a new temporary name is created, use $c and increment c by 1.
E.g.: x := a*b + c*d - e*f

Statement          Value of c
-----------------------------------------------------------------
                   0
$0 := a*b          1   (c incremented by 1)
$1 := c*d          2   (c incremented by 1)
$0 := $0 + $1      1   (c decremented twice, incremented once)
$1 := e*f          2   (c incremented by 1)
$0 := $0 - $1      1   (c decremented twice, incremented once)
x := $0            0   (c decremented once)
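The counter scheme can be sketched in a few lines of Python. The sketch assumes nested-tuple expressions with binary operators only and that every temporary is used exactly once after it is defined, which is what makes the stack-like reuse safe; the names `TempAllocator` and `gen` are hypothetical.

```python
class TempAllocator:
    def __init__(self):
        self.c = 0                       # the counter c from the algorithm

    def use(self, name):
        if name.startswith("$"):         # only temporaries affect the counter
            self.c -= 1

    def new(self):
        name = f"${self.c}"
        self.c += 1
        return name


def gen(expr, alloc, trace):
    """Emit code for expr; record (statement, value of c) pairs in trace."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    left_place = gen(left, alloc, trace)
    right_place = gen(right, alloc, trace)
    alloc.use(right_place)               # operands are used: decrement c
    alloc.use(left_place)
    temp = alloc.new()                   # result is created: take $c, increment c
    trace.append((f"{temp} := {left_place} {op} {right_place}", alloc.c))
    return temp


# x := a*b + c*d - e*f
rhs = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
alloc, trace = TempAllocator(), []
value = gen(rhs, alloc, trace)
alloc.use(value)
trace.append((f"x := {value}", alloc.c))

for statement, counter in trace:
    print(f"{statement:<16} {counter}")
```

Only two temporary names, $0 and $1, are ever needed for this statement, instead of the five that a fresh name per operation would use.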