Hawassa University
Department of Computer Science
Compiler Design
Course code: CoSc4072
By: Mekonen M.
Topics to be covered
Looping
• Different intermediate forms
• Different representations of three-address code
Intermediate code representation
• Intermediate code, also called intermediate representation (IR) or intermediate
language, is a kind of abstract machine language that can express the operations of
the target machine without committing to too many machine-specific details.
• Intermediate representation
• In a compiler, the front end translates a source program into an intermediate
representation, and the back end generates the target code from this intermediate
representation.
• Language- and machine-neutral
Mekonen M. # CoSc4072 Unit 5 – Intermediate Code Generation 3
Why ICG
• The benefits of using a machine-independent intermediate code (IC) are:
• Portability: Suppose we have n source languages and m target languages. Without
intermediate code, each source language must be translated directly into each target
language, so every source-target pair needs its own compiler, for a total of n*m compilers.
With intermediate code, we need n compilers to convert each source language into
intermediate code and m compilers to convert intermediate code into each target
language, for a total of only n+m compilers.
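[Figure: without an IR (left), each source language (C, Pascal, FORTRAN, C++) is compiled directly to each target (SPARC, HP PA, x86, IBM PPC); with an IR (right), every source language is translated to the IR and the IR is translated to every target.]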
Why IR…
• Retargeting: moving the compiler to another target machine is easier.
• Optimization: optimizations can be performed on the machine-independent code.
Different intermediate forms
• Different forms of intermediate code are:
1. Abstract syntax tree
2. Postfix notation
3. Three address code
Abstract syntax tree & DAG
• A syntax tree depicts the natural hierarchical structure of a source program. It
retains the essential structure of the parse tree while eliminating unneeded nodes.
• A DAG (Directed Acyclic Graph) gives the same information in a more compact
way, because common subexpressions are identified and shared.
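• Ex: a = b * -c + b * -c
[Figure: syntax tree (left) and DAG (right) for this assignment. In both, the root = has children a and +. In the syntax tree, + has two separate subtrees, each computing b * uminus c. In the DAG, the common subexpression b * uminus c appears only once and is shared.]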
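One common way to obtain a DAG is to reuse an existing node whenever a node with the same operator and the same children is requested again. The sketch below illustrates this for the example above; the names `make_node` and `leaf` and the tuple encoding of nodes are assumptions made for this illustration, not part of the course material.

```python
# A sketch of node sharing; all names here are hypothetical.
nodes = []   # nodes[i] is a tuple (op, left, right); leaves are ("id", name, None)
index = {}   # maps a node tuple to its position in `nodes`

def make_node(op, left=None, right=None):
    """Return the id of node (op, left, right), creating it only if it is new."""
    key = (op, left, right)
    if key not in index:
        index[key] = len(nodes)
        nodes.append(key)
    return index[key]

def leaf(name):
    """Leaves are shared too: asking for 'b' twice returns the same id."""
    return make_node("id", name)

# a = b * -c + b * -c
first_product = make_node("*", leaf("b"), make_node("uminus", leaf("c")))
second_product = make_node("*", leaf("b"), make_node("uminus", leaf("c")))
root = make_node("=", leaf("a"), make_node("+", first_product, second_product))

print(first_product == second_product)  # the two products are one shared node
print(len(nodes))                       # number of distinct nodes in the DAG
```

With sharing, both operands of + refer to the same * node, so the DAG needs 7 nodes where the syntax tree needs 11.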
Postfix Notation
• Postfix notation is a linearization of a syntax tree.
• In postfix notation, the operands occur first, followed by the operator.
• Ex: (A + B) * (C + D)
Postfix notation: A B + C D + *
• Ex: (A + B) * C
Postfix notation: A B + C *
• Ex: (A * B) + (C * D)
Postfix notation: A B * C D * +
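Because postfix notation is a linearization of the syntax tree, it can be produced by a post-order walk: visit the left operand, then the right operand, then emit the operator. The sketch below assumes a tree encoded as nested tuples `(op, left, right)` with plain strings as leaves; the function name `to_postfix` is hypothetical.

```python
def to_postfix(node):
    """Post-order walk: operands first, then the operator."""
    if isinstance(node, str):       # a leaf is just an operand name
        return [node]
    op, left, right = node
    return to_postfix(left) + to_postfix(right) + [op]

# The three examples from this slide, written as syntax trees.
tree1 = ("*", ("+", "A", "B"), ("+", "C", "D"))   # (A + B) * (C + D)
tree2 = ("*", ("+", "A", "B"), "C")               # (A + B) * C
tree3 = ("+", ("*", "A", "B"), ("*", "C", "D"))   # (A * B) + (C * D)

for tree in (tree1, tree2, tree3):
    print(" ".join(to_postfix(tree)))
```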
Three address code
• Three-address code is a sequence of statements of the general form
a := b op c
• where a, b and c are operands that can be names or constants, and op stands
for any operator.
• Example: a = b + c + d
t1 = b + c
t2 = t1 + d
a = t2
• Here t1 and t2 are temporary names generated by the compiler.
• Each statement contains at most three addresses (two for the operands and one for
the result). Hence, this representation is called three-address code.
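The translation in the example can be sketched as a recursive walk over the expression tree that creates a fresh temporary for every operator. This is a minimal illustration under assumed conventions: the tuple tree encoding and the names `gen_tac` and `translate_assignment` are hypothetical, and only binary operators are handled.

```python
from itertools import count

def gen_tac(node, code, temps):
    """Return the name holding node's value, appending statements to `code`."""
    if isinstance(node, str):        # a name or constant needs no code
        return node
    op, left, right = node
    left_name = gen_tac(left, code, temps)
    right_name = gen_tac(right, code, temps)
    temp = f"t{next(temps)}"         # fresh compiler-generated temporary
    code.append(f"{temp} = {left_name} {op} {right_name}")
    return temp

def translate_assignment(target, expr):
    """Translate `target = expr` into a list of three-address statements."""
    code = []
    result = gen_tac(expr, code, count(1))
    code.append(f"{target} = {result}")
    return code

# a = b + c + d, parsed left-associatively as (b + c) + d
lines = translate_assignment("a", ("+", ("+", "b", "c"), "d"))
print("\n".join(lines))
```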
Types of three address code
Statement                         Format                Comments
Assignment (binary operation)     X := Y op Z           Arithmetic and logical operators used
Assignment (unary operation)      X := op Y             Unary -, not, conversion operators used
Copy statement                    X := Y
Unconditional jump                goto L
Conditional jump                  if X relop Y goto L
Function call                     param X1              The parameters are specified by param.
                                  param X2              The procedure p is called by indicating
                                  …                     the number of parameters n.
                                  param Xn
                                  call p, n
Indexed assignments               X := Y[I]             X is assigned the value at the address Y + I
                                  Y[I] := X             The value at the address Y + I is assigned X
Address and pointer assignments   X := &Y               X is assigned the address of Y
                                  X := *Y               X is assigned the value at the address Y
                                  *X := Y               The value at the address X is assigned Y
Different Representation of Three Address Code
• There are three types of representation used for three address code:
1. Quadruples
2. Triples
3. Indirect triples
• Ex: x = -a*b + -a*b
Three-address code:
t1 = -a
t2 = t1 * b
t3 = -a
t4 = t3 * b
t5 = t2 + t4
x = t5
Quadruple
• A quadruple is a structure with at most four fields: op, arg1, arg2 and result.
• The op field holds the internal code for the operator.
• The arg1 and arg2 fields hold the two operands.
• The result field holds the result of the expression.
Quadruples for x = -a*b + -a*b
Three-address code    No.   Operator   Arg1   Arg2   Result
t1 = -a               (0)   uminus     a             t1
t2 = t1 * b           (1)   *          t1     b      t2
t3 = -a               (2)   uminus     a             t3
t4 = t3 * b           (3)   *          t3     b      t4
t5 = t2 + t4          (4)   +          t2     t4     t5
x = t5                (5)   =          t5            x
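As a rough illustration, a quadruple can be modelled as a record with the four fields named above, and the table can be generated by walking the expression tree. The sketch assumes a tuple tree encoding in which unary nodes have one child; `Quad` and `gen_quads` are hypothetical names, and an unused arg2 is stored as `None`.

```python
from collections import namedtuple

Quad = namedtuple("Quad", "op arg1 arg2 result")

def gen_quads(node, quads):
    """Append quadruples for `node` to `quads`; return the name of its value."""
    if isinstance(node, str):
        return node
    op, *operands = node
    args = [gen_quads(child, quads) for child in operands]
    arg2 = args[1] if len(args) == 2 else None   # unary operators leave arg2 empty
    result = f"t{len(quads) + 1}"                # every earlier quad made one temporary
    quads.append(Quad(op, args[0], arg2, result))
    return result

# x = -a*b + -a*b
expr = ("+", ("*", ("uminus", "a"), "b"), ("*", ("uminus", "a"), "b"))
quads = []
value = gen_quads(expr, quads)
quads.append(Quad("=", value, None, "x"))        # final copy into x

for number, quad in enumerate(quads):
    print(f"({number})", quad.op, quad.arg1, quad.arg2, quad.result)
```

The resulting list has the same six rows as the table, with `None` standing for the empty Arg2 cells.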
Triple
• To avoid entering temporary names into the symbol table, we can refer to a
temporary value by the position of the statement that computes it.
• If we do so, three-address statements can be represented by records with only
three fields: op, arg1 and arg2.
Quadruple                                      Triple
No.   Operator   Arg1   Arg2   Result          No.   Operator   Arg1   Arg2
(0)   uminus     a             t1              (0)   uminus     a
(1)   *          t1     b      t2              (1)   *          (0)    b
(2)   uminus     a             t3              (2)   uminus     a
(3)   *          t3     b      t4              (3)   *          (2)    b
(4)   +          t2     t4     t5              (4)   +          (1)    (3)
(5)   =          t5            x               (5)   =          x      (4)
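The step from quadruples to triples can be sketched as follows: remember which statement computes each temporary, then replace every use of that temporary by the statement's position. In this illustration an integer stands for a position such as (0), a string stands for a name, and `quads_to_triples` is a hypothetical helper.

```python
# The quadruples from the table, as (op, arg1, arg2, result) tuples.
quads = [
    ("uminus", "a", None, "t1"),
    ("*", "t1", "b", "t2"),
    ("uminus", "a", None, "t3"),
    ("*", "t3", "b", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "x"),
]

def quads_to_triples(quads):
    position = {}   # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        # Temporaries become positions; names and None pass through unchanged.
        return position.get(arg, arg)

    for op, arg1, arg2, result in quads:
        if op == "=":
            # Copy statement: as in the table, the target goes in arg1
            # and the copied value in arg2.
            triples.append((op, result, ref(arg1)))
        else:
            position[result] = len(triples)
            triples.append((op, ref(arg1), ref(arg2)))
    return triples

triples = quads_to_triples(quads)
for number, triple in enumerate(triples):
    print(f"({number})", *triple)
```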
Indirect Triple
• In the indirect triple representation, the triples are stored in a separate listing,
and the program is a list of pointers to those triples rather than a list of the triples themselves.
• This implementation is called indirect triples.
Triple                              Statement list      Indirect triple
No.   Operator   Arg1   Arg2        No.   Statement     No.    Operator   Arg1   Arg2
(0)   uminus     a                  (0)   (14)          (14)   uminus     a
(1)   *          (0)    b           (1)   (15)          (15)   *          (14)   b
(2)   uminus     a                  (2)   (16)          (16)   uminus     a
(3)   *          (2)    b           (3)   (17)          (17)   *          (16)   b
(4)   +          (1)    (3)         (4)   (18)          (18)   +          (15)   (17)
(5)   =          x      (4)         (5)   (19)          (19)   =          x      (18)
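A practical consequence of the extra level of indirection is that statements can be reordered by permuting the pointer list alone, without touching the triples or the positions they refer to. The sketch below assumes the triples are stored at positions 14 to 19, as the pointer values in the example suggest; the dictionary layout and the name `listing` are hypothetical.

```python
# Triple storage: position -> (op, arg1, arg2). Integers are positions of triples.
triples = {
    14: ("uminus", "a", None),
    15: ("*", 14, "b"),
    16: ("uminus", "a", None),
    17: ("*", 16, "b"),
    18: ("+", 15, 17),
    19: ("=", "x", 18),
}

# The statement list holds pointers to triples, in execution order.
statements = [14, 15, 16, 17, 18, 19]

def listing(statements, triples):
    """Follow each pointer to obtain the program in execution order."""
    return [triples[pointer] for pointer in statements]

# Move the second unary minus ahead of the first multiplication.
# Only the pointer list changes; every triple keeps its position and arguments.
reordered = statements.copy()
reordered[1], reordered[2] = reordered[2], reordered[1]

for pointer in reordered:
    print(f"({pointer})", *triples[pointer])
```

With plain triples, the same move would change statement positions and force every reference to a moved statement to be renumbered.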
Exercise
Write the quadruple, triple and indirect triple representations for the following:
1. -(a*b)+(c+d)
2. a*-(b+c)
For each of the following expressions:
I. Construct the abstract syntax tree
II. Convert it to postfix notation
III. Write the quadruple, triple and indirect triple representations
1. x=(a+b*c)^(d*e)+f*g^h
2. g+a*(b-c)+(x-y)*d