1. Discuss the following in detail about the Syntax Directed Definitions.
a. Inherited attributes.
b. Synthesized attributes.
Syntax Directed Definitions (SDD)
Syntax Directed Definitions (SDD) are formal mechanisms that
combine context-free grammar rules with attributes and semantic
rules.
Each grammar symbol can have attributes, and each production rule
can have associated semantic rules to define how attribute values are
computed.
Types of Attributes in SDD:
1. Synthesized Attributes
● A synthesized attribute is an attribute whose value at a node is
determined only from the attribute values of its children in the parse
tree.
● Information flows upward from children to parent nodes.
Example:
Production: E → E1 + T
Semantic Rule: E.val = E1.val + T.val
2. Inherited Attributes
● An inherited attribute is an attribute whose value at a node is
determined from its parent or its siblings in the parse tree.
● Information flows downward or sideways from parent or sibling
nodes.
Example:
Production: A → B C D
Semantic Rules:
B.inh = A.inh (inherited from the parent A)
C.inh = B.syn (inherited from the left sibling B)
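The two kinds of attribute can be sketched as two tree walks in C. This is only an illustration under assumptions: the `Node` layout, the function names `eval` and `set_depth`, and the choice of "depth" as the inherited attribute are not part of the notes above. A post-order walk computes the synthesized `val`, and a pre-order walk hands the inherited `depth` down from parent to child.

```c
#include <stddef.h>

/* Hypothetical parse-tree node for E -> E1 + T, or a number leaf. */
typedef struct Node {
    char op;             /* '+' for E -> E1 + T, 'n' for a number leaf */
    int val;             /* synthesized attribute */
    int depth;           /* inherited attribute (illustrative choice) */
    struct Node *left;   /* E1 */
    struct Node *right;  /* T  */
} Node;

/* Post-order walk: children are evaluated before the parent,
   so values flow upward (E.val = E1.val + T.val). */
int eval(Node *n) {
    if (n->op != 'n') {
        n->val = eval(n->left) + eval(n->right);
    }
    return n->val;
}

/* Pre-order walk: the parent passes a value down to its children,
   which is how an inherited attribute travels. */
void set_depth(Node *n, int depth) {
    if (n == NULL) return;
    n->depth = depth;
    set_depth(n->left, depth + 1);
    set_depth(n->right, depth + 1);
}
```

Calling `eval` on the root of the tree for 3 + 4 fills in `val` bottom-up, while `set_depth(root, 0)` fills in `depth` top-down.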
2. Write the syntax-directed translation scheme for declaration statement.
Construct the symbol table for processing the following PASCAL declaration
statements.
a: integer
b: real
c: array(10) of integer
d: ↑ integer
1. Syntax-Directed Translation Scheme for Declaration Statements:
Grammar:
D → id : T
T → integer | real | array(num) of integer | ↑ integer
Semantic Rules:
● For D → id : T
○ Create a new entry in the symbol table with:
■ name = id.name
■ type = T.type
■ offset = current_offset
○ Increment current_offset by T.width
● For T → integer
○ T.type = integer
○ T.width = 4
● For T → real
○ T.type = real
○ T.width = 8
● For T → array(num) of integer
○ T.type = array(num, integer)
○ T.width = num × 4
● For T → ↑ integer
○ T.type = pointer to integer
○ T.width = 4
(We assume sizes: integer = 4 bytes, real = 8 bytes, pointer = 4 bytes.)
2. Now, for the given Pascal declarations:
a: integer
b: real
c: array(10) of integer
d: ↑ integer
Let's construct the symbol table step-by-step:
● a is an integer:
○ Type = integer
○ Width = 4
○ Offset = 0
● b is a real:
○ Type = real
○ Width = 8
○ Offset = 0 + 4 = 4
● c is an array of 10 integers:
○ Type = array(10) of integer
○ Width = 10 × 4 = 40
○ Offset = 4 + 8 = 12
● d is a pointer to integer:
○ Type = pointer to integer
○ Width = 4
○ Offset = 12 + 40 = 52
Symbol Table:
Name   Type                    Offset   Width (bytes)
a      integer                 0        4
b      real                    4        8
c      array(10) of integer    12       40
d      pointer to integer      52       4
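The offset bookkeeping above can be sketched in C as follows. The names `Symbol`, `SymbolTable` and `declare` are hypothetical, and the widths passed in are the ones assumed in these notes (integer = 4, real = 8, pointer = 4); this is a minimal sketch of the action for D → id : T, not a full symbol-table implementation.

```c
#include <stdio.h>

#define MAX_SYMBOLS 16

/* One row of the symbol table. */
typedef struct {
    char name[16];
    char type[32];
    int offset;   /* byte offset from the start of the data area */
    int width;    /* size in bytes */
} Symbol;

typedef struct {
    Symbol rows[MAX_SYMBOLS];
    int count;
    int current_offset;
} SymbolTable;

/* Action for D -> id : T.
   Returns the offset assigned, or -1 if the table is full. */
int declare(SymbolTable *st, const char *name, const char *type, int width) {
    if (st->count >= MAX_SYMBOLS) return -1;
    Symbol *s = &st->rows[st->count++];
    snprintf(s->name, sizeof s->name, "%s", name);
    snprintf(s->type, sizeof s->type, "%s", type);
    s->offset = st->current_offset;   /* offset = current_offset */
    s->width = width;
    st->current_offset += width;      /* advance by the width of T */
    return s->offset;
}
```

Declaring a, b, c and d in that order on a zero-initialized table assigns the offsets 0, 4, 12 and 52, and leaves `current_offset` at 56.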
3. Explain the various storage allocation strategies in detail.
When a program executes, it needs memory to store code, data, and
intermediate results. Storage allocation is the process of assigning memory areas
to different parts of a program.
There are three main storage allocation strategies:
1. Static Storage Allocation
● Definition: Memory for all variables is allocated at compile time.
● Working:
○ Each name is bound to a fixed storage location throughout the
program's execution.
○ Memory allocation and deallocation occur only once.
● Features:
○ Supports global variables and constants.
○ Does not support dynamic data structures (like linked lists).
● Advantages:
○ Fast access (memory addresses known at compile time).
○ Simple memory management.
● Disadvantages:
○ Recursion is not supported (because local variables cannot be
dynamically created).
○ Fixed size: The size of variables must be known at compile time.
● Example: Early versions of FORTRAN relied heavily on static storage allocation.
2. Stack Storage Allocation
● Definition: Memory is organized as a stack where memory is allocated and
deallocated in last-in, first-out (LIFO) order.
● Working:
○ When a procedure/function is called, an activation record (stack
frame) is pushed onto the stack.
○ When the procedure ends, the activation record is popped.
● Features:
○ Supports local variables and recursive function calls.
○ Each call gets fresh storage for its variables.
● Advantages:
○ Supports recursion naturally.
○ Memory usage is efficient for procedures/functions.
● Disadvantages:
○ Only suitable for temporary or local data.
○ Cannot directly manage dynamic memory needs beyond procedure
lifetime.
3. Heap Storage Allocation
● Definition: Memory is allocated and deallocated dynamically at runtime,
from a region called the heap.
● Working:
○ Memory can be requested and released at any time and any place.
○ Used for variables whose lifetimes are not known until runtime (e.g.,
dynamic data structures like linked lists, trees).
● Features:
○ Supports dynamic memory allocation (malloc, free in C).
○ Memory is manually managed by the programmer (or automatically in
languages with garbage collection).
● Advantages:
○ Highly flexible memory management.
○ Suitable for programs with unpredictable or variable memory needs.
● Disadvantages:
○ Memory fragmentation may occur.
○ Slower allocation and deallocation than the stack (due to management overhead).
○ Risk of memory leaks if memory is not properly deallocated.
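The three strategies can be seen side by side in a short C sketch. The function names `next_id`, `sum_to` and `make_squares` are made up for illustration: one relies on a statically allocated counter, one on per-call stack storage, and one on heap storage that outlives the call.

```c
#include <stdlib.h>

/* Static allocation: counter has one fixed location for the whole run. */
int next_id(void) {
    static int counter = 0;
    counter++;
    return counter;
}

/* Stack allocation: every recursive call gets a fresh activation
   record holding its own copy of n. */
int sum_to(int n) {
    if (n == 0) return 0;
    return n + sum_to(n - 1);
}

/* Heap allocation: the size is chosen at run time and the block
   outlives this call; the caller must release it with free(). */
int *make_squares(int n) {
    if (n <= 0) return NULL;
    int *a = malloc((size_t)n * sizeof *a);
    if (a == NULL) return NULL;
    for (int i = 0; i < n; i++) {
        a[i] = i * i;
    }
    return a;
}
```

Two successive calls to `next_id()` return 1 and then 2 because the counter persists between calls, whereas each activation of `sum_to` starts with its own `n`.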
4. Illustrate the storage organization memory in the perspective of compiler writer
with neat diagram
Storage Organization in the Perspective of Compiler Writer
When a compiler generates code for a program, it assumes the program
runs inside a logical address space.
This memory space is divided into different regions to manage the
code, data, and runtime requirements efficiently.
The main subdivisions of runtime memory are:
1. Code Area (Text Segment)
● Contains the compiled machine code (the instructions to be executed).
● This area is usually static (fixed size and read-only).
2. Static Data Area
● Holds global variables and static variables.
● Sizes and addresses are fixed at compile time, and the storage remains
allocated throughout program execution.
3. Heap Area
● Used for dynamic memory allocation (e.g., objects, linked lists, trees).
● Grows upwards (toward higher memory addresses).
● Allocated at runtime using functions like malloc(), new, etc.
4. Stack Area
● Used to store:
○ Function call information (activation records).
○ Local variables.
○ Parameters passed to functions.
● Grows downwards (toward lower memory addresses).
● Memory is allocated/deallocated automatically when functions are
called/returned.
5. Unused/Free Space
● Lies between the heap and the stack.
● Provides space for heap expansion and stack expansion during program
execution.
Neat Diagram of Memory Organization
High Address
+---------------------------+
| Stack | <-- Grows downward
| (Activation Records, etc.)|
+---------------------------+
| Free Space | (Gap between Heap and Stack)
+---------------------------+
| Heap | <-- Grows upward
| (Dynamic Allocations) |
+---------------------------+
| Static Data Area |
|(Global & Static Variables)|
+---------------------------+
| Code Area |
| (Program Instructions) |
+---------------------------+
Low Address
5. Show the step by step translation of the expression (a + b) * (c + d) using the
syntax-directed translation rules. Identify the synthesized and inherited attributes
from the grammar and justify the reason for its kind.
Expression: (a + b) * (c + d)
Given Grammar:
E → E1 + T { E.val = E1.val + T.val }
E → T { E.val = T.val }
T → T1 * F { T.val = T1.val * F.val }
T → F { T.val = F.val }
F → ( E ) { F.val = E.val }
F → id { F.val = id.val }
Syntax-Directed Translation Rules:
● Synthesized attributes:
Attributes computed from child nodes and passed upward.
(e.g., E.val depends on E1.val and T.val)
● Inherited attributes:
Attributes passed from parent or sibling nodes downward or sideways.
(In this grammar, there are no inherited attributes, only synthesized.)
Input: (a + b) * (c + d)
The parse tree structure would look like:
           E
           |
           T
         / | \
       /   |   \
     T     *     F
     |         / | \
     F        (  E  )
   / | \       / | \
  (  E  )     E  +  T
   / | \      |     |
  E  +  T     T     F
  |     |     |     |
  T     F     F     id
  |     |     |     |
  F     id    id    d
  |     |     |
  id    b     c
  |
  a
Three Address Code (TAC) generated:
Assuming temporary variables like t1, t2, etc.,
Step   Action                               Code Generated
1      Evaluate (a + b) into a temporary    t1 = a + b
2      Evaluate (c + d) into a temporary    t2 = c + d
3      Multiply the two temporaries         t3 = t1 * t2
Thus, the final result is stored in t3.
Identification of Attributes:
Attribute Type   Where Used              Reason
Synthesized      E.val, T.val, F.val     Computed from child nodes' values (bottom-up flow).
Inherited        None in this grammar    No passing of values from parent or sibling nodes.
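The translation can also be sketched as a small C program that emits the three-address code while parsing. Several things here are assumptions and not part of the grammar as given: the attribute is called `place` (the name holding a subexpression's result), identifiers are single letters, and the left-recursive productions are handled with loops because recursive descent cannot follow left recursion directly. Every `place` is filled in from the subexpressions below it, so it is a synthesized attribute.

```c
#include <stdio.h>
#include <string.h>

#define PLACE_LEN 16

/* Hypothetical translator state: input cursor, output buffer, temp counter. */
static const char *src;
static char out[256];
static int temp_count;

static void skip_spaces(void) { while (*src == ' ') src++; }
static void parse_expr(char *place);

/* Emit "tN = left op right", then make tN the new value of left. */
static void emit(char *left, char op, const char *right) {
    char t[PLACE_LEN];
    size_t used = strlen(out);
    snprintf(t, sizeof t, "t%d", ++temp_count);
    snprintf(out + used, sizeof out - used, "%s = %s %c %s\n", t, left, op, right);
    strcpy(left, t);
}

/* F -> ( E ) | id : F.place comes from E.place or from the identifier. */
static void parse_factor(char *place) {
    skip_spaces();
    place[0] = '\0';
    if (*src == '(') {
        src++;
        parse_expr(place);
        skip_spaces();
        if (*src == ')') src++;
    } else if (*src != '\0') {
        place[0] = *src++;   /* single-letter identifiers only */
        place[1] = '\0';
    }
}

/* T -> T * F, written as a loop. */
static void parse_term(char *place) {
    char right[PLACE_LEN];
    parse_factor(place);
    skip_spaces();
    while (*src == '*') {
        src++;
        parse_factor(right);
        emit(place, '*', right);
        skip_spaces();
    }
}

/* E -> E + T, written as a loop. */
static void parse_expr(char *place) {
    char right[PLACE_LEN];
    parse_term(place);
    skip_spaces();
    while (*src == '+') {
        src++;
        parse_term(right);
        emit(place, '+', right);
        skip_spaces();
    }
}

/* Translate one expression and return the generated code. */
const char *translate(const char *text) {
    char place[PLACE_LEN];
    src = text;
    out[0] = '\0';
    temp_count = 0;
    parse_expr(place);
    return out;
}
```

Calling `translate("(a + b) * (c + d)")` yields the three lines t1 = a + b, t2 = c + d and t3 = t1 * t2, one per line.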
6. Show how would you generate the intermediate code for the flow of control
statement? Explain with example.
In a programming language, flow of control statements like if-else,
while, and do-while control the execution sequence of code.
When generating intermediate code, we use conditional and unconditional jumps
(goto) to represent flow control.
To do this, we use:
● Labels: Points in code to jump to.
● Conditional jumps: if condition goto label
● Unconditional jumps: goto label
Basic Flow Control Statements and Their Intermediate Code
1. If-Then Statement
Grammar:
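S → if E then S1
Intermediate Code:
● Evaluate condition E.
● If E is false, jump to the label placed immediately after S1.
● Otherwise, fall through and execute S1.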
2. If-Then-Else Statement
Grammar:
S → if E then S1 else S2
Intermediate Code:
● Evaluate condition E.
● If E is true, execute S1, then jump past S2 to the end of the statement.
● If E is false, jump to S2.
3. While-Do Statement
Grammar:
S → while E do S1
Intermediate Code:
● Mark the start of the loop with a label.
● Evaluate condition E.
● If E is true, execute S1 and jump back to the start of the loop.
● If E is false, exit the loop.
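As an example, the label and jump layout for the if-then-else and while-do statements can be produced by a small C sketch. The function names `gen_if_else` and `gen_while`, the `ifFalse ... goto` spelling of the conditional jump, and the idea of passing the condition and bodies as ready-made strings are all assumptions made for illustration; a real generator would emit these pieces during parsing.

```c
#include <stdio.h>

static int label_count = 0;

/* Hand out fresh label numbers: L1, L2, ... */
static int new_label(void) { return ++label_count; }

/* S -> if E then S1 else S2 */
int gen_if_else(char *buf, size_t size, const char *cond,
                const char *s1, const char *s2) {
    int l_else = new_label();
    int l_end = new_label();
    return snprintf(buf, size,
        "ifFalse %s goto L%d\n"   /* E false: skip S1 */
        "%s\n"                    /* S1 */
        "goto L%d\n"              /* jump past S2 */
        "L%d:\n"
        "%s\n"                    /* S2 */
        "L%d:\n",
        cond, l_else, s1, l_end, l_else, s2, l_end);
}

/* S -> while E do S1 */
int gen_while(char *buf, size_t size, const char *cond, const char *body) {
    int l_begin = new_label();
    int l_end = new_label();
    return snprintf(buf, size,
        "L%d:\n"                  /* start of loop */
        "ifFalse %s goto L%d\n"   /* E false: exit loop */
        "%s\n"                    /* S1 */
        "goto L%d\n"              /* loop back */
        "L%d:\n",
        l_begin, cond, l_end, body, l_begin, l_end);
}
```

For instance, `gen_while(buf, sizeof buf, "i <= 10", "i = i + 1")` fills `buf` with a begin label, a conditional exit, the body, a jump back to the begin label, and the exit label.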
7. Represent x = (a + b) * (c - d) using 1) Three-address code 2) Quadruple 3)
Triple 4) Indirect Triple.
Given Expression:
x = (a + b) * (c - d)
1. Three-Address Code (TAC)
Three-address code breaks complex expressions into smaller steps using
temporaries:
t1 = a + b
t2 = c - d
t3 = t1 * t2
x = t3
Each statement has at most one operator on the right-hand side and at most three
addresses: two operands and one result.
2. Quadruple Representation
In quadruples, each operation is a record with four fields:
(operator, arg1, arg2, result)
Index   Operator   Arg1   Arg2   Result
(0)     +          a      b      t1
(1)     -          c      d      t2
(2)     *          t1     t2     t3
(3)     =          t3     -      x
3. Triple Representation
In triples, instead of using temporary variable names, we refer to instruction
numbers.
Index   Operator   Arg1   Arg2
(0)     +          a      b
(1)     -          c      d
(2)     *          (0)    (1)
(3)     =          (2)    x
Here (0), (1), etc. refer to earlier results (instruction numbers).
4. Indirect Triple Representation
In indirect triples, we maintain a pointer table that points to triples instead of
storing them directly.
Pointer Table:
Pointer   Points to Instruction
0         (0)
1         (1)
2         (2)
3         (3)
Triple Instructions (same as above):
Index   Operator   Arg1   Arg2
(0)     +          a      b
(1)     -          c      d
(2)     *          (0)    (1)
(3)     =          (2)    x
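One possible way to hold these representations in memory is sketched below in C. The struct layouts and the names `Quad`, `Operand`, `Triple`, `order` and `quad_to_tac` are assumptions for illustration only. The sketch shows the key difference: a quadruple names its result, a triple refers to earlier results by index, and indirect triples add a separate ordering list.

```c
#include <stdio.h>

/* Quadruple: the result is named explicitly. */
typedef struct {
    char op;
    const char *arg1;
    const char *arg2;    /* NULL when the operator takes one operand */
    const char *result;
} Quad;

static const Quad quads[] = {
    {'+', "a",  "b",  "t1"},
    {'-', "c",  "d",  "t2"},
    {'*', "t1", "t2", "t3"},
    {'=', "t3", NULL, "x"},
};

/* Triple operand: either a name or the index of an earlier triple. */
typedef struct {
    int is_ref;          /* 1 if index is meaningful, 0 if name is */
    const char *name;
    int index;
} Operand;

typedef struct {
    char op;
    Operand arg1;
    Operand arg2;
} Triple;

static const Triple triples[] = {
    {'+', {0, "a", 0},  {0, "b", 0}},
    {'-', {0, "c", 0},  {0, "d", 0}},
    {'*', {1, NULL, 0}, {1, NULL, 1}},
    {'=', {1, NULL, 2}, {0, "x", 0}},
};

/* Indirect triples: a separate list of indices gives the execution
   order, so statements can be reordered without renumbering. */
static const int order[] = {0, 1, 2, 3};

/* Render one quadruple back into three-address code text. */
int quad_to_tac(const Quad *q, char *buf, size_t size) {
    if (q->op == '=') {
        return snprintf(buf, size, "%s = %s", q->result, q->arg1);
    }
    return snprintf(buf, size, "%s = %s %c %s",
                    q->result, q->arg1, q->op, q->arg2);
}
```

Rendering `quads[2]` with `quad_to_tac` gives back the text t3 = t1 * t2, while `triples[order[2]]` carries the same operation with its operands given as references to triples (0) and (1).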
8. Write the intermediate code of the following C program into
a. Syntax tree b. Postfix notation c. Three-address code representations
void main()
{
int i;
for (i=1; i<=10; i++)
{ printf("%d ", i); }
}
Given Program:
void main() {
    int i;
    for (i = 1; i <= 10; i++) {
        printf("%d ", i);
    }
}
a) Syntax Tree
                 FOR
       /      /       \       \
    INIT    COND      INC     BODY
    i=1     i<=10     i=i+1   printf("%d ", i)
b) Postfix Notation
i 1 =
L1: i 10 <= if_false goto L2
"%d " i printf
i 1 + =
goto L1
L2:
c) Three-Address Code
i = 1
L1:
if i > 10 goto L2
param "%d "
param i
call printf, 2
t1 = i + 1
i = t1
goto L1
L2:
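To see that the three-address code behaves like the original loop, it can be mirrored in C using only labels and goto. This is a sketch: the function name `run_loop` and the choice to write into a caller-supplied buffer instead of calling printf directly are assumptions made so the result can be inspected.

```c
#include <stdio.h>

/* Writes "1 2 ... 10 " into buf using only labels and gotos,
   mirroring the three-address code step by step.
   Returns the final value of i. */
int run_loop(char *buf, size_t size) {
    size_t used = 0;
    int i, t1;

    if (size > 0) buf[0] = '\0';

    i = 1;                            /* i = 1            */
L1:
    if (i > 10) goto L2;              /* if i > 10 goto L2 */
    if (used < size) {                /* stands in for the printf call */
        used += (size_t)snprintf(buf + used, size - used, "%d ", i);
    }
    t1 = i + 1;                       /* t1 = i + 1       */
    i = t1;                           /* i = t1           */
    goto L1;                          /* goto L1          */
L2:
    return i;
}
```

With a buffer of 64 bytes the function leaves the numbers 1 through 10, each followed by a space, in the buffer and returns 11, the value of i that made the exit test succeed.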
9. State the purpose of activation tree during compilation. Develop an activation
tree for the C code to compute factorial of five.
Purpose of Activation Tree During Compilation
An activation tree represents the calling and returning sequence of
procedures/functions during the execution of a program.
Main purposes of Activation Tree:
● Shows function call relationships clearly (which function calls which).
● Represents nested activations:
○ If a function A calls B, and B calls C, it shows a tree structure.
● Helps in managing Control Stack:
○ Activation records (stack frames) are pushed and popped based on
this tree.
● Aids in understanding recursion:
○ Recursive calls appear as multiple activations of the same function.
int factorial(int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);
}
void main() {
    int result;
    result = factorial(5);
}
Activation Tree for factorial(5)
main
|
factorial(5)
|
factorial(4)
|
factorial(3)
|
factorial(2)
|
factorial(1)
|
factorial(0)
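The same chain of activations can be observed at run time with a lightly instrumented version of the function. The extra `depth` parameter, the `max_depth` counter and the name `factorial_traced` are additions for illustration; they are not part of the original program.

```c
#include <stdio.h>

static int max_depth = 0;   /* deepest nesting level seen so far */

/* Same factorial as above, with an extra depth parameter so that
   each activation prints itself indented under its caller. */
int factorial_traced(int n, int depth) {
    if (depth > max_depth) max_depth = depth;
    printf("%*sfactorial(%d)\n", depth * 2, "", n);
    if (n == 0) return 1;
    return n * factorial_traced(n - 1, depth + 1);
}
```

Calling `factorial_traced(5, 1)` (with main taken as depth 0) prints one line per activation, from factorial(5) down to factorial(0), each indented two spaces further than its caller, and returns 120.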
10. Discuss briefly the issues in the design of code generator.
Issues in the Design of a Code Generator
The design of a code generator involves translating intermediate representations
(IR) of a program into machine code or assembly. Some common issues faced in
code generator design include:
1. Efficiency: The generated code should be optimized for speed and size,
meaning that the generator should minimize the number of instructions and
their execution time.
2. Correctness: The code generator must produce correct and reliable machine
code that faithfully represents the source program.
3. Target Architecture: It must take into account the constraints and features
of the target machine, such as register availability, instruction set
architecture (ISA), and addressing modes.
4. Register Allocation: Efficient usage of the available registers is crucial for
minimizing memory access and enhancing performance.
5. Input to the Code Generator: The code generator normally assumes that earlier
phases have already detected syntax and semantic errors (such as type
mismatches), so the IR it receives is taken to be error-free.
6. Instruction Selection: Choosing the right instruction to implement a
particular operation can be challenging, as different machines might have
different instructions for the same task.
7. Optimizations: Identifying opportunities for instruction-level and loop-level
optimizations to reduce execution time.
11. Identify the basic block in the given code. How will you apply various
transformations to optimize the basic block?
int x = 7;
int y = 14;
int z = x + y;
int p = z * 2;
int q = p / 2;
return q;
Identifying Basic Blocks
In the given code:
int x = 7;
int y = 14;
int z = x + y;
int p = z * 2;
int q = p / 2;
return q;
A basic block is a maximal sequence of consecutive statements in which control
enters only at the first statement and leaves only at the last, with no jumps or
jump targets in between.
The first statement (int x = 7;) is the only leader, and no statement is a
branch or the target of a branch. The whole code therefore forms a single basic
block:
Block 1:
int x = 7;
int y = 14;
int z = x + y;
int p = z * 2;
int q = p / 2;
return q;
Applying Transformations to Optimize the Basic Block
The following transformations can be applied in sequence:
1. Constant Propagation and Constant Folding:
○ x and y hold the constants 7 and 14, so z = x + y becomes z = 7 + 14,
which the compiler folds to z = 21 at compile time.
2. Algebraic Simplification:
○ Since p = z * 2, the expression p / 2 is (z * 2) / 2, which simplifies
to just z (the product 42 cannot overflow here), so q = z.
○ This eliminates the need for the multiplication and division, reducing
the number of operations.
3. Strength Reduction:
○ The expression z * 2 could be replaced by the cheaper z + z, since
multiplying by 2 is the same as adding z to itself. However, after
algebraic simplification the multiplication disappears altogether, so
this transformation is no longer needed here.
4. Copy Propagation:
○ With q = z, the statement return q becomes return z, and with z = 21
it becomes return 21.
5. Dead Code Elimination:
○ After the steps above, the values of x, y, z, p and q are never used,
so their assignments are dead code and can be completely removed.
Final Optimized Code:
After applying the transformations, the block is reduced to
return 21;
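A quick way to check that optimizing a basic block preserves its meaning is to wrap the block before and after in two functions and compare their results. The function names `original` and `optimized` are made up for this sketch, and the optimized body assumes every constant has been folded, which gives 7 + 14 = 21.

```c
/* The block exactly as given. */
int original(void) {
    int x = 7;
    int y = 14;
    int z = x + y;
    int p = z * 2;
    int q = p / 2;
    return q;
}

/* The block after folding the constants, simplifying (z * 2) / 2
   to z, and removing the assignments that are no longer used. */
int optimized(void) {
    return 21;
}
```

Both functions return the same value, which is the property every transformation on a basic block has to maintain.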