VLSI CAD Flow: Logic Synthesis Insights

The document discusses the VLSI CAD flow from RTL design through logic synthesis, placement and routing. It covers topics like two-level and multi-level logic minimization, technology mapping using standard cell libraries, and dynamic programming approaches for finding optimal coverings to map logic networks to library gates. The goal is to optimize logic networks for area, delay, and power before generating the final gate-level netlist.


VLSI CAD Flow: Logic Synthesis, Placement and Routing

6.375 Lecture 5

Guest Lecture by Srini Devadas

1
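RTL Design Flow
[Figure: RTL design flow. HDL → RTL synthesis or manual design → netlist → logic optimization → netlist → physical design → layout, with library/module generators as an additional input. The example netlist is a 2:1 multiplexer (inputs a, b, select s) feeding a flip-flop (d, q, clk).]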
2
Two-Level Logic Minimization

Can realize an arbitrary logic function in sum-of-products or two-level form
F1 = A B + A B D + A B C D
+ABCD+AB+ABD

F1 = B + D + A C + A C

It is of great interest to find a minimum sum-of-products representation
– A solved problem in practice even for functions with hundreds of inputs (variants of Quine-McCluskey)
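As a minimal sketch of the first phase of Quine-McCluskey (prime implicant generation only; the covering step that selects a minimum subset of primes is omitted), the Python below repeatedly merges implicants that differ in exactly one bit. The function names and the string encoding ('0', '1', and '-' for a dropped variable) are illustrative assumptions, not taken from the lecture.

```python
from itertools import combinations

def combine(a, b):
    """Merge two implicants that differ in exactly one specified bit, else None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms, n):
    """Return the prime implicants of an n-input function given its minterms."""
    current = {format(m, f"0{n}b") for m in minterms}
    primes = set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(sorted(current), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= current - used   # implicants that could not be merged are prime
        current = merged
    return primes

# Three inputs (A, B, C), minterms 0, 1, 2, 3, 7
primes = prime_implicants([0, 1, 2, 3, 7], 3)
print(sorted(primes))  # ['-11', '0--']
```

Here '0--' reads as "A complemented" and '-11' as "B and C", so the two primes cover all five minterms.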
3
Two-Level versus Multilevel
2-Level:
f1 = AB + AC + AD
f2 = AB + AC + AE
6 product terms which cannot be shared.
24 transistors in static CMOS

Multi-level:
Note that B + C is a common term in f1 and f2
K = B + C
f1 = AK + AD
f2 = AK + AE
3 levels; 20 transistors in static CMOS, not counting inverters
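The comparison can be checked with a short Python sketch. It verifies by exhaustive truth table that the two-level and multi-level forms compute the same functions, and it reproduces the transistor counts under the assumption (not stated on the slide, but consistent with its numbers) that static CMOS costs two transistors per literal.

```python
from itertools import product

def two_level(a, b, c, d, e):
    f1 = (a and b) or (a and c) or (a and d)
    f2 = (a and b) or (a and c) or (a and e)
    return f1, f2

def multi_level(a, b, c, d, e):
    k = b or c                      # shared subexpression K = B + C
    f1 = (a and k) or (a and d)
    f2 = (a and k) or (a and e)
    return f1, f2

# Exhaustive equivalence check over all 2^5 input combinations
equivalent = all(two_level(*v) == multi_level(*v)
                 for v in product([False, True], repeat=5))

def literal_count(expressions):
    """Count literal occurrences (letters) in a list of expression strings."""
    return sum(ch.isalpha() for expr in expressions for ch in expr)

# Assumption: 2 transistors per literal in static CMOS, inverters not counted
two_level_transistors = 2 * literal_count(["AB+AC+AD", "AB+AC+AE"])
multi_level_transistors = 2 * literal_count(["B+C", "AK+AD", "AK+AE"])

print(equivalent, two_level_transistors, multi_level_transistors)  # True 24 20
```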

4
Technologies
“Closed book”: gate-array, standard-cell
“Open book”: CMOS Domino, complex gate static CMOS

[Flow diagram]
LOGIC EQUATIONS
→ TECHNOLOGY-INDEPENDENT OPTIMIZATION (Factoring, Commonality Extraction)
→ TECH-DEPENDENT OPTIMIZATION (MAPPING, TIMING), using the LIBRARY
→ OPTIMIZED LOGIC NETWORK
5
Tech.-Independent Optimization

Involves:
Minimizing two-level logic functions.
Finding common subexpressions.
Substituting one expression into another.
Factoring single functions.
Factored versus Disjunctive forms
f = ac + ad + bc + bd + a e
sum-of-products or disjunctive form

f = ( a + b )( c + d ) + a e
factored form
multi-level or complex gate

6
Optimizations

F:
f1 = AB + AC + AD + AE + A BC D E
f2 = AB + AC + AD + AF + A BC D F

Factor F:
f1 = A( B + C + D + E) + ABC DE
f2 = A( B + C + D + F) + ABC DF

Extract common expression, giving G:
g1 = B + C + D
f1 = A ( g1 + E ) + A E g1
f2 = A ( g1 + F ) + A F g1

7
What Does “Best” Mean?

Transistor count → AREA
Number of circuits → POWER
Number of levels → DELAY (speed)

Need quick estimators of area, delay and power which are also accurate

8
Algebraic vs. Boolean Methods
Algebraic techniques view equations as polynomials and attempt to factor equations or “divide” them
Do not exploit Boolean identities, e.g., $a\bar{a} = 0$

In algebraic substitution (or division), if a function f = f(a, b, c) is divided by g = g(a, b), a and b will not appear in f / g

Algebraic division: O(n log n) time
Boolean division: 2-level minimization required

9
Comparison
$f = a\bar{b} + a\bar{c} + b\bar{a} + b\bar{c} + c\bar{a} + c\bar{b}$
Algebraic factorization produces
$f = a(\bar{b} + \bar{c}) + \bar{a}(b + c) + b\bar{c} + c\bar{b}$
Boolean factorization produces
$f = (a + b + c)(\bar{a} + \bar{b} + \bar{c})$
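The three forms of f can be compared with a small truth-table check. This sketch assumes the complemented literals are placed as in the classic version of this example (f = ab' + ac' + ba' + bc' + ca' + cb'), since the overbars are not visible in the extracted text; the function names are illustrative. The literal counts show why the Boolean factorization is preferred.

```python
from itertools import product

def sum_of_products(a, b, c):
    return ((a and not b) or (a and not c) or (b and not a)
            or (b and not c) or (c and not a) or (c and not b))

def algebraic_factored(a, b, c):
    return ((a and (not b or not c)) or (not a and (b or c))
            or (b and not c) or (c and not b))

def boolean_factored(a, b, c):
    return (a or b or c) and (not a or not b or not c)

inputs = list(product([False, True], repeat=3))
all_equal = all(sum_of_products(*v) == algebraic_factored(*v) == boolean_factored(*v)
                for v in inputs)

# Literal counts of the three forms (a prime marks a complemented literal)
forms = {
    "sum of products": "ab' + ac' + ba' + bc' + ca' + cb'",
    "algebraic": "a(b' + c') + a'(b + c) + bc' + cb'",
    "boolean": "(a + b + c)(a' + b' + c')",
}
literal_counts = {name: sum(ch.isalpha() for ch in expr) for name, expr in forms.items()}

print(all_equal, literal_counts)
```

Under these assumptions the counts are 12, 10 and 6 literals respectively.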

l = ( b f + b f ) ( a + e ) + ae ( b f + bf )
r = ( b f + b f ) ( a + e ) + ae ( b f + bf )
Algebraic substitution of l into r fails
Boolean substitution
r = a ( e l + el ) + a ( el + e l )
l = a ( er + e r ) + a ( er + e r )
10
Strong (or Boolean) Division
Given a function f to be strong divided by g
Add an extra input to f corresponding to g, namely G, and obtain function h as follows

$h_{DC} = G\bar{g} + \bar{G}g$
$h_{ON} = f_{ON} - h_{DC}$

Minimize h using a two-level minimizer

Strong Division Example

$f = abc + a\bar{b}\bar{c} + \bar{a}b\bar{c} + \bar{a}\bar{b}c$
$g = a\bar{b} + \bar{a}b$
$h_{DC} = G(ab + \bar{a}\bar{b}) + \bar{G}(a\bar{b} + \bar{a}b)$
$h_{ON} = f_{ON} - h_{DC}$

Function h (Karnaugh map; x = don't care, blank = 0):

bc \ Ga | 00 | 01 | 11 | 10
00      |    | x  | 1  | x
01      | 1  | x  |    | x
11      | x  | 1  | x  |
10      | x  |    | x  | 1

Minimization gives $h = G\bar{c} + \bar{G}c$
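The example can be checked exhaustively. The sketch below assumes the complement placement implied by the Karnaugh map entries (f is the odd-parity function of a, b, c and g is a XOR b); with that reading, the minimized h, which is G XOR c, must agree with f on every care point, that is, whenever the new input G actually equals g(a, b).

```python
from itertools import product

def f(a, b, c):
    # On-set read from the Karnaugh map: abc, ab'c', a'bc', a'b'c
    return ((a and b and c) or (a and not b and not c)
            or (not a and b and not c) or (not a and not b and c))

def g(a, b):
    return (a and not b) or (not a and b)

def h(G, c):
    # Minimized result: h = G c' + G' c
    return (G and not c) or (not G and c)

points = list(product([False, True], repeat=4))           # (G, a, b, c)
dont_care = [p for p in points if p[0] != g(p[1], p[2])]  # h_DC: G disagrees with g
on_set = [p for p in points if p[0] == g(p[1], p[2]) and f(p[1], p[2], p[3])]

# h must reproduce f on every care point
consistent = all(h(G, c) == f(a, b, c)
                 for (G, a, b, c) in points if G == g(a, b))

print(len(dont_care), len(on_set), consistent)  # 8 4 True
```

The 8 don't-care points and 4 on-set points correspond to the x and 1 entries of the map.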
12
Weak (or Algebraic) Division

Definition: the support of f is sup( f ) = { set of all variables v that occur in f as v or $\bar{v}$ }
Example: f = AB + C
sup( f ) = { A, B, C }

Definition: we say that f is orthogonal to g, f ⊥ g, if sup( f ) ∩ sup( g ) = ∅

Example: f = A + B, g = C + D
∴ f ⊥ g since { A, B } ∩ { C, D } = ∅
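A minimal sketch of these two definitions, assuming a sum-of-products is stored as a list of cubes and each cube as a list of literal strings, with a trailing prime (') marking a complemented literal. The representation and function names are illustrative choices.

```python
def support(sop):
    """Set of variables appearing in either polarity in a sum-of-products."""
    return {literal.rstrip("'") for cube in sop for literal in cube}

def orthogonal(f, g):
    """f and g are orthogonal when their supports are disjoint."""
    return support(f).isdisjoint(support(g))

f = [["A", "B"], ["C"]]            # f = AB + C
print(sorted(support(f)))          # ['A', 'B', 'C']

f2 = [["A"], ["B"]]                # A + B
g2 = [["C"], ["D"]]                # C + D
print(orthogonal(f2, g2))          # True
```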
13
Weak Division - 2

We say that g divides f weakly if there exist h, r such that f = gh + r where h ≠ ∅ and g ⊥ h
Example: f = ab + ac + d
g = b + c
f = a(b + c) + d, so h = a and r = d

We say that g divides f evenly if r = ∅

The quotient f / g is the largest h such that f = gh + r, i.e., f = ( f / g )g + r

14
Weak Division Example
f = abc + abde + abh + bcd
g = c + de + h
Theorem: f / g = f / c ∩ f / de ∩ f / h
f / c = ab + bd
f / de = ab
f / h = ab

f / g = (ab + bd) ∩ ab ∩ ab = ab
f = ab(c + de + h) + bcd

Time complexity: O( | f | | g | )
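The theorem translates almost directly into code. The sketch below is one straightforward way to implement it, assuming single-letter variable names and uncomplemented literals so that a cube can be stored as a frozenset of characters; the helper names are illustrative.

```python
def sop(text):
    """Parse 'abc + de' into a set of cubes, each cube a frozenset of letters."""
    return {frozenset(term) for term in text.replace(" ", "").split("+")}

def divide_by_cube(f, cube):
    """f / cube: keep the cubes of f that contain `cube`, with `cube` removed."""
    return {term - cube for term in f if cube <= term}

def weak_divide(f, g):
    """Return (quotient, remainder) with f = quotient * g + remainder."""
    quotient = None
    for cube in g:                        # intersect f / c over all cubes c of g
        part = divide_by_cube(f, cube)
        quotient = part if quotient is None else quotient & part
    quotient = quotient or set()
    covered = {q | c for q in quotient for c in g}   # cubes of quotient * g
    return quotient, f - covered

f = sop("abc + abde + abh + bcd")
g = sop("c + de + h")
quotient, remainder = weak_divide(f, g)
print(quotient == sop("ab"), remainder == sop("bcd"))  # True True
```

Each cube of f is examined once per cube of g, which matches the O( | f | | g | ) bound when cube operations are treated as constant time.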

15
How to Find Good Divisors?

$64K question

Strong division: use existing nodes in the multilevel network to simplify other nodes

Weak division: generate good algebraic divisors using algorithms based on “kernels” of an algebraic expression

16
Tech.-Dependent Optimization

[Flow diagram: OPTIMIZED LOGIC EQUATIONS → TECHNOLOGY MAPPING (inputs: LIBRARY, TIMING CONSTRAINTS) → GATE NETLIST]

Area, delay and power dissipation cost functions

17
“Closed Book” Technologies

A standard cell technology or library is typically restricted to a few tens of gates, e.g., MSU library: 31 cells
Gates may be NAND, NOR, NOT, AOIs.

[Figure: example library gate symbols with inputs A, B, C; one gate is labeled AB+C]

18
Mapping via DAG Covering

Represent network in canonical form ⇒ subject DAG
Represent each library gate with canonical forms for the logic function ⇒ primitive DAGs
Each primitive DAG has a cost

Goal: find a minimum cost covering of the subject DAG by the primitive DAGs
Canonical form: 2-input NAND gates and inverters

19
Sample Library

INVERTER 2

NAND2 3

NAND3 4

NAND4 5

20
Sample Library - 2

AOI21 4

AOI22 5

21
Trivial Covering

subject DAG

7 NAND2 = 21
5 INV = 10
Total = 31

Covering #1

2 INV = 4
2 NAND2 = 6
1 NAND3 = 4
1 NAND4 = 5
Total = 19

Covering #2

1 INV = 2
1 NAND2 = 3
2 NAND3 = 8
1 AOI21 = 4
Total = 17
24
DAG Covering
Sound algorithmic approach
NP-hard optimization problem (multiple fanout)

Tree covering heuristic: if subject and primitive DAGs are trees, an efficient algorithm can find the optimum cover in linear time
⇒ dynamic programming formulation

25
Partitioning a Graph

26
Resulting Trees

Break at multiple fanout points

27
Dynamic Programming

Principle of optimality: the optimal cover for a tree consists of a match at the root of the tree plus the optimal cover for the sub-trees starting at each input of the match

[Figure: two candidate matches at the root of a tree with internal node p and inputs x, y, z. The best cover for the first match uses the best covers for x, y, z; the best cover for the second match uses the best covers for p, z.]
28
Optimum Tree Covering

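[Figure: subject tree annotated with the best cost found at each node, from the leaves up]
NAND2: 3
NAND2: 3
INV: 2
NAND2: 3 + 3 = 6
NAND2: 2 + 6 + 3 = 11
At the root, INV: 11 + 2 = 13 versus AOI21: 4 + 3 = 7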
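The dynamic programming recurrence can be sketched in Python as follows. The gate costs come from the sample library; the NAND2/INV decompositions used as patterns for NAND3 and AOI21 are assumptions (the library figures are not available in the text), NAND4 and AOI22 are left out for brevity, and the subject tree is one reading of the figure that reproduces its numbers. Trees are nested tuples, with strings as primary inputs.

```python
from functools import lru_cache

# name -> (cost, pattern); None in a pattern marks a gate input
LIBRARY = {
    "INV":   (2, ("INV", None)),
    "NAND2": (3, ("NAND", None, None)),
    "NAND3": (4, ("NAND", ("INV", ("NAND", None, None)), None)),       # assumed shape
    "AOI21": (4, ("INV", ("NAND", ("NAND", None, None), ("INV", None)))),  # assumed shape
}

def matches(pattern, node):
    """Yield every binding of pattern inputs to subtrees of `node`."""
    if pattern is None:
        yield (node,)
        return
    if isinstance(node, str) or pattern[0] != node[0]:
        return
    if pattern[0] == "INV":
        yield from matches(pattern[1], node[1])
        return
    # NAND is commutative, so try both input orders
    for first, second in ((node[1], node[2]), (node[2], node[1])):
        for left in matches(pattern[1], first):
            for right in matches(pattern[2], second):
                yield left + right

@lru_cache(maxsize=None)
def best_cover(node):
    """Return (minimum cost, gates used) for the tree rooted at `node`."""
    if isinstance(node, str):
        return 0, ()
    best = None
    for name, (cost, pattern) in LIBRARY.items():
        for inputs in matches(pattern, node):
            total, gates = cost, (name,)
            for sub in inputs:          # add the optimal covers of the match inputs
                sub_cost, sub_gates = best_cover(sub)
                total += sub_cost
                gates += sub_gates
            if best is None or total < best[0]:
                best = (total, gates)
    return best

subject = ("INV", ("NAND", ("INV", "a"), ("NAND", ("NAND", "b", "c"), "d")))
cost, gates = best_cover(subject)
print(cost, gates)  # 7 ('AOI21', 'NAND2')
```

Covering the same tree with only INV and NAND2 costs 13, so the AOI21 match at the root wins, as on the slide.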

29
RTL Design Flow
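[Figure: RTL design flow. HDL → RTL synthesis or manual design → netlist → logic optimization → netlist → physical design → layout, with library/module generators as an additional input. The example netlist is a 2:1 multiplexer (inputs a, b, select s) feeding a flip-flop (d, q, clk).]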
Physical Design: Overall Conceptual Flow
[Flow diagram. Phases: Input, Floorplanning, Placement, Routing, Output. Steps: Read Netlist; Floorplanning; Initial Placement; Placement Improvement; Cost Estimation; Routing Region Definition; Global Routing; Routing Region Ordering; Detailed Routing; Routing Improvement; Cost Estimation; Compaction/clean-up; Write Layout Database]
Results of Placement

A bad placement A good placement

What’s good about a good placement?


What’s bad about a bad placement?
A. Kahng 3
Kurt Keutzer
Results of Placement

Bad placement causes routing congestion resulting in:
• Increases in circuit area (cost) and wiring
• Longer wires → more capacitance
  – Longer delay
  – Higher dynamic power dissipation

Good placement:
• Circuit area (cost) and wiring decreases
• Shorter wires → less capacitance
  – Shorter delay
  – Less dynamic power dissipation
Gordian Placement Flow

[Figure: Data flow in the placement procedure GORDIAN. Global optimization (minimization of wire length) produces module coordinates; partitioning of the module set and dissection of the placement region produces position constraints. The two steps alternate until the regions contain ≤ k modules; final placement (adoption of style-dependent constraints) then produces the final placement for standard cell, macro-cell and SOG designs.]

Complexity
space: $O(m)$, time: $\Theta(m^{1.5} \log^2 m)$
Gordian: A Quadratic Placement Approach

• Global optimization:
solves a sequence of quadratic
programming problems
• Partitioning:
enforces the non-overlap constraints
Intuitive formulation

Given a series of points x1, x2, x3, … xn and a connectivity matrix C describing the connections between them
(if cij = 1 there is a connection between xi and xj)
Find a location for each xj that minimizes the total sum of all spring tensions between each pair <xi, xj>

xi xj

Problem has an obvious (trivial) solution – what is it?


Improving the intuitive formulation

To avoid the trivial solution add constraints: Hx = b
• These may be very natural, e.g. endpoints (pads) x1, xn

To integrate the notion of “critical nets”
• Add weights wij to nets
• Some springs have more tension and should pull their associated vertices closer
Modeling the Net’s Wire Length

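[Figure: module u at $(x_u, y_u)$ with pin vu at relative offset $(\xi_{uv}, \eta_{uv})$, connected to the node of net v at $(x_v, y_v)$; the net also connects to other modules.]

The length $L_v$ of a net v is measured by the squared distances from its points to the net’s center:

$$L_v = \sum_{u \in M_v} \left[ (x_{uv} - x_v)^2 + (y_{uv} - y_v)^2 \right]$$

where $x_{uv} = x_u + \xi_{uv}$ and $y_{uv} = y_u + \eta_{uv}$.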
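A small sketch of this net-length measure. It assumes the net's center is the mean of its pin coordinates, which is the point that minimizes the sum of squared distances; the slide does not say how the center is obtained, and the function and argument names are illustrative.

```python
def net_length(module_positions, pin_offsets):
    """Sum of squared distances from each pin of a net to the net's center.

    module_positions: list of (x_u, y_u) for the modules on the net
    pin_offsets:      list of (xi_uv, eta_uv), pin position relative to its module
    """
    pins = [(xu + xi, yu + eta)
            for (xu, yu), (xi, eta) in zip(module_positions, pin_offsets)]
    # Assumption: the net center is the centroid of the pins
    xv = sum(x for x, _ in pins) / len(pins)
    yv = sum(y for _, y in pins) / len(pins)
    return sum((x - xv) ** 2 + (y - yv) ** 2 for x, y in pins)

# Two modules 4 units apart, pins at the module origins
print(net_length([(0, 0), (4, 0)], [(0, 0), (0, 0)]))   # 8.0
# Same modules, pins offset toward each other
print(net_length([(0, 0), (4, 0)], [(1, 0), (-1, 0)]))  # 2.0
```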
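Toy Example:
[Figure: two movable cells x1 and x2 on a line between fixed points at x = 100 and x = 200]

$$\text{Cost} = (x_1 - 100)^2 + (x_1 - x_2)^2 + (x_2 - 200)^2$$

$$\frac{\partial\,\text{Cost}}{\partial x_1} = 2(x_1 - 100) + 2(x_1 - x_2)$$

$$\frac{\partial\,\text{Cost}}{\partial x_2} = -2(x_1 - x_2) + 2(x_2 - 200)$$

Setting the partial derivatives to 0, we solve for the minimum Cost:

$$Ax + B = 0$$

$$\begin{bmatrix} 4 & -2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} -200 \\ -400 \end{bmatrix} = 0$$

$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} -100 \\ -200 \end{bmatrix} = 0$$

$$x_1 = 400/3, \qquad x_2 = 500/3$$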
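The toy example can be reproduced exactly with rational arithmetic. This sketch solves the 2×2 system by Cramer's rule, which is enough for two unknowns (a real placer would use a sparse linear solver); the helper names are illustrative.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] [x1, x2]^T = [b1, b2]^T by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    x1 = Fraction(b1 * a22 - a12 * b2, det)
    x2 = Fraction(a11 * b2 - a21 * b1, det)
    return x1, x2

def cost(x1, x2):
    return (x1 - 100) ** 2 + (x1 - x2) ** 2 + (x2 - 200) ** 2

# Ax + B = 0 with A = [[4, -2], [-2, 4]] and B = [-200, -400], i.e. Ax = [200, 400]
x1, x2 = solve_2x2(4, -2, -2, 4, 200, 400)
print(x1, x2, cost(x1, x2))  # 400/3 500/3 10000/3
```

The two cells end up evenly spaced between the fixed points, so all three springs have the same length of 100/3.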

10
Kurt Keutzer D. Pan
Quadratic Optimization Problem
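[Figure: modules A to G divided between two regions $\rho$ and $\rho'$, with centers of gravity $(u_\rho, v_\rho)$ and $(u_{\rho'}, v_{\rho'})$]

Constraint matrix at level l (one row per region, one column per module; * marks a nonzero entry):

$$A^{(l)} = \begin{array}{c|ccccccc} & A & B & C & D & E & F & \cdots \\ \hline \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ \rho & * & * & * & 0 & 0 & 0 & \cdots \\ \rho' & 0 & 0 & 0 & * & * & * & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \end{array}$$

• Linearly constrained quadratic programming problem

$$\min_{x \in \mathbb{R}^m} \left\{ \Phi(x) = x^T C x + d^T x \right\} \quad \text{s.t.} \quad A^{(l)} x = u^{(l)}$$

  – $x^T C x$: wire length for movable modules
  – $d^T x$: accounts for fixed modules
  – $A^{(l)} x = u^{(l)}$: center-of-gravity constraints

Problem is computationally tractable, and well behaved
Commercial solvers available: mostek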
Global Optimization Using Quadratic
Placement
Quadratic placement clumps cells in center

Partitioning divides cells into two regions
• Placement region is also divided into two regions

New center-of-gravity constraints are added to the constraint matrix to be used on the next level of global optimization
• Global connectivity is still conserved
Setting up Global Optimization
Layout After Global Optimization

A. Kahng
Partitioning
Partitioning

In GORDIAN, partitioning is used to constrain the movement of modules rather than reduce problem size

By performing partitioning, we can iteratively impose a new set of constraints on the global optimization problem
• Assign modules to a particular block

Partitioning is determined by
• Results of global placement – initial starting point
  – Spatial (x,y) distribution of modules
• Partitioning cost
  – Want a min-cut partition
Layout after Min-cut

Now the global placement problem will be solved again with two additional center-of-gravity constraints
Adding Positioning Constraints

• Partitioning gives us two


new “center of gravity”
constraints

• Simply update constraint


matrix

• Still a single global


optimization problem

• Partitioning is not
“absolute”
• modules can migrate
back during optimization

• may need to re-partition


Continue to Iterate
First Iteration
[Figure courtesy A. Kahng]
Second Iteration
[Figure courtesy A. Kahng]
Third Iteration
[Figure courtesy A. Kahng]
Fourth Iteration
[Figure courtesy A. Kahng]
Final Placement
Final Placement - 1

Earlier steps have broken down the problem into a manageable number of objects
Two approaches:
• Final placement for standard cells/gate array – row assignment
• Final placement for large, irregularly sized macro-blocks – slicing – won’t talk about this
Final Placement – Standard Cell Designs

This process continues until there are only a few cells in each group (≈ 6)

[Figure: each group has ≤ 6 cells; a group is the smallest partition]

Assign cells in each group close together in the same row or in nearly adjacent rows

A. E. Dunlop, B. W. Kernighan, "A procedure for placement of standard-cell VLSI circuits," IEEE Trans. on CAD, Vol. CAD-4, Jan. 1985, pp. 92-98
Final Placement – Creating Rows

[Figure: row-based standard cell design. Each group is labeled with its row assignment: 1; 1,2; 2; 2,3; 3; 3,4; 4; 4,5; 5]

Partitioning of circuit into 32 groups. Each group is either assigned to a single row or divided into 2 rows
Standard Cell Layout

28
Kurt Keutzer
Another Series of Gordian

(a) Global placement with 1 region (b) Global placement with 4 regions (c) Final placements

D. Pan – U of Texas
29
Kurt Keutzer
Physical Design Flow
[Flow diagram. Phases: Input, Floorplanning, Placement, Routing, Output. Steps: Read Netlist; Floorplanning; Initial Placement; Placement Improvement; Cost Estimation; Routing Region Definition; Global Routing; Routing Region Ordering; Detailed Routing; Routing Improvement; Cost Estimation; Compaction/clean-up; Write Layout Database]

Courtesy K. Keutzer et al. UCB
Kahng/Keutzer/Newton
ECE 260B – CSE 241A /UCB EECS 244
Imagine …

• You have to plan transportation (i.e. roads and highways) for a new city the size of Chicago
• Many dwellings need direct roads that can’t be used by anyone else
• You can affect the layout of houses and neighborhoods but the architects and planners will complain
• And … you’re told that the time along any path can’t be longer than a fixed amount
• What are some of your considerations?
What are some of your considerations?

• How many levels do my roads need to go? Remember: higher is more expensive.
• How do I avoid congestion?
• What basic structure do I want for my roads?
  – Manhattan?
  – Chicago?
  – Boston?
• Automated route tools have to solve problems of comparable complexity on every leading edge chip
Routing Applications

[Figure: three example layouts]
• Mixed Cell and Block
• Cell-based
• Block-based
Routing Algorithms
Hard to tackle high-level issues like congestion and wire-planning and low-level details of pin-connection at the same time
• Global routing
  – Identify routing resources to be used
  – Identify layers (and tracks) to be used
  – Assign particular nets to these resources
  – Also used in floorplanning and placement
• Detail routing
  – Actually define pin-to-pin connections
  – Must understand most or all design rules
  – May use a compactor to optimize result
  – Necessary in all applications
Basic Rules of Routing - 1
• Wiring/routing performed in layers – 5-9 (-11), typically only in “Manhattan” N/S E/W directions
  – E.g. layer 1 – N/S
  – Layer 2 – E/W
• A segment cannot cross another segment on the same wiring layer
• Wire segments can cross wires on other layers
• Power and ground may have their own layers

Photo courtesy: Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic
Basic Rules of Routing – Part 2

• Routing can be on a fixed grid
• Case 1: Detailed routing only in channels
  – Wiring can only go over a row of cells when there is a free track – can be inserted with a “feedthrough”
  – Design may make use of metal-1, metal-2
  – Cells must bring signals (i.e. inputs, outputs) out to the channel through “ports” or “pins”
Basic Rules of Routing – Part 3

• Routing can be on a fixed grid or gridless (aka area routing)
• Case 2: Detailed routing over cells
  – Wiring can go over cells
  – Design of cells must try to minimize obstacles to routing, i.e. minimize use of metal-1, metal-2
  – Cells do not need to bring signals (i.e. inputs, outputs) out to the channel – the route will come to them
Taxonomy of VLSI Routers

Routers
• Global
  – Graph Search
  – Steiner
  – Iterative
  – Hierarchical
• Detailed
  – Restricted: River, Switchbox, Channel (Greedy, Left-Edge)
  – General Purpose: Maze, Line Probe, Line Expansion
• Specialized
  – Power & Ground
  – Clock
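To make one entry of the taxonomy concrete, here is a minimal sketch of a maze router in the style of Lee's algorithm: a breadth-first wavefront expansion on a single-layer grid followed by a backtrace. The grid encoding (0 = free, 1 = blocked) and the function name are illustrative assumptions; production routers add multiple layers, costs and design rules.

```python
from collections import deque

def maze_route(grid, source, target):
    """Shortest Manhattan path from source to target avoiding blocked cells.

    grid: list of rows, 0 = free cell, 1 = obstacle. Returns a list of (row, col)
    cells from source to target, or None if no route exists.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {source: None}
    queue = deque([source])
    while queue:
        cell = queue.popleft()
        if cell == target:
            path = []
            while cell is not None:        # backtrace through parent pointers
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parent):
                parent[(nr, nc)] = cell    # wavefront expansion
                queue.append((nr, nc))
    return None

grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = maze_route(grid, (0, 0), (2, 0))
print(len(path))  # 9 cells: the route must detour around the blocked row
```

Because the search is breadth-first, the first time the target is reached the path is guaranteed to be a shortest one on the grid.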
Today’s high-perf logical/physical flow

[Flow diagram: library, user constraints, netlist and tech files feed logic optimization/timing verification (with a delay model generator) → placement → routing → layout → extraction; extracted cell/wire RC delays are fed back as SDF]

1) optimize using estimated or extracted capacitances
2) re-place and re-route
3) if design fails to meet constraints due to poor estimation, repeat 1 + 2
Top-down problems in the flow

[Flow diagram: library, user constraints, netlist and tech files feed logic optimization/timing verification (with a delay model generator) → placement → routing → layout → extraction; extracted cell/wire RC delays are fed back as SDF. Annotated problems:]
• initial capacitance estimates inaccurate
• inability to take top-down timing constraints
• inaccurate internal timing model
Iteration problems in the flow

[Flow diagram: library, user constraints, netlist and tech files feed logic optimization/timing verification (with a delay model generator) → placement → routing → layout → extraction; extracted cell/wire RC delays are fed back as SDF. Annotated problems:]
• updated capacitances cause significant changes in optimization
• limited incremental capability
• resulting iteration may not bring closer to convergence
