NORMALIZATION
Normalization is a systematic approach to decompose (break down) tables to eliminate data redundancy (repetition) and undesirable characteristics like insertion anomaly, update anomaly, and deletion anomaly in DBMS.
NORMALIZATION
Normalization in DBMS is a technique using which you can organize the data in the database tables so that:
• There is less repetition of data,
• A large set of data is structured into a bunch of smaller tables,
• and the tables have a proper relationship between them.
WHY WE NEED NORMALIZATION IN DBMS?
DATA INTEGRITY
Normalization helps in maintaining data integrity. It minimizes the chances of inconsistencies and discrepancies in the data, ensuring that it accurately represents the real-world entities it is intended to model.
STORAGE OPTIMIZATION
Normalization often leads to more efficient use of storage space by eliminating redundant data. This can be particularly important in large databases where storage costs are significant.
DATA CONSISTENCY
Normalization ensures that data is organized in a consistent and logical manner across a database or dataset. This consistency helps in avoiding anomalies like data duplication, update anomalies, and deletion anomalies.
IMPROVED QUERY PERFORMANCE
Normalized databases often perform better when executing queries, as smaller, non-redundant tables require fewer resources to process and retrieve data. This efficiency can translate into faster response times for applications relying on the database.
PROBLEMS WITHOUT NORMALIZATION IN DBMS
If a table is not properly normalized and has data redundancy (repetition), then it will not only eat up extra memory space but will also make it difficult for you to handle and update the data in the database without losing data.
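To make the problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The student table, its columns, and the sample rows are hypothetical and invented only for illustration. Because the branch and its head of department (hod) are repeated on every student row, a partial update leaves the data inconsistent, and deleting the last student of a branch loses the branch information as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical unnormalized table: branch details repeat on every student row.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, branch TEXT, hod TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Akon", "CSE", "Mr. X"), (2, "Bkon", "CSE", "Mr. X"), (3, "Ckon", "ECE", "Mr. Z")],
)

# Update anomaly: changing the HOD on only one row leaves CSE with two different HODs.
conn.execute("UPDATE student SET hod = 'Mr. Y' WHERE roll_no = 1")
cse_hods = {row[0] for row in conn.execute("SELECT hod FROM student WHERE branch = 'CSE'")}

# Deletion anomaly: removing the last ECE student also erases who heads ECE.
conn.execute("DELETE FROM student WHERE roll_no = 3")
ece_rows = conn.execute("SELECT COUNT(*) FROM student WHERE branch = 'ECE'").fetchone()[0]

print(cse_hods)  # two distinct HOD values for the same branch
print(ece_rows)  # no row mentions ECE any more
```

Moving branch and hod into a separate table keyed by branch would store each fact once, so neither problem could arise.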
TYPES OF DBMS NORMAL FORMS
Normalization rules are divided into the following normal forms:
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. Boyce-Codd Normal Form (BCNF)
5. Fourth Normal Form
6. Fifth Normal Form
FIRST NORMAL FORM (1NF)
For a table to be in the First Normal Form, it should follow the following 4 rules:
1. It should only have single (atomic) valued attributes/columns.
2. Values stored in a column should be of the same domain.
3. All the columns in a table should have unique names.
4. And the order in which data is stored should not matter.
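As an illustration of rule 1, the following sketch splits a multi-valued column into atomic rows. The column names (roll_no, name, subjects) and the sample data are assumptions made up for this example, and the helper to_first_normal_form is hypothetical.

```python
# Hypothetical rows: the "subjects" column holds several values in one cell (violates rule 1).
unnormalized = [
    {"roll_no": 101, "name": "Akon", "subjects": "OS, CN"},
    {"roll_no": 102, "name": "Bkon", "subjects": "Java"},
]

def to_first_normal_form(rows):
    """Emit one row per (student, subject) so every cell holds a single value."""
    result = []
    for row in rows:
        for subject in row["subjects"].split(","):
            result.append(
                {"roll_no": row["roll_no"], "name": row["name"], "subject": subject.strip()}
            )
    return result

atomic_rows = to_first_normal_form(unnormalized)
for row in atomic_rows:
    print(row)
```

The name is now repeated for student 101, which is the kind of redundancy the higher normal forms then remove.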
SECOND NORMAL FORM (2NF)
For a table to be in the Second Normal Form,
1. It should be in the First Normal Form.
2. And, it should not have Partial Dependency.
Partial dependency can occur only when the table has a COMPOSITE KEY.
WHAT IS PARTIAL DEPENDENCY?
When a table has a primary key that is made up of two or more columns (Composite Key), then all the columns (not included in the composite key) in that table should depend on the entire primary key and not on a part of it. If any column (which is not part of the composite key) depends on a part of the primary key, then we say we have Partial Dependency in the table.
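A partial dependency can be spotted on sample data with a small check like the one below. This is only a sketch: the helper determines is hypothetical, the teacher column is an assumption added for illustration, and a check on sample rows can only show that a dependency is consistent with the data, not that the schema guarantees it.

```python
def determines(rows, lhs, rhs):
    """Return True if, in this sample, equal values of `lhs` always give equal values of `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical score table with composite key (student_id, subject_id).
score = [
    {"student_id": 1, "subject_id": 1, "marks": 70, "teacher": "Java Teacher"},
    {"student_id": 1, "subject_id": 2, "marks": 82, "teacher": "C++ Teacher"},
    {"student_id": 2, "subject_id": 1, "marks": 42, "teacher": "Java Teacher"},
]

# teacher is fixed by subject_id alone: a partial dependency on the composite key.
teacher_partial = determines(score, ["subject_id"], ["teacher"])
# marks needs the whole key: neither part of the key determines it on its own.
marks_partial = (
    determines(score, ["subject_id"], ["marks"])
    or determines(score, ["student_id"], ["marks"])
)
print(teacher_partial, marks_partial)  # True False
```

To reach 2NF in this sketch, teacher would move to a separate table keyed by subject_id.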
THIRD NORMAL FORM (3NF)
For a table to be in the Third Normal Form,
1. It satisfies the First Normal Form and the Second Normal Form.
2. And, it doesn't have Transitive Dependency.
WHAT IS TRANSITIVE DEPENDENCY?
In a table we have some column that acts as the primary key, and the other columns depend on this column. But what if a column that is not the primary key depends on another column that is also not a primary key or part of it? Then we have Transitive Dependency in our table.
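In the Score table below, total_marks depends on exam_type rather than on the key, so it is moved into a separate EXAM_TYPE table.

Score table
student_id  subject_id  marks  exam_type  total_marks
1           1           70     Theory     100
1           2           82     Theory     100
2           1           42     Practical  50

EXAM_TYPE table
exam_type  total_marks
Theory     100
Lab        150
Practical  50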
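The decomposition of the Score table can be sketched with Python's built-in sqlite3 module. The rows are taken from the tables above; the lowercase table names, the column types, and the choice of (student_id, subject_id) as the primary key are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# total_marks is determined by exam_type, so it lives in its own table.
conn.execute("CREATE TABLE exam_type (exam_type TEXT PRIMARY KEY, total_marks INTEGER)")
conn.execute(
    "CREATE TABLE score (student_id INTEGER, subject_id INTEGER, marks INTEGER, "
    "exam_type TEXT REFERENCES exam_type(exam_type), "
    "PRIMARY KEY (student_id, subject_id))"
)
conn.executemany(
    "INSERT INTO exam_type VALUES (?, ?)",
    [("Theory", 100), ("Lab", 150), ("Practical", 50)],
)
conn.executemany(
    "INSERT INTO score VALUES (?, ?, ?, ?)",
    [(1, 1, 70, "Theory"), (1, 2, 82, "Theory"), (2, 1, 42, "Practical")],
)

# A join rebuilds the original five-column view without storing total_marks on every score row.
rows = conn.execute(
    "SELECT s.student_id, s.subject_id, s.marks, s.exam_type, e.total_marks "
    "FROM score s JOIN exam_type e ON e.exam_type = s.exam_type "
    "ORDER BY s.student_id, s.subject_id"
).fetchall()
for row in rows:
    print(row)
```

With this layout, changing the total marks of an exam type is a single-row update in exam_type.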
BOYCE-CODD NORMAL FORM (BCNF)
• Boyce-Codd Normal Form is a higher version of the Third Normal Form.
• This form deals with a certain type of anomaly that is not handled by 3NF.
• A 3NF table that does not have multiple overlapping candidate keys is said to be in BCNF.
For a table to be in BCNF, the following conditions must be satisfied:
1. It must be in the Third Normal Form.
2. And, for each functional dependency (X → Y), either X should be a Super Key or Y should be a subset of X (a trivial dependency).
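The super key condition can be checked mechanically by computing attribute closures. The sketch below assumes a hypothetical relation (student, subject, professor) in which each professor teaches exactly one subject; the function names closure and bcnf_violations are likewise made up for illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply X -> Y whenever X is already covered and Y adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(relation, fds):
    """List the non-trivial dependencies X -> Y whose X is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != relation
    ]

# Hypothetical relation and dependencies.
relation = {"student", "subject", "professor"}
fds = [
    (frozenset({"student", "subject"}), frozenset({"professor"})),
    (frozenset({"professor"}), frozenset({"subject"})),
]
violations = bcnf_violations(relation, fds)
print(violations)  # only professor -> subject is reported
```

Here professor determines subject but is not a super key, so the relation is not in BCNF even though every non-key dependency rule of 3NF is met.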
FOURTH NORMAL FORM (4NF)
A table is said to be in the Fourth Normal Form when,
1. It is in the Boyce-Codd Normal Form.
2. And, it doesn't have Multi-Valued Dependency.
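A multi-valued dependency arises when two independent facts about the same key are stored in one table, so every combination of them has to be listed. The sketch below checks this on sample rows and then splits the table. The enrolment table, its columns, and the helper has_mvd are hypothetical, and the check only reflects the sample data.

```python
def has_mvd(rows, x, y):
    """Check X ->> Y on sample rows: for a fixed X, Y values and the remaining values combine freely."""
    columns = list(rows[0].keys())
    z = [c for c in columns if c not in x and c not in y]
    existing = {tuple(r[c] for c in columns) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[c] == t2[c] for c in x):
                # The mixed tuple (X and Y from t1, the rest from t2) must also be present.
                mixed = {**{c: t1[c] for c in x + y}, **{c: t2[c] for c in z}}
                if tuple(mixed[c] for c in columns) not in existing:
                    return False
    return True

# Hypothetical table: courses and hobbies are unrelated facts about a student.
enrolment = [
    {"student_id": 1, "course": "Science", "hobby": "Cricket"},
    {"student_id": 1, "course": "Science", "hobby": "Hockey"},
    {"student_id": 1, "course": "Maths", "hobby": "Cricket"},
    {"student_id": 1, "course": "Maths", "hobby": "Hockey"},
]
mvd_present = has_mvd(enrolment, ["student_id"], ["course"])

# Splitting into two tables stores each independent fact once.
student_course = sorted({(r["student_id"], r["course"]) for r in enrolment})
student_hobby = sorted({(r["student_id"], r["hobby"]) for r in enrolment})
print(mvd_present, student_course, student_hobby)
```

Four rows in the combined table become two rows in each of the smaller tables, and adding a new hobby no longer requires one row per course.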
FIFTH NORMAL FORM (5NF)
• The Fifth Normal Form is also called PJNF (Project-Join Normal Form).
• It is the most advanced of the normal forms listed here.
• Using the Fifth Normal Form you can remove join dependencies and reduce data redundancy.
• It also helps in fixing update anomalies in DBMS design.
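A join dependency means a table can be rebuilt exactly by joining its projections. The sketch below tests this on a hypothetical agent/company/product table; the data and the helpers project and natural_join are assumptions for illustration only. Joining just two projections produces a spurious row, while joining all three gives back the original rows.

```python
def project(rows, cols):
    """Project dict rows onto `cols`, dropping duplicates; each row becomes (column, value) pairs."""
    return {tuple((c, r[c]) for c in cols) for r in rows}

def natural_join(left, right):
    """Join two sets of (column, value)-pair rows on the columns they share."""
    joined = set()
    for lrow in left:
        for rrow in right:
            ld, rd = dict(lrow), dict(rrow)
            if all(ld[c] == rd[c] for c in ld.keys() & rd.keys()):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined

# Hypothetical table: which agent sells which product for which company.
rows = [
    {"agent": "Smith", "company": "Ford", "product": "car"},
    {"agent": "Smith", "company": "Ford", "product": "truck"},
    {"agent": "Smith", "company": "GM", "product": "car"},
    {"agent": "Jones", "company": "Ford", "product": "car"},
]
original = {tuple(sorted(r.items())) for r in rows}

ac = project(rows, ["agent", "company"])
cp = project(rows, ["company", "product"])
ap = project(rows, ["agent", "product"])

two_way = natural_join(ac, cp)        # includes a spurious (Jones, Ford, truck) row
three_way = natural_join(two_way, ap)  # the third projection filters it out again
lossless = three_way == original
print(len(two_way), lossless)  # 5 True
```

Because the three-way join is lossless on this sample, the table could be stored as the three smaller projections instead.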