Understanding Database Normalization Techniques

Normalization is a process used to minimize redundancy in database relations, helping to avoid anomalies during insertion, deletion, and updates. It involves organizing data into tables and ensuring that each table adheres to specific normal forms, such as First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), and Boyce-Codd Normal Form (BCNF). Each normal form addresses different types of dependencies and aims to logically store data while eliminating redundancy.

Uploaded by

bhosaleharshvardhan023

Normalization is the process of minimizing redundancy in a relation or a set of relations.
Redundancy in a relation can cause insertion, deletion and update anomalies, and normal forms
are used to eliminate or reduce it in database tables.

Database normalization is a technique for organizing the data in a database. It is a
systematic approach of decomposing tables to eliminate data redundancy (repetition) and
undesirable characteristics such as insertion, update and deletion anomalies. It is a multi-step
process that puts data into tabular form and removes duplicated data from the relation tables.

Normalization is used mainly for two purposes:

• Eliminating redundant (repeated) data.

• Ensuring data dependencies make sense, i.e. data is logically stored.

Problems Without Normalization

If a table is not properly normalized and has data redundancy, it will not only take up extra
storage space but will also make the database difficult to maintain and update without risking
data loss. Insertion, update and deletion anomalies are very frequent if a database is not
normalized. To understand these anomalies, let us take the example of a Student table.

In the Student table, we have data for 4 Computer Science students. As we can see, the values of
the fields branch, hod (Head of Department) and office_tel are repeated for students who are in
the same branch of the college. This is data redundancy.

rollno  name  branch  hod    office_tel
---------------------------------------
401     Akon  CSE     Mr. X  53337
402     Bkon  CSE     Mr. X  53337
403     Ckon  CSE     Mr. X  53337
404     Dkon  CSE     Mr. X  53337

Insertion Anomaly

• For a new admission, until the student opts for a branch, the student's data cannot be
inserted, or else the branch information has to be set to NULL.
• Also, if we insert data for 100 students of the same branch, the branch information is
repeated for all 100 students.
• These scenarios are insertion anomalies.

Update Anomaly

What if Mr. X leaves the college, or is no longer the HOD of the computer science department?
In that case all the student records have to be updated, and if we miss any record by mistake,
the data becomes inconsistent. This is an update anomaly.

Deletion Anomaly

In our Student table, two different kinds of information are kept together: student information
and branch information. Hence, if student records are deleted at the end of the academic year,
we also lose the branch information. This is a deletion anomaly.
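
The update and deletion anomalies described above can be reproduced in a few lines. The following is only a minimal sketch in Python using plain lists and dictionaries instead of a real database; the replacement HOD name "Mr. Y" and the variable names are assumptions made for illustration and do not come from the Student table.

```python
# Unnormalized Student table: branch details are repeated on every row.
students = [
    {"rollno": 401, "name": "Akon", "branch": "CSE", "hod": "Mr. X", "office_tel": "53337"},
    {"rollno": 402, "name": "Bkon", "branch": "CSE", "hod": "Mr. X", "office_tel": "53337"},
    {"rollno": 403, "name": "Ckon", "branch": "CSE", "hod": "Mr. X", "office_tel": "53337"},
    {"rollno": 404, "name": "Dkon", "branch": "CSE", "hod": "Mr. X", "office_tel": "53337"},
]

# Update anomaly: the new HOD ("Mr. Y", a made-up name) reaches only three rows.
for row in students[:3]:  # the fourth row is missed on purpose
    row["hod"] = "Mr. Y"
hods = {row["hod"] for row in students if row["branch"] == "CSE"}
print(sorted(hods))  # ['Mr. X', 'Mr. Y'] -> the table now disagrees with itself

# Normalized layout: branch details are stored once, keyed by branch.
branches = {"CSE": {"hod": "Mr. X", "office_tel": "53337"}}
student_rows = [
    {"rollno": s["rollno"], "name": s["name"], "branch": s["branch"]} for s in students
]

branches["CSE"]["hod"] = "Mr. Y"  # a single update, nothing to miss
student_rows.clear()              # deleting every student record...
print("CSE" in branches)          # True: ...does not delete the branch information
```

Splitting the branch details out is the same idea the normal forms below formalize: each fact is stored in exactly one place.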

First Normal Form (1NF)

A relation is in first normal form if it does not contain any composite or multi-valued
attribute, i.e. every attribute in the relation is a single-valued attribute. A relation that
contains a composite or multi-valued attribute violates first normal form.
For a table to be in the First Normal Form, it should follow these 4 rules:

1. It should only have single (atomic) valued attributes/columns.

2. Values stored in a column should be of the same domain.

3. All the columns in a table should have unique names.

4. The order in which data is stored does not matter.

Example 1 – Relation STUDENT in table 1 is not in 1NF because of the multi-valued attribute
STUD_PHONE. Its decomposition into 1NF is shown in table 2.

Example 2 –

ID  Name  Courses
------------------
1   A     c1, c2
2   E     c3
3   M     c2, c3

In the above table, Courses is a multi-valued attribute, so the table is not in 1NF. The table
below is in 1NF, as it has no multi-valued attribute:

ID  Name  Course
------------------
1   A     c1
1   A     c2
2   E     c3
3   M     c2
3   M     c3
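
The conversion in Example 2 can be sketched as a small Python function. This is an illustration only: the tuple layout and the function name to_1nf are assumptions, and the multi-valued column is assumed to be stored as a comma-separated string.

```python
# Rows of the non-1NF table: (ID, Name, Courses), with Courses as a comma-separated string.
unnormalized = [
    (1, "A", "c1, c2"),
    (2, "E", "c3"),
    (3, "M", "c2, c3"),
]


def to_1nf(rows):
    """Return one (id, name, course) row per course value."""
    result = []
    for student_id, name, courses in rows:
        for course in courses.split(","):
            result.append((student_id, name, course.strip()))
    return result


rows_1nf = to_1nf(unnormalized)
for row in rows_1nf:
    print(row)
# (1, 'A', 'c1')
# (1, 'A', 'c2')
# (2, 'E', 'c3')
# (3, 'M', 'c2')
# (3, 'M', 'c3')
```

Note that ID alone no longer identifies a row in the result; the pair (ID, Course) does.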
Second Normal Form (2NF)

For a table to be in the Second Normal Form:

• It should be in the First Normal Form.

• It should not have any partial dependency.

A relation has no partial dependency if no non-prime attribute (an attribute which is not part
of any candidate key) is dependent on a proper subset of any candidate key of the table.

• Partial Dependency – If a proper subset of a candidate key determines a non-prime
attribute, it is called a partial dependency.

What is Dependency?

Let's take the example of a Student table with columns student_id, name, reg_no (registration
number), branch and address (student's home address).

In this table, student_id is the primary key and is unique for every row, hence we can
use student_id to fetch any row of data from this table. Even where student names are the
same, if we know the student_id we can easily fetch the correct record.

student_id  name  reg_no  branch  address
-----------------------------------------
10          Akon  07-WY   CSE     Kerala
11          Akon  08-WY   IT      Gujarat

Hence we can say that a primary key for a table is the column or group of columns (composite
key) which can uniquely identify each record in the table.

I can ask for the branch name of the student with student_id 10, and I can get it. Similarly, if
I ask for the name of the student with student_id 10 or 11, I will get it. So all I need is
student_id, and every other column depends on it, or can be fetched using it.

This is Dependency and we also call it Functional Dependency.
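
A functional dependency X -> Y says that two rows which agree on X must also agree on Y. The sketch below checks that condition against sample rows; the helper name holds is an assumption. Passing the check only means the sample data does not contradict the dependency, since a real dependency is a rule about all valid data, not just the rows at hand.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


students = [
    {"student_id": 10, "name": "Akon", "reg_no": "07-WY", "branch": "CSE", "address": "Kerala"},
    {"student_id": 11, "name": "Akon", "reg_no": "08-WY", "branch": "IT", "address": "Gujarat"},
]

print(holds(students, ["student_id"], ["name", "branch", "address"]))  # True
print(holds(students, ["name"], ["branch"]))  # False: two students named Akon, two branches
```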

What is Partial Dependency?

Now that we know what dependency is, we are in a better position to understand what partial
dependency is. For a simple table like Student, a single column like student_id can uniquely
identify all the records in the table.

But this is not true all the time. So let's extend our example to see how more than one column
together can act as a primary key.

Let's create another table, Subject, which has the fields subject_id and subject_name, with
subject_id as the primary key.

subject_id  subject_name
------------------------
1           Java
2           C++
3           Php

Now we have a Student table with student information and another table, Subject, for storing
subject information.

Let's create another table, Score, to store the marks obtained by students in the respective
subjects. We will also save the name of the teacher who teaches that subject along with the marks.

score_id  student_id  subject_id  marks  teacher
-------------------------------------------------
1         10          1           70     Java Teacher
2         10          2           75     C++ Teacher
3         11          1           80     Java Teacher

In the Score table we save the student_id to know which student's marks these are, and the
subject_id to know which subject the marks are for.

Together, student_id + subject_id form a candidate key for this table, which can be
the primary key.

Confused about how this combination can be a primary key?

If I ask you to get me the marks of the student with student_id 10, can you get them from this
table? No, because you don't know for which subject. And if I give you only a subject_id, you
would not know for which student. Hence we need student_id + subject_id to uniquely identify any row.

But where is Partial Dependency?

Now if you look at the Score table, we have a column named teacher which depends only on
the subject: for Java it's Java Teacher, for C++ it's C++ Teacher, and so on.

As we just discussed, the primary key for this table is a composite of two columns, student_id
and subject_id, but the teacher's name depends only on the subject, hence on subject_id,
and has nothing to do with student_id.

This is partial dependency, where an attribute in a table depends on only a part of the primary
key and not on the whole key.

How to remove Partial Dependency?

There can be many different solutions for this, but our objective is to remove the teacher's
name from the Score table.
The simplest solution is to remove the column teacher from the Score table and add it to the
Subject table. Hence, the Subject table becomes:

subject_id  subject_name  teacher
----------------------------------
1           Java          Java Teacher
2           C++           C++ Teacher
3           Php           Php Teacher

And our Score table is now in the second normal form, with no partial dependency.

score_id  student_id  subject_id  marks
---------------------------------------
1         10          1           70
2         10          2           75
3         11          1           80
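
The move of teacher from Score to Subject can be sketched in Python as follows. This is an illustrative sketch with in-memory dictionaries, not a database migration; the variable names are assumptions. The final join rebuilds the original Score rows, which shows that the split loses no information.

```python
score = [
    {"score_id": 1, "student_id": 10, "subject_id": 1, "marks": 70, "teacher": "Java Teacher"},
    {"score_id": 2, "student_id": 10, "subject_id": 2, "marks": 75, "teacher": "C++ Teacher"},
    {"score_id": 3, "student_id": 11, "subject_id": 1, "marks": 80, "teacher": "Java Teacher"},
]
subject = {
    1: {"subject_name": "Java"},
    2: {"subject_name": "C++"},
    3: {"subject_name": "Php"},
}

# Step 1: record each teacher once, against the subject it depends on.
for row in score:
    subject[row["subject_id"]]["teacher"] = row["teacher"]

# Step 2: drop the teacher column from Score.
score_2nf = [{k: v for k, v in row.items() if k != "teacher"} for row in score]

# Joining Score back to Subject on subject_id recovers the original rows.
rejoined = [dict(row, teacher=subject[row["subject_id"]]["teacher"]) for row in score_2nf]
print(rejoined == score)  # True
```

Subject 3 (Php) gets no teacher in this sketch because no Score row mentions it. In the unnormalized design a teacher could only be recorded once some student had a score in that subject, which is the insertion anomaly again.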
Example – Consider table 3 below.

STUD_NO  COURSE_NO  COURSE_FEE
------------------------------
1        C1         1000
2        C2         1500
1        C4         2000
4        C3         1000
4        C1         1000
2        C5         2000

(Note that many courses have the same course fee.)

Here,
COURSE_FEE alone cannot decide the value of COURSE_NO or STUD_NO;
COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO;
COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO;
Hence,
COURSE_FEE is a non-prime attribute, as it does not belong to the only candidate
key {STUD_NO, COURSE_NO}.
But COURSE_NO -> COURSE_FEE, i.e. COURSE_FEE is dependent on COURSE_NO,
which is a proper subset of the candidate key. The non-prime attribute COURSE_FEE is dependent
on a proper subset of the candidate key, which is a partial dependency, so this relation is not
in 2NF.

To convert the above relation to 2NF, we need to split the table into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE

Table 1
STUD_NO  COURSE_NO
------------------
1        C1
2        C2
1        C4
4        C3
4        C1
2        C5

Table 2
COURSE_NO  COURSE_FEE
---------------------
C1         1000
C2         1500
C3         1000
C4         2000
C5         2000

NOTE: 2NF reduces the redundant data stored in the database. For instance, if 100 students
take course C1, we don't need to store its fee of 1000 in all 100 records; instead we store it
once in the second table as the course fee for C1.
Example 2 – Consider the following functional dependencies in relation R(A, B, C, D):
AB -> C [A and B together determine C]

BC -> D [B and C together determine D]

In the above relation, AB is the only candidate key and there is no partial dependency, i.e.
no proper subset of AB determines any non-prime attribute, so the relation is in 2NF.
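
Both 2NF examples above can be checked mechanically with the attribute closure: a partial dependency exists when the closure of a proper subset of the candidate key contains a non-prime attribute. The sketch below is a minimal illustration, assuming the relation has a single candidate key that is passed in by hand; the function names closure and partial_dependencies are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(candidate_key, attributes, fds):
    """List (key_subset, attribute) pairs that violate 2NF.

    Assumes candidate_key is the relation's only candidate key.
    """
    key = set(candidate_key)
    non_prime = set(attributes) - key
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets of the key
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & non_prime):
                found.append((subset, attr))
    return found


# Table 3: COURSE_NO -> COURSE_FEE with candidate key {STUD_NO, COURSE_NO}
course_fds = [(("COURSE_NO",), ("COURSE_FEE",))]
course_result = partial_dependencies(
    ("STUD_NO", "COURSE_NO"), ("STUD_NO", "COURSE_NO", "COURSE_FEE"), course_fds
)
print(course_result)  # [(('COURSE_NO',), 'COURSE_FEE')] -> not in 2NF

# Example 2: R(A, B, C, D) with AB -> C, BC -> D and candidate key AB
r_fds = [(("A", "B"), ("C",)), (("B", "C"), ("D",))]
r_result = partial_dependencies(("A", "B"), "ABCD", r_fds)
print(r_result)  # [] -> no partial dependency
```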

Third Normal Form (3NF)

A table is said to be in the Third Normal Form when:

1. It is in the Second Normal Form.

2. It doesn't have any transitive dependency.

A relation is in third normal form if it is in second normal form and no non-prime attribute
is transitively dependent on a candidate key.

A relation is in 3NF if at least one of the following conditions holds for every non-trivial
functional dependency X -> Y:
1. X is a super key.
2. Y is a prime attribute (each element of Y is part of some candidate key).
Transitive dependency – If A -> B and B -> C are two FDs, then A -> C is called a transitive
dependency.
Example 1 – In relation STUDENT given in Table 4,

FD set: {STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE,
STUD_STATE -> STUD_COUNTRY, STUD_NO -> STUD_AGE}

Candidate Key: {STUD_NO}

For the relation in Table 4, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY are true,
so STUD_COUNTRY is transitively dependent on STUD_NO. This violates the third normal form.
To convert it to third normal form, we decompose the relation
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE) as:
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE)
STATE_COUNTRY (STATE, COUNTRY)
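
The decomposed schema can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: Table 4 is not reproduced in this text, so the sample rows (names, phone numbers, ages) are invented purely for illustration, and the column types are a guess.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE state_country (
    state   TEXT PRIMARY KEY,
    country TEXT NOT NULL
);
CREATE TABLE student (
    stud_no    INTEGER PRIMARY KEY,
    stud_name  TEXT,
    stud_phone TEXT,
    stud_state TEXT REFERENCES state_country(state),
    stud_age   INTEGER
);
""")

# Invented sample data: the country is stored once per state, not once per student.
conn.execute("INSERT INTO state_country VALUES ('Kerala', 'India')")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ram", "9000000001", "Kerala", 20),
        (2, "Shyam", "9000000002", "Kerala", 21),
    ],
)

# A join brings the country back whenever it is needed.
rows = conn.execute("""
    SELECT s.stud_no, s.stud_name, c.country
    FROM student AS s
    JOIN state_country AS c ON s.stud_state = c.state
    ORDER BY s.stud_no
""").fetchall()
print(rows)  # [(1, 'Ram', 'India'), (2, 'Shyam', 'India')]
```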

Example 2 – Consider relation R(A, B, C, D, E) with

A -> BC,
CD -> E,
B -> D,
E -> A
The candidate keys of the above relation are {A, E, CD, BC}. All attributes on the right-hand
sides of the functional dependencies are prime, so the relation is in 3NF.
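
The candidate keys of Example 2 can be found by brute force: try attribute sets from smallest to largest and keep those whose closure is the whole relation, skipping any set that already contains a smaller key. The sketch below is only suitable for small relations like this one, and the function names are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Enumerate minimal attribute sets whose closure covers all attributes."""
    attributes = sorted(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append("".join(combo))
    return keys


# Single-letter attributes, so "BC" stands for the set {B, C}.
fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)
print(keys)  # ['A', 'E', 'BC', 'CD']

prime = set("".join(keys))
all_prime = prime == set("ABCDE")
print(all_prime)  # True: every attribute is prime, so no FD can violate 3NF
```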

Boyce-Codd Normal Form (BCNF)

A relation R is in BCNF if and only if, for every non-trivial functional dependency X -> Y
that holds in R, X is a super key. A relation in BCNF is therefore also in Third Normal Form.

Example 1 – Find the highest normal form of a relation R(A, B, C, D, E) with FD set
{BC -> D, AC -> BE, B -> E}.
Step 1. (AC)+ = {A, C, B, E, D}, but none of its proper subsets can determine all attributes
of the relation, so AC is a candidate key. A and C cannot be derived from any other attribute
of the relation, so every candidate key must contain both, and {AC} is the only candidate key.
Step 2. Prime attributes are those which are part of a candidate key, {A, C} in this example;
the others, {B, D, E}, are non-prime.
Step 3. The relation R is in 1st normal form, as a relational DBMS does not allow
multi-valued or composite attributes.
The relation is in 2nd normal form because BC -> D is in 2nd normal form (BC is not
a proper subset of candidate key AC), AC -> BE is in 2nd normal form (AC is the
candidate key) and B -> E is in 2nd normal form (B is not a proper subset of
candidate key AC).
The relation is not in 3rd normal form because in BC -> D neither BC is a super key
nor D is a prime attribute, and in B -> E neither B is a super key nor E is a prime
attribute. To satisfy 3rd normal form, either the LHS of an FD should be a super key or
the RHS should be a prime attribute.
So the highest normal form of the relation is 2nd normal form.
Example 2 – Consider relation R(A, B, C) with
A -> BC,
B -> A
A and B are both super keys (A+ = B+ = {A, B, C}), so the above relation is in BCNF.
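
The 3NF and BCNF tests used in Example 1 can be written as one loop over the functional dependencies. The sketch below is an illustration only: it takes the candidate key {AC} as given rather than computing it, checks only the listed FDs, and uses assumed function names.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations(attributes, candidate_keys, fds):
    """Return (bcnf_violations, third_nf_violations) as lists of 'X->Y' strings."""
    attributes = set(attributes)
    prime = set().union(*(set(key) for key in candidate_keys))
    bcnf, third = [], []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency
        if closure(lhs, fds) == attributes:
            continue  # LHS is a super key: fine for both 3NF and BCNF
        bcnf.append(f"{lhs}->{rhs}")
        if not (set(rhs) - set(lhs)) <= prime:
            third.append(f"{lhs}->{rhs}")  # RHS has a non-prime attribute
    return bcnf, third


# Example 1: R(A, B, C, D, E), candidate key AC (single-letter attributes)
fds = [("BC", "D"), ("AC", "BE"), ("B", "E")]
bcnf, third = violations("ABCDE", ["AC"], fds)
print(bcnf)   # ['BC->D', 'B->E']
print(third)  # ['BC->D', 'B->E'] -> not in 3NF, so 2NF is the highest normal form
```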
Key Points –
1. BCNF is free from redundancy caused by functional dependencies.
2. If a relation is in BCNF, then 3NF is also satisfied.
3. If all attributes of a relation are prime attributes, then the relation is always in 3NF.
4. A relation in a relational database is always at least in 1NF.
5. Every binary relation (a relation with only 2 attributes) is always in BCNF.
6. If a relation has only singleton candidate keys (i.e. every candidate key consists of only 1
attribute), then the relation is always in 2NF (because no partial functional dependency is
possible).
7. Sometimes decomposing to BCNF may not preserve functional dependencies. In that case go
for BCNF only if the lost FD(s) are not required, else normalize only up to 3NF.
8. There are more normal forms beyond BCNF, such as 4NF. But in real-world database
systems it is generally not required to go beyond BCNF.
