Unit 3

Unit 3 of the DBMS course covers SQL Joins and Views, including types of joins such as Inner, Outer, and Natural Joins, along with their syntax and examples. It also discusses the concept of Views, how to create and manage them, and the importance of normalization in database design to reduce redundancy and anomalies. Key topics include functional dependencies, normalization forms, and the advantages of using views over tables.

Uploaded by

Chaya Anu
UNIT - 3 DBMS III SEM [Link].

UNIT – 3
SQL JOINS AND VIEWS
SQL Joins and Views: Inner Join, Natural Join, Full Outer Join, Left Outer Join, Right Outer Join, Equi
Join, Definition of View, Creating a View, Managing Views (Listing, Updating, Deleting).
Normalization: Anomalies in relational database design. Functional dependencies - Axioms.
Decomposition, Transitive Dependency. Data Normalization: First normal form, Second normal form,
Third normal form. Boyce-Codd normal form.

SQL joins:
Join is an operation in DBMS (Database Management System) that combines the rows of two
or more tables based on related columns between them. The main purpose of a join is to
retrieve data from multiple tables; in other words, a join is used to perform multi-table
queries. It is denoted by ⨝.

Syntax
R3 <- ⨝(R1) <join_condition> (R2)
where R1 and R2 are two relations to be joined and R3 is a relation that will hold the result of
the join operation.
Example
Temp <- ⨝(student) [Link]=[Link](Exam)
where S and E are aliases of the Student and Exam relations, respectively.
JOIN Example
Consider the two tables below as follows:

Table 1 – Student Table 2 - Student_Course


Both these tables are connected by one common key (column) i.e. ROLL_NO.
We can perform a JOIN operation using the given relational algebra:
Student ⨝ Student_course

ANNAPOORNA.M. S Assistant Professor Department BCA



Types of SQL Joins

1) Inner Join
Inner Join is a join operation in DBMS that combines two or more tables based on related
columns and returns only the rows that have matching values in the tables. Inner join has the
following types:

• Theta Join
• Conditional Join
• Equi Join
• Natural Join

a) Theta Join
Theta join is more general than an equi join. It allows us to join tables based on any
condition, not just equality.
We can use any comparison operator such as >, <, >=, <=, or !=.
Example:
Consider two tables, Employees and Departments:
Employees Table:
emp_id  name      salary  dept_id
1       Divyansh  50000   10
2       Krish     60000   20
3       Neha      55000   30
Departments Table:
dept_id  dept_name  min_salary
10       HR         45000
20       IT         55000
30       Sales      52000

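SQL Query:
SELECT Employees.name, Departments.dept_name
FROM Employees
JOIN Departments
ON Employees.dept_id = Departments.dept_id
AND Employees.salary >= Departments.min_salary;
Output:
name      dept_name
Divyansh  HR
Krish     IT
Neha      Sales
Here, we get employees who meet or exceed the minimum salary requirement for their own
department. Without the dept_id condition, the salary comparison alone would also pair each
employee with every other department whose minimum salary they meet.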
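The effect of the join condition can be checked on the sample data. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database (neither is part of the original notes). It runs the theta join once with the salary comparison alone and once with an additional dept_id equality.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (emp_id INTEGER, name TEXT, salary INTEGER, dept_id INTEGER);
CREATE TABLE Departments (dept_id INTEGER, dept_name TEXT, min_salary INTEGER);
INSERT INTO Employees VALUES
  (1, 'Divyansh', 50000, 10), (2, 'Krish', 60000, 20), (3, 'Neha', 55000, 30);
INSERT INTO Departments VALUES
  (10, 'HR', 45000), (20, 'IT', 55000), (30, 'Sales', 52000);
""")

# Theta join on the inequality alone: every (employee, department) pair
# that satisfies salary >= min_salary is returned.
salary_only = conn.execute("""
    SELECT Employees.name, Departments.dept_name
    FROM Employees JOIN Departments
      ON Employees.salary >= Departments.min_salary
    ORDER BY Employees.emp_id, Departments.dept_id
""").fetchall()

# Adding the dept_id equality keeps each employee with their own department.
own_dept = conn.execute("""
    SELECT Employees.name, Departments.dept_name
    FROM Employees JOIN Departments
      ON Employees.dept_id = Departments.dept_id
     AND Employees.salary >= Departments.min_salary
    ORDER BY Employees.emp_id
""").fetchall()

print(len(salary_only))  # 7
print(own_dept)          # [('Divyansh', 'HR'), ('Krish', 'IT'), ('Neha', 'Sales')]
```

The salary condition on its own yields seven pairs (Krish and Neha qualify for all three departments), while the combined condition yields one row per employee.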

b) Conditional Join
Conditional join is another name for theta join. It is a type of inner join in which tables
are combined based on the specified condition.
In conditional join, the join condition can include <, >, <=, >=, ≠ operators in addition to the
'=' operator.
Example: Suppose two tables A and B
Table A
R   S
10  5
7   20
Table B
T   U
10  12
17  6

A ⨝ S<T B

Output
R   S  T   U
10  5  10  12
10  5  17  6
Explanation: This query joins the tables A and B and projects attributes R, S, T, U where the
condition S < T is satisfied. The row (10, 5) of A pairs with both rows of B because 5 < 10 and
5 < 17. The row (7, 20) of A pairs with neither, because 20 is not less than 10 or 17.
c) Equi Join
Equi Join is a type of inner join where the join condition uses the equality operator ('=')
between columns.
Example: Suppose there are two tables Table A and Table C
Table A
Column A  Column B
a         a
a         b

Table C
Column A  Column B
a         a
a         c

A ⨝ A.Column B = C.Column B (C)

Output
Column A  Column B
a         a
Explanation: The value "a" in Column B is present in both tables, so the row (a, a) is the
only pair that satisfies the join condition and the only row in the output.
d) Natural Join
Natural join is a type of inner join in which we do not write any comparison operator: the
tables are matched on equality of the columns that share a name. The common columns should
have the same name and domain, and there should be at least one common attribute between
the two tables.
Example: Suppose there are two tables Table A and Table B
Table A
Number  Square
2       4
3       9
Table B
Number  Cube
2       8
3       27

A ⨝ B

Output
Number  Square  Cube
2       4       8
3       9       27
Explanation: The column Number is present in both tables, so it appears only once in the
result after the tables are combined.
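The natural join example can be reproduced with a short script. This is a sketch that assumes Python's built-in sqlite3 module as the SQL engine (an assumption, not something the notes prescribe). NATURAL JOIN matches the two tables on the shared column Number without an explicit condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (Number INTEGER, Square INTEGER);
CREATE TABLE B (Number INTEGER, Cube INTEGER);
INSERT INTO A VALUES (2, 4), (3, 9);
INSERT INTO B VALUES (2, 8), (3, 27);
""")

# No ON clause: the join condition is implied by the common column name.
cursor = conn.execute("SELECT * FROM A NATURAL JOIN B ORDER BY Number")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)  # ['Number', 'Square', 'Cube']
print(rows)     # [(2, 4, 8), (3, 9, 27)]
```

The shared column is listed once in the result, which is the behaviour the explanation above describes.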
2) Outer Join
Outer join is a type of join that retrieves matching as well as non-matching records from
related tables. There are three types of outer join:
• Left outer join
• Right outer join
• Full outer join


(a) Left Outer Join


It is also called left join. This type of outer join retrieves all records from the left table and
the matching records from the right table. Rows of the left table that have no match get NULL
in the columns of the right table.
Example: Suppose there are two tables Table A and Table B
Table A
Number  Square
2       4
3       9
4       16
Table B
Number  Cube
2       8
3       27
5       125

A ⟕ B

Output
Number  Square  Cube
2       4       8
3       9       27
4       16      NULL
Explanation: In the left outer join we take all the rows from the left table (here Table A).
Table B has no Cube value for number 4, so we mark it as NULL.
(b) Right Outer Join
It is also called a right join. This type of outer join retrieves all records from the right table
and the matching records from the left table. Rows of the right table that have no match in the
left table get NULL in the columns of the left table in the result set.
Example: Table A and Table B are the same as in the left outer join
A ⟖ B
Output:
Number  Square  Cube
2       4       8
3       9       27
5       NULL    125
Explanation: In the right outer join we take all the rows from the right table (here Table B).
Table A has no Square value for number 5, so we mark it as NULL.
(c) Full Outer Join
FULL JOIN creates the result set by combining the results of both LEFT JOIN and RIGHT
JOIN. The result set will contain all the rows from both tables. For the rows for which there is
no match, the result set will contain NULL values.
Example: Table A and Table B are the same as in the left outer join
A ⟗ B
Output:
Number  Square  Cube
2       4       8
3       9       27
4       16      NULL
5       NULL    125
Explanation: In the full outer join we take all the rows from both tables (here Table A and
Table B). Table B has no Cube value for number 4 and Table A has no Square value for number 5,
so we mark these as NULL.
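The three outer joins can be compared side by side on Table A and Table B. The sketch below assumes Python's built-in sqlite3 module. MySQL has no FULL OUTER JOIN keyword, and older SQLite versions lack RIGHT and FULL joins, so the right join is written here as a left join with the tables swapped and the full join as the UNION of both directions. That rewriting is a common workaround, not the only way to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (Number INTEGER, Square INTEGER);
CREATE TABLE B (Number INTEGER, Cube INTEGER);
INSERT INTO A VALUES (2, 4), (3, 9), (4, 16);
INSERT INTO B VALUES (2, 8), (3, 27), (5, 125);
""")

# Left outer join: every row of A, NULL (None) where B has no match.
left = conn.execute("""
    SELECT A.Number, A.Square, B.Cube
    FROM A LEFT JOIN B ON A.Number = B.Number
    ORDER BY A.Number
""").fetchall()

# Right outer join, written as a left join with the tables swapped.
right = conn.execute("""
    SELECT B.Number, A.Square, B.Cube
    FROM B LEFT JOIN A ON A.Number = B.Number
    ORDER BY B.Number
""").fetchall()

# Full outer join emulated as the UNION of both directions.
# UNION removes the rows that appear in both halves.
full = conn.execute("""
    SELECT A.Number AS Number, A.Square AS Square, B.Cube AS Cube
    FROM A LEFT JOIN B ON A.Number = B.Number
    UNION
    SELECT B.Number, A.Square, B.Cube
    FROM B LEFT JOIN A ON A.Number = B.Number
    ORDER BY Number
""").fetchall()

print(left)   # [(2, 4, 8), (3, 9, 27), (4, 16, None)]
print(right)  # [(2, 4, 8), (3, 9, 27), (5, None, 125)]
print(full)   # [(2, 4, 8), (3, 9, 27), (4, 16, None), (5, None, 125)]
```

Python shows SQL NULL as None, so the three results correspond to the three output tables in this section.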

Definition of View
A view is a table whose rows are not explicitly stored. It is a virtual table based on the
result set of an SQL statement. A view can contain all rows of a table or selected rows from a
table. A view can be created from one or many tables, depending on the SQL query written
to create the view.
A view is generated to show the end user only the information they need, rather than the
complete information of the table.

Advantages of View over database tables

• Using views, we can join multiple tables into a single virtual table.
• Views hide data complexity.
• Views take less space than tables because the database stores only the view definition,
not a copy of the data.
• Views can expose just a subset of the data that is contained in the tables of the database.

Creating Views
Database views are created using the CREATE VIEW statement. Views can be created from a
single table, multiple tables or another view.
To create a view, a user must have the appropriate system privilege according to the specific
implementation.
Syntax in MySQL
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
Example:
CREATE VIEW Students_CSE AS
SELECT Roll_no,Name
FROM Students
WHERE Branch = 'CSE';
Key Terms:
• view_name: Name for the View
• table_name: Name of the table
• condition: Condition to select rows
Example 1: Creating a Simple View from a Single Table
Example 1.1: In this example, we will create a View named DetailsView from the
table StudentDetails.
Query:
CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS
FROM StudentDetails
WHERE S_ID < 5;
Use the below query to retrieve the data from this view
SELECT * FROM DetailsView;


Output:
Name Address
Harsh Kolkata
Ashish Durgapur
Pratik Delhi
Dhanraj Bihar

Example 1.2: Here, we will create a view named StudentNames from the table
StudentDetails.
Query:
CREATE VIEW StudentNames AS
SELECT S_ID, NAME
FROM StudentDetails
ORDER BY NAME;
If we now query the view as,
SELECT * FROM StudentNames;
Output:
S_ID Name
2 Ashish
4 Dhanraj
1 Harsh
3 Pratik
5 Ram
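Because a view stores only its defining query, a change to the base table is visible the next time the view is queried. The following sketch illustrates this with Python's built-in sqlite3 module (an assumed engine, used only for illustration). The StudentDetails rows are reconstructed from the outputs shown in these examples, and the new address 'Mumbai' is a hypothetical value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StudentDetails (S_ID INTEGER PRIMARY KEY, NAME TEXT, ADDRESS TEXT);
INSERT INTO StudentDetails VALUES
  (1, 'Harsh', 'Kolkata'), (2, 'Ashish', 'Durgapur'), (3, 'Pratik', 'Delhi'),
  (4, 'Dhanraj', 'Bihar'), (5, 'Ram', 'Rajasthan');
CREATE VIEW DetailsView AS
  SELECT NAME, ADDRESS FROM StudentDetails WHERE S_ID < 5;
""")

names = [row[0] for row in conn.execute("SELECT NAME FROM DetailsView ORDER BY NAME")]

# Update the base table; the view has no stored rows of its own,
# so the next query on the view reflects the change.
conn.execute("UPDATE StudentDetails SET ADDRESS = 'Mumbai' WHERE S_ID = 1")
address = conn.execute(
    "SELECT ADDRESS FROM DetailsView WHERE NAME = 'Harsh'"
).fetchone()[0]

# Dropping the view removes only the definition, never the base data.
conn.execute("DROP VIEW DetailsView")
remaining_rows = conn.execute("SELECT COUNT(*) FROM StudentDetails").fetchone()[0]

print(names)           # ['Ashish', 'Dhanraj', 'Harsh', 'Pratik']
print(address)         # Mumbai
print(remaining_rows)  # 5
```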

Example 2: Creating a View From Multiple Tables


In this example we will create a View MarksView that combines data from both
tables, StudentDetails and StudentMarks. To create a View from multiple tables we can
simply include multiple tables in the SELECT statement.
Query:
CREATE VIEW MarksView AS
SELECT [Link], [Link], [Link]
FROM StudentDetails, StudentMarks
WHERE [Link] = [Link];
To display data of View MarksView:
SELECT * FROM MarksView;
Output:
Name     Address    Marks
Harsh    Kolkata    90
Pratik   Delhi      80
Dhanraj  Bihar      95
Ram      Rajasthan  85

Managing Views: Listing, Updating, and Deleting


1. Listing all Views in a Database
We can list all the Views in a database using the SHOW FULL TABLES statement or by
querying the information_schema.views table.
USE database_name;
SHOW FULL TABLES WHERE table_type LIKE '%VIEW';
Using information_schema
SELECT table_name
FROM information_schema.views
WHERE table_schema = 'database_name';

OR

SELECT table_schema, table_name, view_definition
FROM information_schema.views
WHERE table_schema = 'database_name';
2. Deleting a View
SQL allows us to delete an existing View. We can delete or drop View using the DROP
statement.
Syntax:
DROP VIEW view_name;
Example: In this example, we are deleting the View MarksView.
DROP VIEW MarksView;
3. Updating a View
If we want to update the existing data through the view, use the UPDATE statement.
UPDATE view_name
SET column1 = value1, column2 = value2, ..., columnN = valueN
WHERE condition;
If you want to change the view definition without affecting the data, use the CREATE OR
REPLACE VIEW statement, for example to add the Age column to MarksView. The general
syntax is:
CREATE OR REPLACE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
Note: Not all views can be updated using the UPDATE statement.
Rules to Update Views in SQL:
Certain conditions need to be satisfied to update a view. If any of these conditions
are not met, the view cannot be updated. The exact rules vary between database systems;
the usual ones are:
1. The SELECT statement which is used to create the view should not include a GROUP BY
clause or ORDER BY clause.
2. The SELECT statement should not have the DISTINCT keyword.
3. The view should include all NOT NULL columns of the base table, so that rows inserted
through the view are valid.
4. The view should not be created using nested queries or complex queries.
5. The view should be created from a single table. If the view is created using multiple
tables then we will generally not be allowed to update the view.

Inserting a row in a view


We can insert a row in a View in the same way as we do in a table. We can use the INSERT
INTO statement of SQL to insert a row in a View.
Syntax in MySQL
INSERT INTO view_name (column1, column2, ...)
VALUES (value1, value2, ...);

Deleting a row in a view


Deleting rows from a view is also as simple as deleting rows from a table. We can use the
DELETE statement of SQL to delete rows from a view.
Syntax in MySQL
DELETE FROM view_name
WHERE condition;
Example:
DELETE FROM Students_CSE
WHERE Name = 'ram';

Querying a View
We can query the view as follows
Syntax in MySQL
SELECT * FROM view_name;
Example:
SELECT * FROM Students_CSE;

Dropping a View
In order to delete a view in a database, we can use the DROP VIEW statement.
Syntax in MySQL
DROP VIEW view_name;
Example:
DROP VIEW Students_CSE;


Normalization:

Normalization is a systematic approach to organize data within a database to reduce
redundancy and eliminate undesirable characteristics such as insertion, update, and deletion
anomalies. The process involves breaking down large tables into smaller, well-structured ones
and defining relationships between them. This not only reduces the chances of storing
duplicate data but also improves the overall efficiency of the database.

Why do we need Normalization in DBMS?

Normalization is required for:
• Eliminating redundant (repeated) data and so protecting data integrity, because repeated
data increases the chances of inconsistent data.
• Keeping data consistent by storing each fact in one table and referencing it everywhere
else.
• Storage optimization, although this matters less these days because database storage is
cheap.
• Breaking down large tables into smaller tables with relationships, which makes the database
structure more scalable and adaptable.
• Ensuring data dependencies make sense, i.e. data is logically stored.

Problems without Normalization in DBMS


If a table is not properly normalized and has data redundancy (repetition), it will not
only take up extra storage space but will also make it difficult to handle and update
the data in the database without losing data.
Insertion, update, and deletion anomalies are very frequent if the database is not
normalized.
To understand these anomalies let us take an example of a Student table.
rollno name branch hod office_tel
401 Akon CSE Mr. X 53337
402 Bkon CSE Mr. X 53337
403 Ckon CSE Mr. X 53337
404 Dkon CSE Mr. X 53337
In the table above, we have data for four Computer Sci. students. As we can see, data for the
fields branch, hod(Head of Department), and office_tel are repeated for the students who are
in the same branch in the college, this is Data Redundancy.


Anomalies in relational database design


a) Insertion anomalies: Insertion anomalies occur when it is not possible to insert data into
a database because the required fields are missing or because the data is incomplete. For
example, if a database requires that every record has a primary key, but no value is
provided for a particular record, it cannot be inserted into the database. In the Student
table above, the hod and office_tel of a new branch cannot be stored until at least one
student joins that branch, because there is no student row to hold them.

b) Deletion anomalies: Deletion anomalies occur when deleting a record from a database
results in the unintentional loss of other data. For example, if a database contains
information about customers and orders, deleting a customer record may also delete all
the orders associated with that customer. In the Student table above, deleting the last
student of a branch also deletes the only record of that branch's hod and office_tel.

c) Update anomalies: Update anomalies occur when modifying data in a database
results in inconsistencies or errors. For example, if a database contains information
about employees and their salaries, updating an employee’s salary in one record but not in
all related records could lead to incorrect calculations and reporting. In the Student table
above, a change of hod for CSE must be applied to every CSE row; missing one row leaves
the table inconsistent.
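The update and deletion anomalies can be observed directly on the unnormalized Student table. This is a minimal sketch, assuming Python's built-in sqlite3 module; the replacement head of department 'Mr. Y' is a hypothetical value introduced only for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (
  rollno INTEGER PRIMARY KEY, name TEXT, branch TEXT, hod TEXT, office_tel TEXT
);
INSERT INTO Student VALUES
  (401, 'Akon', 'CSE', 'Mr. X', '53337'),
  (402, 'Bkon', 'CSE', 'Mr. X', '53337'),
  (403, 'Ckon', 'CSE', 'Mr. X', '53337'),
  (404, 'Dkon', 'CSE', 'Mr. X', '53337');
""")

# Update anomaly: the hod is changed in only one of the repeated rows,
# so the same branch now reports two different heads of department.
conn.execute("UPDATE Student SET hod = 'Mr. Y' WHERE rollno = 401")
hods = [row[0] for row in conn.execute(
    "SELECT DISTINCT hod FROM Student WHERE branch = 'CSE' ORDER BY hod"
)]

# Deletion anomaly: removing every student also removes the only
# record of the branch, its hod, and its office telephone number.
conn.execute("DELETE FROM Student")
branches_left = conn.execute(
    "SELECT COUNT(DISTINCT branch) FROM Student"
).fetchone()[0]

print(hods)           # ['Mr. X', 'Mr. Y']
print(branches_left)  # 0
```

Storing branch details in a separate table keyed by branch would keep each fact in one place, which is the motivation for the normal forms discussed later in this unit.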

Functional dependencies:
In relational database management, functional dependency is a concept that specifies the
relationship between two sets of attributes where one attribute determines the value of
another attribute. It is denoted as X → Y, where the attribute set on the left side of the arrow,
X is called Determinant, and Y is called the Dependent.

A functional dependency (FD) is a relationship between two attributes, typically between the
PK and other non-key attributes within a table. For any relation R, attribute Y is functionally
dependent on attribute X (usually the PK) if every valid value of X uniquely determines the
value of Y. This relationship is written as:
X → Y
The left side of the above FD is called the determinant, and the right side is
the dependent. Here are a few examples.
In the first example, below, SIN determines Name, Address and Birthdate. Given SIN, we can
determine any of the other attributes within the table.

SIN → Name, Address, Birthdate

For the second example, SIN and Course together determine the date completed
(DateCompleted). This shows that the determinant can also be a composite PK.

SIN, Course → DateCompleted

What is Functional Dependency?


A functional dependency occurs when one attribute uniquely determines another attribute
within a relation. It is a constraint that describes how attributes in a table relate to each other.
If attribute A functionally determines attribute B, we write this as A → B.
Functional dependencies are used to mathematically express relations among database
entities and are very important to understanding advanced concepts in Relational Database
Systems.
Example: Consider a table with table name Students:
StudentID Name Department
101 Alice Computer
102 Bob Electrical
103 Alice Computer
Here, StudentID uniquely determines the Name and Department. This can be represented as:
• StudentID → Name
• StudentID → Department
Functional dependency is commonly used in database architecture and normalisation and is
vital for ensuring data consistency.
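Whether a functional dependency is consistent with a given set of rows can be checked mechanically: two rows that agree on the determinant must agree on the dependent. The sketch below is one way to write that check in Python; the function name fd_holds is a hypothetical helper, not a standard library call. Note that sample rows can only refute a dependency. They cannot prove that it holds for every valid state of the table.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # Remember the first dependent value seen for each determinant value;
        # a different value later on violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


students = [
    {"StudentID": 101, "Name": "Alice", "Department": "Computer"},
    {"StudentID": 102, "Name": "Bob", "Department": "Electrical"},
    {"StudentID": 103, "Name": "Alice", "Department": "Computer"},
]

print(fd_holds(students, ["StudentID"], ["Name", "Department"]))  # True
print(fd_holds(students, ["Name"], ["StudentID"]))                # False
```

The second call fails because the name Alice appears with two different StudentID values, so Name does not determine StudentID.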
How to Denote a Functional Dependency in DBMS?
A functional dependency is denoted by an arrow “→”. The functional dependency
of B on A is represented by A → B.
Consider a relation with four attributes A, B, C and D,
R (ABCD)
1. A → BCD
2. B → CD
• For the first functional dependency A → BCD, attributes B, C and D are functionally
dependent on attribute A.
• The functional dependency B → CD has two attributes, C and D, functionally depending
upon attribute B.
Sometimes everything on the left side of a functional dependency is referred to
as the determinant set, while everything on the right side is referred to as the dependent
attributes.
• A functional dependency can also be represented as a diagram: the origin of each arrow
is the determinant set, and the arrow points to the dependent attribute.

Armstrong’s Axioms/Properties of Functional Dependency in DBMS

Axioms
Armstrong's Axioms refer to a set of inference rules, introduced by William W. Armstrong,
that are used to test the logical implication of functional dependencies. Given a set of
functional dependencies F, the closure of F (denoted as F+) is the set of all functional
dependencies logically implied by F. Armstrong's Axioms, when applied repeatedly, help
generate the closure of functional dependencies.
These axioms are fundamental in determining functional dependencies in databases and are
used to derive conclusions about the relationships between attributes.


Axioms

1. Axiom of Reflexivity
The Axiom of Reflexivity states that a set of attributes functionally determines each of its
subsets and, in particular, itself.
If A is a set of attributes and B is a subset of A, then the functional dependency A → B holds
true.
Example: In a student database, if we have an attribute 'Student_ID,' it is trivially true that
'Student_ID' determines 'Student_ID.'
Likewise, { Employee_Id, Name } → Name is valid.

2. Axiom of Augmentation
The Axiom of Augmentation tells us that if a functional dependency holds between two sets
of attributes, adding the same attributes to both sides gives a dependency that also holds.
If a functional dependency A → B holds true, then appending the same set of attributes
to both sides of the dependency does not affect it. It remains true.
• For example, if X → Y holds true then ZX → ZY also holds true.
• For example, if { Employee_Id, Name } → { Name } holds true then { Employee_Id,
Name, Age } → { Name, Age } also holds true.
Example:
If 'Student_ID' determines 'Student_Name,' then it also implies that 'Student_ID,
Course_Code' determines 'Student_Name, Course_Code.'

3. Axiom of Transitivity
The Axiom of Transitivity states that if we have two dependencies, where one attribute set
determines a second set and the second set determines a third set, then we can infer that the
first set determines the third set.
If two functional dependencies X → Y and Y → Z hold true, then X → Z also holds true
by the rule of Transitivity.
• For example, if { Employee_Id } → { Name } holds true and { Name } → { Department }
holds true, then { Employee_Id } → { Department } also holds true.
Example:
If 'Student_ID' determines 'Course_Code' and 'Course_Code' determines 'Course_Name,' then
'Student_ID' determines 'Course_Name.'


Example:
Let’s assume the following functional dependencies:
{A} → {B}
{B} → {C}
{A, C} → {D}
1. Reflexivity: Since any set of attributes determines its subset, we can immediately infer the
following:
• {A} → {A} (A set always determines itself).
• {B} → {B}.
• {A, C} → {A}.
2. Augmentation: If we know that {A} → {B}, we can add the same attribute (or set of
attributes) to both sides:
• From {A} → {B}, we can augment both sides with {C}: {A, C} → {B, C}.
• From {B} → {C}, we can augment both sides with {A}: {A, B} → {A, C}.
3. Transitivity: If we know {A} → {B} and {B} → {C}, we can infer that:
• {A} → {C} (Using transitivity: {A} → {B} and {B} → {C}).
Although Armstrong's axioms are sound and complete, there are additional rules for
functional dependencies that are derived from them. These rules are introduced to simplify
operations and make the process easier.
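In practice the axioms are usually applied through the attribute closure: the set of all attributes determined by a given attribute set under F. The sketch below shows the standard fixed-point computation in Python. The function name closure and the encoding of each dependency as a pair of attribute strings are choices made for this illustration.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, the right side is too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The dependencies of the example above: A → B, B → C, AC → D.
fds = [("A", "B"), ("B", "C"), ("AC", "D")]

print(sorted(closure("A", fds)))  # ['A', 'B', 'C', 'D']
print(sorted(closure("B", fds)))  # ['B', 'C']
```

A dependency X → Y follows from F exactly when Y is contained in the closure of X. Here the closure of A contains every attribute, so A determines C (the transitivity result above) and also D.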

Secondary Rules
In addition to the primary axioms, several secondary rules can be derived from them:
1) Union
This rule states that if X determines Y and X determines Z then X must also determine Y
and Z together. In design terms, if two tables are separate and the PK is the same, you may
want to consider putting them together.

For example, if:
• SIN → EmpName
• SIN → SpouseName
You may want to join these two tables into one as follows:
SIN → EmpName, SpouseName
Some database administrators (DBA) might choose to keep these tables separated for a
couple of reasons. One, each table describes a different entity so the entities should be
kept apart. Two, if SpouseName is to be left NULL most of the time, there is no need to
include it in the same table as EmpName.

2) Decomposition
Decomposition is the reverse of the Union rule. This rule states that if X determines Y and
Z, then X determines Y and X determines Z separately. In design terms, if you have a table
that appears to contain two entities that are determined by the same PK, consider breaking
them up into two tables.

The secondary rules in summary:
• Union: If A→B holds and A→C holds, then A→BC holds.
If X→Y and X→Z then X→YZ.
• Composition: If A→B and X→Y hold, then AX→BY holds.
• Decomposition: If A→BC holds then A→B and A→C hold.
If X→YZ then X→Y and X→Z.
• Pseudo Transitivity: If A→B holds and BC→D holds, then AC→D holds.
If X→Y and YZ→W then XZ→W.
Example:
Let’s assume we have the following functional dependencies in a relation schema:
{A} → {B}
{A} → {C}
{X} → {Y}
{Y, Z} → {W}
Now, let's apply the Secondary Rules to derive new functional dependencies.
1. Union Rule: Since A → B and A → C hold, by the Union Rule we can infer:
• A → BC. This means that if A determines both B and C, it also determines their
combination, BC.
2. Composition Rule: Since A → B and X → Y hold, by the Composition Rule we can
infer:
• AX → BY
3. Decomposition Rule: Since A → BC holds (from step 1), by the Decomposition Rule we
can infer:
• A → B and A → C
4. Pseudo Transitivity Rule: Since X → Y and YZ → W hold, by the Pseudo
Transitivity Rule we can infer:
• XZ → W

Decomposition:
Decomposition in the context of database design refers to the process of breaking down a
single table into multiple tables in order to eliminate redundancy, reduce data anomalies,
and achieve normalization. Decomposition is typically done using rules defined by
the normal forms.
However, while decomposition can be helpful, it is not without challenges. Done
incorrectly, decomposition can lead to its own set of problems, such as loss of information.
A relation R is decomposed into a set of sub relations {R1, R2, ……, Rn}. A good
decomposition is lossless and preserves dependencies. When a relation is not in an
appropriate normal form, decomposition becomes necessary to address anomalies and
redundancy, ultimately enhancing the overall design quality and efficiency of the database.

Types of Decomposition
Decomposition is of two major types in DBMS:
• Lossless
• Lossy
1. Lossless Decomposition
A decomposition is said to be lossless when it is feasible to reconstruct the original
relation R using joins from the decomposed tables. It is the most preferred choice. This
way, the information will not be lost from the relation when we decompose it. A lossless
join reproduces the original relation exactly:

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R

where ⋈ is a natural join operator


Example: Consider the following relation R( A , B , C ):
A  B  C
1  2  1
2  5  3
3  3  3
R( A , B , C )
Consider this relation is decomposed into two sub relations R1( A , B ) and R2( B , C ).

The two sub relations are:

A  B
1  2
2  5
3  3
R1( A , B )

B  C
2  1
5  3
3  3
R2( B , C )

Now, let us check whether this decomposition is lossless or not.
For lossless decomposition, we must have:

R1 ⋈ R2 = R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get:

A  B  C
1  2  1
2  5  3
3  3  3

This is the same as the original relation R, so the decomposition is lossless.

2. Lossy Decomposition
• Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.
• This decomposition is called lossy join decomposition when the join of the sub relations
does not result in the same relation R that was decomposed.
• The natural join of the sub relations contains some extraneous (spurious) tuples.
For lossy join decomposition, we always have:

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R

where ⋈ is a natural join operator

Example: Consider the following relation R( A , B , C ):

A  B  C
1  2  1
2  5  3
3  3  3
R( A , B , C )

Consider this relation is decomposed into two sub relations as R1( A , C ) and R2( B , C ).

The two sub relations are:

A  C
1  1
2  3
3  3
R1( A , C )

B  C
2  1
5  3
3  3
R2( B , C )

Now, let us check whether this decomposition is lossy or not.
For lossy decomposition, we must have:

R1 ⋈ R2 ⊃ R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 we get:
A  B  C
1  2  1
2  5  3
2  3  3
3  5  3
3  3  3

Clearly, R1 ⋈ R2 ⊃ R.
This relation is not the same as the original relation R and contains two extraneous tuples,
(2, 3, 3) and (3, 5, 3).

Thus, we conclude that the above decomposition is lossy join decomposition.
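Both decompositions of R( A , B , C ) can be checked by projecting R onto the sub relations and joining them back. The sketch below does this with plain Python sets; the helper names project and natural_join are assumptions made for this illustration.

```python
def project(relation, attrs, keep):
    """Project a set of tuples onto the attributes in keep (duplicates vanish)."""
    positions = [attrs.index(attr) for attr in keep]
    return {tuple(row[i] for i in positions) for row in relation}


def natural_join(r1, attrs1, r2, attrs2):
    """Natural join of two relations; returns (result attributes, result tuples)."""
    common = [attr for attr in attrs1 if attr in attrs2]
    out_attrs = list(attrs1) + [attr for attr in attrs2 if attr not in attrs1]
    joined = set()
    for t1 in r1:
        row1 = dict(zip(attrs1, t1))
        for t2 in r2:
            row2 = dict(zip(attrs2, t2))
            # Keep the pair only if it agrees on every common attribute.
            if all(row1[attr] == row2[attr] for attr in common):
                merged = {**row1, **row2}
                joined.add(tuple(merged[attr] for attr in out_attrs))
    return out_attrs, joined


ABC = ["A", "B", "C"]
R = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}

# Decomposition 1: R1(A, B) and R2(B, C), joined on B.
attrs, joined = natural_join(project(R, ABC, ["A", "B"]), ["A", "B"],
                             project(R, ABC, ["B", "C"]), ["B", "C"])
lossless_result = project(joined, attrs, ABC)

# Decomposition 2: R1(A, C) and R2(B, C), joined on C.
attrs, joined = natural_join(project(R, ABC, ["A", "C"]), ["A", "C"],
                             project(R, ABC, ["B", "C"]), ["B", "C"])
lossy_result = project(joined, attrs, ABC)

print(lossless_result == R)      # True
print(sorted(lossy_result - R))  # [(2, 3, 3), (3, 5, 3)]
```

In the first decomposition every value of the common attribute B occurs once in R2, so each tuple rejoins only with its own partner. In the second, the value C = 3 is shared by two tuples, which produces the two spurious rows.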

Data Normalization

What is Normalization in DBMS?


Normalization is the process of structuring data in a database. It involves creating tables and
defining relationships between them based on rules that safeguard the data and enhance the
database's flexibility by reducing redundancy and preventing inconsistent dependencies.
Normalization in a DBMS (database management system) eliminates data redundancy and
enhances data integrity in the table. It also helps organize the data in the database. This multi-
step process sets the data into tabular form and removes duplicate data from the relational
tables.
Normalization organizes the columns and tables of a database so that their dependencies
are properly enforced by database integrity constraints. It is a systematic technique of
decomposing tables to eliminate data redundancy (repetition) and undesirable characteristics
like insertion, update, and deletion anomalies.

Normal Forms
There are four normal forms that are commonly used in relational databases:

1. 1NF: A relation is in 1NF if all its attributes have an atomic value.
2. 2NF: A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally
dependent on the candidate key.
3. 3NF: A relation is in 3NF if it is in 2NF and there is no transitive dependency.
4. BCNF: A relation is in BCNF if it is in 3NF and, for every functional dependency, the
LHS is a super key.


1) First Normal Form (1NF)


A relation is in 1NF if every attribute is a single-valued attribute or it does not contain any
multi-valued or composite attribute, i.e., every attribute is an atomic attribute. If there is a
composite or multi-valued attribute, it violates the 1NF. To solve this, we can create a new
row for each of the values of the multi-valued attribute to convert the table into the 1NF.
Let’s take an example of a relational table <EmployeeDetail> that contains the details of the
employees of the company.

<EmployeeDetail>
Employee Code  Employee Name  Employee Phone Number
101            John           98765623,998234123
101            John           89023467
102            Ryan           76213908
103            Stephanie      98132452
Here, the Employee Phone Number is a multi-valued attribute. So, this relation is not in 1NF.
To convert this table into 1NF, we make new rows with each Employee Phone Number as a
new row as shown below:
<EmployeeDetail>
Employee Code  Employee Name  Employee Phone Number
101            John           998234123
101            John           98765623
101            John           89023467
102            Ryan           76213908
103            Stephanie      98132452
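The conversion to 1NF amounts to emitting one row per phone number. A minimal Python sketch follows, assuming the multi-valued cell is stored as a comma-separated string; the function name to_1nf is a hypothetical helper.

```python
employee_detail = [
    (101, "John", "98765623,998234123"),
    (101, "John", "89023467"),
    (102, "Ryan", "76213908"),
    (103, "Stephanie", "98132452"),
]


def to_1nf(rows):
    """Emit one row per phone number so that every value is atomic."""
    flat = []
    for code, name, phones in rows:
        for phone in phones.split(","):
            flat.append((code, name, phone.strip()))
    return flat


for row in to_1nf(employee_detail):
    print(row)
# (101, 'John', '98765623')
# (101, 'John', '998234123')
# (101, 'John', '89023467')
# (102, 'Ryan', '76213908')
# (103, 'Stephanie', '98132452')
```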

2) Second Normal Form (2NF)


The normalization of 1NF relations to 2NF involves the elimination of partial dependencies.
A partial dependency exists when a non-prime attribute, i.e., an attribute that is not
part of any candidate key, depends on only a part of a candidate key rather than on the
whole key.
For a relational table to be in second normal form, it must satisfy the following rules:
1. The table must be in first normal form.
2. It must not contain any partial dependency, i.e., all non-prime attributes are fully
functionally dependent on the primary key.
If a partial dependency exists, we can divide the table to remove the partially dependent
attributes and move them to another table where they fit well.
Let us take an example of the following <EmployeeProjectDetail> table to understand what is
partial dependency and how to normalize the table to the second normal form:
<EmployeeProjectDetail>
Employee Code    Project ID    Employee Name    Project Name
101              P03           John             Project103
101              P01           John             Project101
102              P04           Ryan             Project104
103              P02           Stephanie        Project102
In the above table, the candidate key is {Employee Code, Project ID}, so the prime attributes
are Employee Code and Project ID. We have partial dependencies in this table because
Employee Name is determined by Employee Code alone and Project Name is determined by
Project ID alone. Thus, the above relational table violates the rule of 2NF.
The prime attributes in DBMS are those which are part of one or more candidate keys.
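
The partial dependencies described above can be checked against the sample rows with a short Python sketch. The helper `holds` and the dictionary keys are hypothetical names chosen for this illustration. Note that a check on sample data can only show that a functional dependency is not contradicted by those rows; it cannot prove that the dependency holds in general.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Sample rows of <EmployeeProjectDetail>; the key names are assumptions.
employee_project_detail = [
    {"code": 101, "project": "P03", "name": "John", "project_name": "Project103"},
    {"code": 101, "project": "P01", "name": "John", "project_name": "Project101"},
    {"code": 102, "project": "P04", "name": "Ryan", "project_name": "Project104"},
    {"code": 103, "project": "P02", "name": "Stephanie", "project_name": "Project102"},
]

# Each non-prime attribute depends on only one part of the composite key.
name_by_code = holds(employee_project_detail, ["code"], ["name"])
pname_by_project = holds(employee_project_detail, ["project"], ["project_name"])

# Employee Code alone does not determine Project ID (101 works on P03 and P01).
project_by_code = holds(employee_project_detail, ["code"], ["project"])

print(name_by_code, pname_by_project, project_by_code)
```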

To remove partial dependencies from this table and normalize it into second normal form, we
can decompose the <EmployeeProjectDetail> table into the following three tables:
<EmployeeDetail>
Employee Code    Employee Name
101              John
102              Ryan
103              Stephanie

<EmployeeProject>
Employee Code    Project ID
101              P03
101              P01
102              P04
103              P02

<ProjectDetail>
Project ID    Project Name
P03           Project103
P01           Project101
P04           Project104
P02           Project102
Thus, we’ve converted the <EmployeeProjectDetail> table into 2NF by decomposing it into
<EmployeeDetail>, <ProjectDetail> and <EmployeeProject> tables. As you can see, the
above tables satisfy both rules of 2NF: they are in 1NF, and every non-prime attribute is
fully dependent on the primary key.
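
As a rough check that nothing is lost by this decomposition, the three tables can be created in an in-memory SQLite database and joined back together. This is a sketch under stated assumptions: the column names (`employee_code`, `project_id`, and so on) and the choice of SQLite through Python's standard `sqlite3` module are made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDetail (
    employee_code INTEGER PRIMARY KEY,
    employee_name TEXT NOT NULL
);
CREATE TABLE ProjectDetail (
    project_id   TEXT PRIMARY KEY,
    project_name TEXT NOT NULL
);
CREATE TABLE EmployeeProject (
    employee_code INTEGER REFERENCES EmployeeDetail(employee_code),
    project_id    TEXT REFERENCES ProjectDetail(project_id),
    PRIMARY KEY (employee_code, project_id)
);
INSERT INTO EmployeeDetail VALUES (101, 'John'), (102, 'Ryan'), (103, 'Stephanie');
INSERT INTO ProjectDetail VALUES
    ('P01', 'Project101'), ('P02', 'Project102'),
    ('P03', 'Project103'), ('P04', 'Project104');
INSERT INTO EmployeeProject VALUES
    (101, 'P03'), (101, 'P01'), (102, 'P04'), (103, 'P02');
""")

# Two inner joins rebuild the rows of the original <EmployeeProjectDetail> table.
rebuilt = conn.execute("""
SELECT ep.employee_code, ep.project_id, e.employee_name, p.project_name
FROM EmployeeProject AS ep
INNER JOIN EmployeeDetail AS e ON e.employee_code = ep.employee_code
INNER JOIN ProjectDetail AS p ON p.project_id = ep.project_id
ORDER BY ep.employee_code, ep.project_id
""").fetchall()

for row in rebuilt:
    print(row)
```

Each employee name and project name is now stored once, yet the join still returns the four rows of the original table.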
Relations in 2NF are less redundant than relations in 1NF. However, the decomposed
relations may still suffer from one or more anomalies due to transitive dependencies.
We will remove the transitive dependencies in the Third Normal Form.

3) Third Normal Form (3NF)


The first condition for a table to be in the Third Normal Form is that it should be in the
Second Normal Form. The second condition is that there should be no transitive dependency
for non-prime attributes, which means that non-prime attributes (those not part of any
candidate key) should not depend on other non-prime attributes in the table. A transitive
dependency is a functional dependency A → C (A determines C) that holds only indirectly,
because A → B and B → C (where it is not the case that B → A).
The Third Normal Form reduces data duplication and helps maintain data integrity.
Example
To explain 3NF further, let's consider an example of a table that lists customer orders:
Order ID    Customer ID    Customer Name    Customer City    Order Date    Order Total
1           100            John Smith       New York         2022-01-01    100
2           101            Jane Doe         Los Angeles      2022-01-02    200
3           102            Bob Johnson      San Francisco    2022-01-03    300

In this example, the non-key columns "Customer Name" and "Customer City" are transitively
dependent on the primary key "Order ID". They depend on "Customer ID", which is itself a
non-key column, instead of depending directly on "Order ID". To bring this table to 3NF, we
can split it into two tables:
Table 1: Customers
Customer ID    Customer Name    Customer City
100            John Smith       New York
101            Jane Doe         Los Angeles
102            Bob Johnson      San Francisco

Table 2: Orders
Order ID    Customer ID    Order Date    Order Total
1           100            2022-01-01    100
2           101            2022-01-02    200
3           102            2022-01-03    300
Now, the "Customer City" column is no longer transitively dependent on the primary key of
the Orders table. It is stored in the Customers table, where it depends directly on that
table's primary key, "Customer ID". Both tables are therefore in 3NF.
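
The split into Customers and Orders can be sketched in plain Python. The variable names (`orders_flat`, `customers`, `orders`, `rejoined`) are assumptions made for this illustration; the data is the sample table from the example.

```python
# Flat rows: (Order ID, Customer ID, Customer Name, Customer City, Order Date, Order Total)
orders_flat = [
    (1, 100, "John Smith", "New York", "2022-01-01", 100),
    (2, 101, "Jane Doe", "Los Angeles", "2022-01-02", 200),
    (3, 102, "Bob Johnson", "San Francisco", "2022-01-03", 300),
]

# Customers keeps the attributes determined by Customer ID.
# A dict keyed on Customer ID stores each customer once, however many orders they place.
customers = {
    cust_id: (name, city)
    for _, cust_id, name, city, _, _ in orders_flat
}

# Orders keeps the attributes that depend directly on Order ID.
orders = [
    (order_id, cust_id, date, total)
    for order_id, cust_id, _, _, date, total in orders_flat
]

# Looking up each order's customer (a join on Customer ID) rebuilds the flat rows.
rejoined = [
    (order_id, cust_id, *customers[cust_id], date, total)
    for order_id, cust_id, date, total in orders
]

print(rejoined == orders_flat)
```

With this layout, a change to a customer's city is made in one place in `customers` instead of in every order row for that customer.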

4) Boyce-Codd Normal Form (BCNF)


BCNF is a stricter form of 3NF. The two can differ only when a table has more than one
candidate key. BCNF requires that the left-hand side of every non-trivial functional
dependency in a table is a super key. This means that no column, prime or non-prime, may
depend on a set of columns that is not a super key. BCNF removes the redundancy that 3NF
can still leave behind in such tables.

For a relational table to be in Boyce-Codd normal form, it must satisfy the following rules:
1. The table must be in the third normal form.
2. For every non-trivial functional dependency X -> Y, X is a super key of the table. That
means X cannot be a non-prime attribute if Y is a prime attribute.
A super key is a set of one or more attributes that can uniquely identify a row in a database
table.
Let us take an example of the following <EmployeeProjectLead> table to understand how to
normalize the table to the BCNF:
<EmployeeProjectLead>
Employee Code    Project ID    Project Leader
101              P03           Grey
101              P01           Christian
102              P04           Hudson
103              P02           Petro
The above table satisfies all the normal forms up to 3NF, but it violates the rules of BCNF.
The candidate key of the above table is {Employee Code, Project ID}. In the non-trivial
functional dependency Project Leader -> Project ID, Project ID is a prime attribute, which is
why 3NF is satisfied, but the left-hand side Project Leader is not a super key. This is not
allowed in BCNF.
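
The BCNF rule can be tested mechanically by computing attribute closures. The sketch below is one possible way to do it, not a definitive implementation: the function names `closure` and `bcnf_violations` are hypothetical, and the first functional dependency in `fds` is an assumption that simply restates that {Employee Code, Project ID} is a key.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left-hand side, we gain the right-hand side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List the non-trivial dependencies whose left-hand side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_super_key = closure(lhs, fds) == set(all_attrs)
        if not trivial and not is_super_key:
            violations.append((lhs, rhs))
    return violations


attrs = {"Employee Code", "Project ID", "Project Leader"}
fds = [
    # Assumed: the candidate key determines the remaining attribute.
    (("Employee Code", "Project ID"), ("Project Leader",)),
    # The dependency named in the text.
    (("Project Leader",), ("Project ID",)),
]

violations = bcnf_violations(attrs, fds)
print(violations)
```

The closure of {Project Leader} contains only Project Leader and Project ID, so it does not cover the whole table, and the dependency Project Leader -> Project ID is reported as a BCNF violation.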
To convert the given table into BCNF, we decompose it into two tables:
<EmployeeProject>
Employee Code    Project ID
101              P03
101              P01
102              P04
103              P02

<ProjectLead>
Project Leader    Project ID
Grey              P03
Christian         P01
Hudson            P04
Petro             P02

Thus, we’ve converted the <EmployeeProjectLead> table into BCNF by decomposing it into
<EmployeeProject> and <ProjectLead> tables.

Advantages of Normalization
• Normalization eliminates data redundancy and ensures that each piece of data is stored in
  only one place, reducing the risk of data inconsistency and making it easier to maintain
  data accuracy.
• By breaking down data into smaller, more specific tables, normalization helps ensure that
  each table stores only relevant data, which improves the overall data integrity of the
  database.
• Normalization simplifies the process of updating data, as it only needs to be changed in
  one place rather than in multiple places throughout the database.
• Normalization enables users to query the database using a variety of different criteria, as
  the data is organized into smaller, more specific tables that can be joined together as
  needed.
• Normalization can help ensure that data is consistent across different applications that use
  the same database, making it easier to integrate different applications and ensuring that all
  users have access to accurate and consistent data.

Disadvantages of Normalization
• Normalization can result in increased performance overhead due to the need for
  additional join operations and the potential for slower query execution times.
• Normalization can result in the loss of data context, as data may be split across multiple
  tables and require additional joins to retrieve.
• Proper implementation of normalization requires expert knowledge of database design
  and the normalization process.
• Normalization can increase the complexity of a database design, especially if the data
  model is not well understood or if the normalization process is not carried out correctly.
