DBM
Unit - 1
Introduction to DBMS
Q1) Introduce DBMS?
A1) Database
A database is a collection of interrelated data organised so that
it can be efficiently retrieved, inserted, and removed. The data
is organised through tables, schemas, views, reports, etc.
Using a database, you can conveniently retrieve, add, and delete
information.
Database management system (DBMS)
A DBMS is software that enables database creation,
definition and manipulation, allowing users to easily store,
process and analyse data.
DBMS provides one with an interface or a tool to conduct different
operations, such as building databases, storing data in them,
updating data, creating database tables, and much more.
The DBMS also provides databases with privacy and security. In
the case of multiple users, it also ensures data consistency.
Some examples of DBMS:
● MySQL
● Oracle
● SQL Server
● IBM DB2
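As a minimal sketch of the operations a DBMS interface provides (building a database, storing data, updating it, querying it), the example below uses Python's built-in sqlite3 module, which embeds the SQLite relational engine; the table and column names are illustrative.

```python
import sqlite3

# An in-memory SQLite database: an embedded relational DBMS bundled with Python.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Build a table, store data in it, then update it -- the basic
# operations a DBMS interface exposes.
cur.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO student VALUES (1, 'Ram')")
cur.execute("UPDATE student SET name = 'Shyam' WHERE id = 1")
conn.commit()

cur.execute("SELECT name FROM student WHERE id = 1")
print(cur.fetchone()[0])  # -> Shyam
```

A real deployment would connect to a server-based DBMS such as MySQL or Oracle instead, but the create/insert/update/select cycle is the same.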
Advantages of DBMS
● Controls database redundancy: It can control data
redundancy because it stores all the data in one single database
file, and the recorded data is placed in that database.
● Data sharing: In DBMS, the authorized users of an
organization can share the data among multiple users.
● Easy maintenance: It is easily maintainable due to the
centralized nature of the database system.
● Reduced time: It reduces development time and maintenance
needs.
● Backup: It provides backup and recovery subsystems, which
create automatic backups of data and restore the data after
hardware or software failures if required.
● Multiple user interfaces: It provides different types of user
interfaces, such as graphical user interfaces and application
program interfaces.
Disadvantages of DBMS
● Cost of Hardware and Software: It requires a high-speed
processor and a large amount of memory to run DBMS software.
● Size: It occupies a large amount of disk space and memory to
run efficiently.
● Complexity: A database system creates additional complexity
and requirements.
● Higher impact of failure: A failure has a high impact on the
database because most organizations store all their data in a
single database; if that database is damaged by a power failure
or corruption, the data may be lost forever.
Q2) Write the purpose of database system?
A2) It is a set of tools that allows a user to construct and manage
databases. In other words, it is general-purpose software that
enables users to define, construct, and manipulate databases for
a variety of applications.
Database systems are made to handle massive amounts of data.
Data management entails both the creation of structures for
storing data and the provision of methods for manipulating data.
Furthermore, despite system crashes or efforts at illegal access,
the database system must preserve the security of the
information stored. If data is to be shared across multiple users,
the system must avoid any unexpected outcomes.
Fig: Process of transforming data
The figure above depicts the process of
transforming data into information, knowledge,
and action in a database management system.
Database applications were originally built on top of
the file system.
The goal of a database management system
(DBMS) is to transform:
1. Data into information.
2. Information into knowledge.
3. Knowledge into action.
Uses of DBMS
The main uses of DBMS are as follows −
● Data independence and efficient access of data.
● Application Development time reduces.
● Security and data integrity.
● Uniform data administration.
● Concurrent access and recovery from crashes.
Q3) Write the characteristics of DBMS?
A3) Characteristics of DBMS
● It stores and manages information in a digital repository
hosted on a server.
● It can provide a logical and transparent picture of the data
manipulation process.
● Automatic backup and recovery mechanisms are included in
the DBMS.
● It has ACID qualities, which keep data healthy in the event of
a failure.
● It has the ability to simplify complex data relationships.
● It's utilized to help with data manipulation and processing.
● It is used to ensure data security.
● It can examine the database from several perspectives
depending on the user's needs.
Q4) Write the application of database system?
A4) Data is a collection of linked facts and statistics that may be
processed to provide information, and a database is a collection
of related data.
The majority of data is made up of observable facts. Data assists
in the creation of fact-based content. For example, if we have
data on all students' grades, we can draw conclusions about top
performers and average grades.
A database management system maintains data in such a way
that retrieving, manipulating, and producing information becomes
easy. The following are some of the most notable DBMS
properties and applications.
● ACID Properties - A DBMS adheres to the concepts of
Atomicity, Consistency, Isolation, and Durability (normally
shortened to ACID). These properties are applied when data in a
database is manipulated through transactions. In
multi-transaction situations and in the event of failure, the ACID
properties help keep the database in a consistent state.
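As a small illustration of atomicity (one of the ACID properties), the sketch below uses Python's sqlite3 module: a simulated failure in the middle of a funds transfer causes the whole transaction to roll back, so no partial update survives. The account table and amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()

# Atomicity: the transfer either fully commits or fully rolls back.
try:
    with conn:  # opens a transaction; rolls back automatically on exception
        conn.execute("UPDATE account SET balance = balance - 70 "
                     "WHERE name = 'A'")
        # Simulated crash before the matching credit to B:
        raise RuntimeError("failure mid-transaction")
except RuntimeError:
    pass

# The debit was rolled back, so A's balance is unchanged.
print(conn.execute("SELECT balance FROM account WHERE name = 'A'")
      .fetchone()[0])  # -> 100
```

Because the debit and the (never-executed) credit belong to one transaction, the database is never left in a half-transferred state.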
● Multiuser and Concurrent Access - The DBMS enables a
multi-user environment, allowing multiple users to access and
manipulate data at the same time. Although there are limitations
on transactions when many users attempt to handle the same
data item, the users are never aware of them.
● Multiple views - For different users, DBMS provides multiple
perspectives. A user in the Sales department will see the
database in a different way than someone in the Production
department. This feature allows users to get a focused view of the
database based on their specific needs.
● Security - Multiple views, for example, provide some security
by preventing users from accessing data belonging to other users
or departments. When entering data into a database and
retrieving it later, DBMS provides mechanisms to set limitations.
Multiple users can have distinct perspectives with various
functionalities thanks to DBMS's many different levels of security
features. A user in the Sales department, for example, cannot see
data from the Purchase department. It can also be controlled how
much data from the Sales department is displayed to the user.
Because a DBMS does not expose its data as ordinary files the
way a traditional file system does, it is much harder for an
attacker to read or tamper with the stored data directly.
Q5) Write about data abstraction?
A5) Data Abstraction
● A major objective of database systems is to provide users
with an abstract view of the data.
● The system hides details of how the data is stored and
maintained.
● Database systems have to provide an efficient mechanism for
data retrieval.
● Efficiency leads to the design of complex data structures for
the representation of data in databases.
● This complexity must be hidden from users.
● This can be done by providing several levels of abstractions.
● In the Three Schema Architecture, three levels of abstraction
have been created.
Fig: Data abstraction
These are:
● Logical level: This is the middle level of the 3-level data
abstraction architecture. It describes what data is stored in the
database.
● View level: This is the highest level of data abstraction. It
describes the interaction between users and the database system.
● Physical level: This is the lowest level of data abstraction. It
describes how the data is actually stored in the database. At this
level, the complex data-structure details are visible.
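One way the view level hides the logical level is through SQL views. The hedged sketch below (Python's sqlite3, with a hypothetical employee table) defines a view that exposes only some columns, giving users a restricted window onto the underlying schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: the full schema, including the salary column.
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Ram', 50000)")

# View level: a restricted window that hides the salary column.
conn.execute("CREATE VIEW emp_public AS SELECT id, name FROM employee")

cols = [d[0] for d in conn.execute("SELECT * FROM emp_public").description]
print(cols)  # -> ['id', 'name']
```

A user querying emp_public never sees salary, even though it exists at the logical level; the physical level (pages, indexes) is hidden from both.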
Q6) Explain Database System Structure?
A6) The architecture of a database management system
influences its design. When dealing with a large number of PCs,
web servers, database servers, and other networked elements,
the same client/server architecture is used.
A client/server architecture is made up of several PCs and a
workstation that are all linked by a network.
The design of a database management system is determined by
how users link to the database in order to complete their
requests.
Fig: Types of architecture
A database architecture can be single-tier or multi-tier.
Broadly, database architecture is divided into three categories:
1-tier architecture, 2-tier architecture and 3-tier architecture.
1-tier Architecture :
The database is directly accessible to the user in this architecture.
It means that the user can sit on the DBMS and use it directly.
Any modifications made here will be applied directly to the
database. It does not provide end users with a useful tool.
For the creation of local applications, a 1-tier architecture is used,
in which programmers can interact directly with the database for
fast responses.
Fig: 1-tier architecture
2-tier Architecture :
The 2-tier architecture is similar to the basic
client-server architecture. Client-side applications can interact
directly with the database on the server side in a two-tier
architecture. APIs such as ODBC and JDBC are used for this
interaction.
The client-side runs the user interfaces and application programs.
The server side is in charge of providing features such as query
processing and transaction management. The client-side
application creates a link with the server side in order to
communicate with the DBMS.
Fig: 2-tier architecture
3-tier Architecture :
Between the client and the server is another layer in the 3-tier
architecture. The client cannot communicate directly with the
server in this architecture. The client-side program communicates
with an application server, which in turn communicates with the
database system.
The end user has no knowledge of the database's existence
beyond the application server, and the database has no
knowledge of any user beyond the application server.
The 3-tier architecture is used in the case of large web
applications.
Fig: 3-tier architecture
Q7) Describe relational model?
A7) The primary data model is the Relational Data Model, which
is commonly used for data storage and processing around the
world. This model is simple and has all the features and
functionality needed to process data efficiently.
The relational model can be interpreted as a table with rows and
columns. Each row is called a tuple, and each column has a name,
called an attribute.
Structure
A relational database is made up of several tables.
– Each table is given its own name.
– There are several rows in each table.
– Each row is a collection of data values that are, by definition,
related to one another in some way; these values correspond to
the table's attributes or columns.
– Each table attribute has a set of permitted values for that
attribute; this set of permitted values is the attribute's domain.
Basic concept:
● Table: Relationships are saved in the format of tables in a
relational data model. The relationship between entities is stored
in this format. A table includes rows and columns, where rows
represent information, and attributes are represented by
columns.
● Tuple: A tuple is called a single row of a table, which contains
a single record for that relationship.
Attributes and Domains
● Domain: It includes a set of atomic values that can be
adopted by an attribute.
● Attribute: In a specific table, it includes the name of a column.
Every attribute Ai must have a domain, dom(Ai).
Relations
● Relational instance: A relational instance is a finite set of
tuples in the relational database system. Relation instances do
not contain duplicate tuples.
● Relational schema: A relational schema consists of the name
of the relation and the names of all its columns or attributes.
● Relational key: Each row has one or more attributes, known
as the relational key, which can uniquely identify the row in the
relation.
Example: STUDENT Relation

NAME     ROLL_NO   PHONE_NO     ADDRESS     AGE
Ram      14795     7305758992   Noida       24
Shyam    12839     9026288936   Delhi       35
Laxman   33289     8583287182   Gurugram    20
Mahesh   27857     7086819134   Ghaziabad   27
Ganesh   17282     9028913988   Delhi       40

● In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS,
and AGE are the attributes.
● The instance of schema STUDENT has 5 tuples.
● t3 = <Laxman, 33289, 8583287182, Gurugram, 20>
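The STUDENT relation above can be built and queried directly with Python's sqlite3 module; the sketch below loads a few of the rows and fetches the tuple t3. The schema mirrors the example, but the code itself is only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE student (
    name TEXT, roll_no INTEGER, phone_no TEXT, address TEXT, age INTEGER)""")
rows = [
    ("Ram", 14795, "7305758992", "Noida", 24),
    ("Shyam", 12839, "9026288936", "Delhi", 35),
    ("Laxman", 33289, "8583287182", "Gurugram", 20),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", rows)

# Each fetched row is one tuple of the STUDENT relation.
t3 = conn.execute("SELECT * FROM student WHERE roll_no = 33289").fetchone()
print(t3)  # -> ('Laxman', 33289, '8583287182', 'Gurugram', 20)
```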
Q8) Describe relational algebra operators ?
A8) Relational Algebra is procedural query language. It takes
Relation as input and generates relation as output. Relational
algebra mainly provides theoretical foundation for relational
databases and SQL.
Basic Operations which can be performed using relational algebra
are:
1. Projection
2. Selection
3. Join
4. Union
5. Set Difference
6. Intersection
7. Cartesian product
8. Rename
Consider the following relation R(A, B, C):

A   B   C
1   1   1
2   3   1
4   1   3
2   3   4
5   4   5
1. Projection (π):
This operation is used to select particular columns from the
relation.
π(A, B) R:
It will select the A and B columns from the relation R, producing:

A   B
1   1
2   3
4   1
5   4

The project operation automatically removes duplicates from the
resultant set.
Example
∏subject, author (Books)
Selects and projects columns named as subject and author from
the relation Books.
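Treating a relation as a Python set of tuples, projection is a one-line comprehension; because the result is a set, duplicates disappear automatically, just as in the example above. The helper name and column indices are illustrative.

```python
# A relation as a set of tuples; projection keeps the chosen columns
# and, because the result is a set, removes duplicates automatically.
R = {(1, 1, 1), (2, 3, 1), (4, 1, 3), (2, 3, 4), (5, 4, 5)}  # R(A, B, C)

def project(relation, *cols):
    """pi_{cols}(relation): keep only the named column positions."""
    return {tuple(row[c] for c in cols) for row in relation}

# pi_{A,B}(R): column A is index 0, column B is index 1.
print(sorted(project(R, 0, 1)))  # -> [(1, 1), (2, 3), (4, 1), (5, 4)]
```

Note that R has five tuples but the projection has only four: the duplicate (2, 3) collapses, exactly as the text describes.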
2. Selection (σ):
This operation is used to select particular tuples from the relation.
σ(C < 4) R:
It will select the tuples which have a value of C less than 4.
The select operation only picks out the required tuples from the
relation. To display those tuples on screen, the select operation
must be combined with the project operation.
π(σ(C < 4) R) will produce the result:

A   B   C
1   1   1
2   3   1
4   1   3
Example
σ sales > 50000 (Customers)
Output – Selects tuples from Customers where sales is greater
than 50000.
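Selection is the set-based counterpart of projection: keep whole tuples that satisfy a predicate. A minimal sketch on the same relation R, with an illustrative helper name:

```python
R = {(1, 1, 1), (2, 3, 1), (4, 1, 3), (2, 3, 4), (5, 4, 5)}  # R(A, B, C)

def select(relation, predicate):
    """sigma_{predicate}(relation): keep tuples satisfying the condition."""
    return {row for row in relation if predicate(row)}

# sigma_{C < 4}(R): column C is index 2.
print(sorted(select(R, lambda row: row[2] < 4)))
# -> [(1, 1, 1), (2, 3, 1), (4, 1, 3)]
```

Unlike projection, selection never changes the shape of the tuples, only which tuples survive.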
3. Join
A join is essentially a Cartesian product followed by a selection
criterion.
The join operation is denoted by ⋈.
The JOIN operation combines related tuples from two different
relations into single tuples.
Types of JOIN:
Various forms of join operation are:
Inner Joins:
● Theta join
● EQUI join
● Natural join
Outer join:
● Left Outer Join
● Right Outer Join
● Full Outer Join
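To make the "Cartesian product followed by selection" idea concrete, here is a hedged sketch of a natural join on hypothetical relations STUDENT(roll_no, name) and MARKS(roll_no, score), matching on the shared roll_no column:

```python
# Hypothetical relations: STUDENT(roll_no, name) and MARKS(roll_no, score).
student = {(14795, "Ram"), (12839, "Shyam")}
marks = {(14795, 88), (99999, 40)}

def natural_join(r, s):
    """Cartesian product of r and s, then keep pairs that agree on the
    shared first column, emitting the shared value only once."""
    return {(a, b, d) for (a, b) in r for (c, d) in s if a == c}

print(natural_join(student, marks))  # -> {(14795, 'Ram', 88)}
```

This is an inner (natural) join: Shyam has no marks and roll 99999 has no student, so neither appears; the outer joins listed above would keep such unmatched tuples, padded with nulls.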
4. Union
Union in relational algebra is the same as the union operation in
set theory; the only restriction is that both relations must have
the same set of attributes.
Syntax : table_name1 ∪ table_name2
For instance, if we have two tables, RegularClass and
ExtraClass, both with a Student column that stores the student
name, then
∏Student(RegularClass) ∪ ∏Student(ExtraClass)
will give us the names of students who attend the regular class
or the extra class (or both), with duplicates removed.
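With relations as Python sets, union is the built-in `|` operator; the student names below are hypothetical:

```python
# Two union-compatible relations (same single attribute: Student).
regular_class = {("Amit",), ("Ram",)}
extra_class = {("Ram",), ("Shyam",)}

# pi_Student(RegularClass) UNION pi_Student(ExtraClass):
# every student in either class, listed once.
print(sorted(regular_class | extra_class))
# -> [('Amit',), ('Ram',), ('Shyam',)]
```

Ram attends both classes but appears only once, since a relation is a set. Set difference (`-`) and intersection (`&`) work the same way on such sets.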
5. Set difference
Set difference in relational algebra is the same operation as set
difference in set theory, with the restriction that both relations
must have the same set of attributes.
Syntax: A - B
where A and B are relations.
For instance, if we want to find the names of students who attend
the regular class, but not the extra class, then we can use the
following procedure:
∏Student(RegularClass) - ∏Student(ExtraClass)
Example
∏ author (Books) − ∏ author (Articles)
Output − Provides the name of authors who have written books
but not articles.
6. Intersection
Intersection is denoted by the symbol ∩.
A ∩ B
defines a relation consisting of the set of all tuples that are in
both A and B. A and B must be union-compatible, however.
7. Cartesian product
This is used to combine information from two separate
relations (tables) into one and to fetch information from the
combined relation.
Syntax : A X B
For example, if we want to find the morning Regular Class and
Extra Class data, then we can use the following operation:
σtime = 'morning' (RegularClass X ExtraClass)
Both RegularClass and ExtraClass should have the attribute time
for the above query to operate.
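The Cartesian product pairs every tuple of one relation with every tuple of the other; a minimal sketch over two tiny hypothetical relations A(x) and B(y):

```python
from itertools import product

# Hypothetical relations A(x) and B(y); A X B pairs every tuple of A
# with every tuple of B, so |A X B| = |A| * |B|.
A = {(1,), (2,)}
B = {("a",), ("b",)}

cartesian = {a + b for a, b in product(A, B)}
print(sorted(cartesian))  # -> [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
```

A join is exactly this product followed by a selection, which is why unconstrained products are expensive: the result size is the product of the input sizes.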
8. Rename
Rename is a unary operation used to rename the attributes of a
relation.
ρ(a/b)R renames attribute 'b' of relation R to 'a'.
Syntax : ρ(RelationNew, RelationOld)
Q9) What is tuple relational calculus?
A9) Tuple Relational Calculus
● Tuple relational calculus is used to select tuples in a relation.
In TRC, the filtering variable ranges over the tuples of the
relation.
● The result of the expression may be one or more tuples.
Notation : {T | P (T)} or {T | Condition (T)}
Where,
T - resulting tuple
P (T) - condition used to fetch T.
Example 1:
{ T.name | Author(T) AND T.article = 'database' }
Output :
This query selects tuples from the AUTHOR relation. It returns
the 'name' of every author who has written a 'database'
article.
TRC (tuple relation calculus) can be quantified. We may use
existential (∃) and universal quantifiers (∀) in TRC.
Example 2:
{ R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
Output : This query generates the same result as the previous
one.
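A TRC expression of this shape reads naturally as a set comprehension: the variable ranges over tuples and the predicate filters them. The sketch below uses a hypothetical Authors relation of (name, article) pairs:

```python
# TRC query { T.name | Author(T) AND T.article = 'database' } read as a
# Python set comprehension: T ranges over tuples, the condition filters.
authors = {("Ann", "database"), ("Bob", "networks"), ("Eve", "database")}

result = {name for (name, article) in authors if article == "database"}
print(sorted(result))  # -> ['Ann', 'Eve']
```

The existential quantifier of the second example corresponds to the comprehension finding at least one matching tuple for each output value.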
Q10) Write the basic concept of entity relationship model?
A10) The Entity-Relationship model (ER model) is a high-level
conceptual data model. The data elements and relationships for a
given system are described using this model.
● It creates a database's conceptual architecture. It also
creates a very convenient and easy-to-design data view.
● The database structure is depicted as a diagram called an
entity-relationship diagram in ER modeling.
Let's say we're creating a database for a school. The student will
be an entity in this database, with attributes such as address,
name, id, age, and so on. The address can be another entity,
with attributes such as city, street name, pin code, and so on,
and there will be a relationship between the two.
Fig: Example of ER
Component of ER model -
The following components are composed
of an ER Diagram:
1. Entity
2. Attributes
3. Relationships
Fig: Component of ER model
Entity
An entity may be any object, place, person, or event about which
data is stored in the database. In an entity-relationship diagram,
an entity is represented by a rectangle.
Examples of an entity include a student, course, boss,
employee, patient, etc.
Fig: Entity
Entity type:
A list or a set of entities having certain common attributes is an
entity type. In a database, a name and a list of attributes define
each type of entity.
Entity set:
It is a set (or collection) of entities of the same kind that share
attributes or related properties.
For example, the group of people who are lecturers at a
university can be described as the entity set Lecturer.
Similarly, the entity set Student could represent the collection
of all students of the university.
Attribute
In an Entity-Relationship Model, an attribute defines an entity's
properties or characteristics. It is represented in the ER diagram
by an oval or ellipse shape. Each oval shape represents one
attribute and is directly linked to the person that is in the shape of
the rectangle.
For instance, the attributes defining the Employee form of entity
are employee id, employee name, gender, employee age, salary,
and mobile no.
Fig: Different attributes of employee
In the ER model, an attribute can fall into the following
categories:
1. Simple attribute:
A simple attribute is an attribute that contains an atomic value
and cannot be divided further. The gender and salary of a
worker, for instance, are simple attributes, each depicted by an
oval.
2. Key attribute:
A key attribute is an attribute that can uniquely identify an
entity in an entity set. It corresponds to a primary key in the ER
diagram. In an Entity Relationship diagram, the key attribute is
denoted by an oval with an underline. For example, for each
employee, the employee id would be unique.
3. Composite attribute:
An attribute that is a combination of two or more basic attributes
is called a composite attribute. It is defined by an ellipse in an
Entity-Relationship diagram, and that ellipse consists of other
ellipses. For example, an employee entity type's name attribute
consists of first name, second name, and last name.
4. Derived attribute:
A derived attribute is an attribute whose value can be derived
from other attributes. In an entity-relationship diagram, a
dashed oval shape is used to represent these attributes.
Employee age, for example, is a derived attribute, since it
changes over time and can be derived from the DOB attribute
(date of birth).
5. Multivalued attribute:
An attribute that can hold more than one value for a given
entity. For instance, an employee may have more than one
mobile number and email address.
Q11) Write short notes on relationship sets?
A11) Relationship Sets
In the Entity-Relation Model, a relationship is used to define the
relationship between two or more entities. In the ER diagram, it is
illustrated by a diamond shape.
For example, a student studies in a college, and an employee
works in a department. Here, the relationships are 'studies in'
and 'works in'.
Degree of Relationship
The number of entity sets that participate in a relationship is
called the degree of the relationship.
Degree of relationship can be categorized into the following
types:
1. Unary Relationship:
A relationship in which only one entity set participates is
referred to as a unary relationship. For instance, in a company,
an employee manages or supervises another employee.
Fig: Unary relationship set
2. Binary Relationship: When a relationship involves two entity
sets, it is considered a binary relationship.
Fig: binary relationship set
3. Ternary Relationship: When a relationship involves three
entity sets, it is called a ternary relationship.
Fig:
Ternary relationship set
4. n-ary Relationship:
If a relationship involves more than three entity sets, it is called
an n-ary relationship.
Q12) What is weak entity sets?
A12) Although an entity type should contain a key attribute that
uniquely identifies each entity in the entity collection, some entity
types do not have a key attribute. Weak Entity is the name given
to this type of entity.
Weak entity sets are those whose attributes are insufficient to
form a primary key, while strong entity sets are those that have
a primary key.
Because weak entities lack a primary key, they cannot be
identified on their own and must rely on another entity (known
as the owner entity). The weak entities have a total
participation constraint (existence dependency) in their
identifying relationship with the owner entity. Weak entity
types have partial keys: a set of attributes that can distinguish
the tuples of a weak entity.
The existence of a weak entity is contingent on the existence of
a strong entity. Unlike a strong entity, a weak entity does not
have a primary key, but it does have a partial discriminator key.
In the ER diagram, a weak entity is represented by a double
rectangle, the identifying relationship by a double diamond, and
partial key attributes by a dashed underline.
Fig: Weak entity set
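When mapped to tables, a weak entity's primary key combines the owner's key with its own partial key. The sketch below (Python's sqlite3, with hypothetical employee/dependant tables) shows the classic pattern: dependant is identified by (emp_id, dep_name), where emp_id is a foreign key to the owner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# Strong (owner) entity.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
# Weak entity: dependant has no key of its own; its primary key combines
# the owner's key (emp_id) with the partial key (dep_name).
conn.execute("""CREATE TABLE dependant (
    emp_id INTEGER REFERENCES employee(emp_id),
    dep_name TEXT,
    PRIMARY KEY (emp_id, dep_name))""")

conn.execute("INSERT INTO employee VALUES (1, 'Ram')")
conn.execute("INSERT INTO dependant VALUES (1, 'Asha')")
print(conn.execute("SELECT COUNT(*) FROM dependant").fetchone()[0])  # -> 1
```

The REFERENCES clause captures the existence dependency: a dependant row cannot exist without its owning employee.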
Q13) Describe mapping cardinality?
A13) Mapping cardinality
● A mapping constraint is a data constraint that expresses the
number of entities in one entity set to which entities of another
entity set can be related through a relationship set.
● It is most useful in describing binary relationship sets.
● There are four possible mapping cardinalities for a binary
relationship set R between entity sets A and B. They are as
follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)
One to one: Only one instance of an entity is mapped to one
instance of another entity. Consider the HOD of a department:
one department has just one HOD, so the relationship between
HOD and Department is 1:1.
Fig: One to one mapping
One to many: In a one-to-many relationship, one instance of an
entity is linked to many instances of another entity. For
example, one manager oversees several employees in his
department.
Fig: One to many mapping
Many to many: This is a relationship where multiple entity
instances are linked to multiple entity instances.
Fig: Many to many mapping
Q14) Explain keys?
A14) Keys are the entity's attributes, which define the entity's
record uniquely.
Several types of keys exist.
These are listed underneath :
● Composite key
A composite key consists of two attributes or more, but it must be
minimal.
● Candidate key
A candidate key is a key, simple or composite, that is unique
and minimal. It is unique because no two rows in a table may
have the same value at any time. It is minimal because every
column is necessary in order to achieve uniqueness.
● Super key
The Super Key is one or more of the entity's attributes that
uniquely define the database record.
● Primary key
The primary key is the candidate key that the database designer
chooses to be used as the identification mechanism for the
entire entity set. In a table, it must uniquely identify tuples and
must not be null.
In the ER model, the primary key is indicated by underlining the
attribute.
Fig: Shows different keys
● Alternate key
Alternate keys are all candidate keys not chosen as the primary
key.
● Foreign key
A foreign key (FK) is an attribute in a table that references the
primary key of another table, or it may be null. The foreign key
and the referenced primary key must have the same data type.
Fig: Foreign key
● Secondary key
A secondary key is an attribute (possibly composite) used
strictly for retrieval purposes, such as: phone and last name.
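The primary/foreign key pairing can be demonstrated with Python's sqlite3 module (SQLite only enforces foreign keys when the pragma is enabled); the dept/emp tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE emp (
    emp_id INTEGER PRIMARY KEY,                    -- primary key of emp
    dept_id INTEGER REFERENCES dept(dept_id))""")  # foreign key into dept

conn.execute("INSERT INTO dept VALUES (10, 'Sales')")
conn.execute("INSERT INTO emp VALUES (1, 10)")  # valid: dept 10 exists
try:
    conn.execute("INSERT INTO emp VALUES (2, 99)")  # no such department
except sqlite3.IntegrityError:
    print("foreign key violation rejected")
```

The DBMS, not the application, rejects the dangling reference, which is exactly the integrity guarantee a foreign key exists to provide.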
Q15) Describe E-R diagrams?
A15) E - R diagrams
● The logical structure of a database can be expressed by an E-
R diagram.
● In this model, the objects of interest are represented as
entities and their characteristics as attributes.
● Various entities are related to one another using
relationships.
● To make it easier for various stakeholders to understand, E-R
models are established to represent relationships in pictorial
form.
The Main Components of ER Diagram are:-
a) Rectangles - Entity Sets
b) Ellipses - Attributes
c) Diamond - Relation among entity sets
d) Lines - Connects attributes to entity sets and entity sets to
relationships.
Fig: E - R diagram
The above E-R diagram has
two entities: Teacher and
Department
Teacher entity has three attributes:
Teacher_id
Teacher_name
Teacher_Subject
Department entity has two attributes:-
Dept_id
Dept_name
The relationship 'Teaches in' connects the Teacher and
Department entities.
Relationships can be of the type:
1:1 → One to one
1:M → One to many
M:1 → Many to one
M:M → Many to many
Q16) What are design issues?
A16) We learned how to create an ER diagram in the earlier
portions of the data modeling. We also explored various
approaches to defining entity sets and their relationships. We also
learned how to use different design shapes to represent a
relationship, an entity, and its attributes. Users, on the other
hand, frequently misunderstand the concept of the elements and
the ER diagram's design process. As a result, the ER diagram has
a complex structure and some flaws that do not match the
characteristics of a real-world business model.
The following are the fundamental design challenges of an ER
database schema:
1. Use of Entity Set vs Attributes
The structure of the real-world enterprise being simulated, as well
as the semantics associated with its properties, determine the
utilization of an entity set or an attribute. Using the primary key
of one entity set as an attribute of another entity set is an error;
the connection should instead be expressed through a
relationship. In addition, the primary key attributes of the
participating entity sets are implicit in a relationship set, so they
should not be listed again as attributes of the relationship set.
2. Use of Entity Set vs. Relationship Sets
It is not always obvious whether an object is best expressed as
an entity set or as a relationship set. A relationship set should
be used to describe an action that occurs between entities. If
the object must be represented as a relationship set, it is best
kept separate from the entity sets it connects.
3. Use of Binary vs n-ary Relationship Sets
The relationships depicted in databases are often binary
relationships. Non-binary relationships, on the other hand, can be
represented by a number of binary relationships. We can, for
example, define and describe a ternary relationship called
'parent,' which can refer to a child, his father, and his mother.
Such a relationship can also be represented by two binary
relationships, mother and father, possibly linked to the same
child. As a result, a non-binary relationship can be represented
by a set of distinct binary relationships.
4. Placing Relationship Attributes
Cardinality ratios can help decide where relationship attributes
should be placed. For one-to-one or one-to-many relationship
sets, it is often preferable to associate the attributes with one of
the participating entity sets rather than with the relationship
set. The choice of whether to place a given attribute on the
relationship or on an entity should reflect the characteristics of
the real-world enterprise being modeled.
For many-to-many relationship sets, however, attributes of the
relationship must be associated with the relationship set itself,
since they cannot be attached to either participating entity
alone.
As a result, it necessitates a thorough understanding of each
component involved in designing and modeling an ER diagram.
The most basic prerequisite is to examine the real-world
enterprise and the relationships between different entities or
attributes.
Q17) Write short notes on extended ER diagrams?
A17) The Extended Entity-Relationship Model (EERM) is a more
abstract and high-level model that expands the E/R model to
include more forms of relationships and attributes, as well as to
articulate constraints more clearly. The EE/R model includes all of
the concepts found in the E/R model, as well as additional
concepts that cover more semantic information.
In addition to ER model concepts EE-R includes −
● Subclasses and Superclasses.
● Specialization and Generalization.
● Category or union type.
● Aggregation.
These concepts are used to create EE-R diagrams.
Q18) Convert E-R & EER diagram into tables?
A18) Using notations, the database can be represented, and
these notations can be reduced to a set of tables.
Each entity set or relationship set can be represented in tabular
form in the database.
The ER diagram is given below:
Fig: ER diagram
There are some points for
converting the ER diagram
to the table:
● Entity type becomes a table.
In the given ER diagram, LECTURE, STUDENT, SUBJECT and
COURSE form individual tables.
● Each single-valued attribute becomes a column of the table.
In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the
column of STUDENT table. Similarly, COURSE_NAME and
COURSE_ID form the column of COURSE table and so on.
● The key attribute of the entity type is represented by the
primary key.
In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID,
and LECTURE_ID are the key attributes of the entities.
● The multivalued attribute is represented by a separate table.
Hobby is a multivalued attribute of the STUDENT entity, so its
multiple values cannot be stored in a single column of the
STUDENT table. We therefore create a STUD_HOBBY table with
the columns STUDENT_ID and HOBBY, and form a composite key
from both columns.
● Composite attribute represented by components.
Student address is a composite attribute in the given ER diagram,
comprising CITY, PIN, DOOR#, Path, and STATE. Each component
becomes an individual column in the STUDENT table.
● Derived attributes are not stored in the table.
In the STUDENT table, AGE is a derived attribute. It can be
determined at any time by computing the difference between the
current date and the date of birth.
You can convert the ER diagram to tables and columns using
these rules and assign the mapping between the tables.
The following is the table structure for the given ER diagram:
Fig: Table structure
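The conversion rules above can be sketched in SQL, run here through Python's sqlite3 module. This is a minimal sketch only: the table and column names follow the example in the text, while the sample student and her data are purely illustrative.

```python
import sqlite3

# In-memory database; table and column names follow the ER example above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Entity type STUDENT becomes a table; single-valued attributes become
# columns, and the key attribute STUDENT_ID becomes the primary key.
cur.execute("""
    CREATE TABLE STUDENT (
        STUDENT_ID   INTEGER PRIMARY KEY,
        STUDENT_NAME TEXT,
        CITY         TEXT,   -- components of the composite attribute address
        PIN          TEXT,
        STATE        TEXT
    )
""")

# The multivalued attribute HOBBY goes into its own table; the pair
# (STUDENT_ID, HOBBY) forms a composite primary key.
cur.execute("""
    CREATE TABLE STUD_HOBBY (
        STUDENT_ID INTEGER,
        HOBBY      TEXT,
        PRIMARY KEY (STUDENT_ID, HOBBY)
    )
""")

# Illustrative data (not from the diagram): one student with two hobbies.
cur.execute("INSERT INTO STUDENT VALUES (1, 'Asha', 'Noida', '201301', 'UP')")
cur.executemany("INSERT INTO STUD_HOBBY VALUES (?, ?)",
                [(1, 'chess'), (1, 'music')])
conn.commit()

print(cur.execute(
    "SELECT HOBBY FROM STUD_HOBBY WHERE STUDENT_ID = 1 ORDER BY HOBBY"
).fetchall())
```

Storing each hobby in its own row is exactly why the multivalued attribute needs a separate table: a single HOBBY column in STUDENT could not hold both values.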
DBM
Unit - 2
Relational Database Design
Q1) Write the basic concept of relational database?
A1) The Relational Data Model is the primary data model,
commonly used for data storage and processing around the world.
The model is simple and has all the features and functionality
needed to process data efficiently.
The relational model represents data as tables with rows and
columns. Each row is called a tuple, and each column has a name,
called an attribute.
Table: In the relational data model, relations are saved in the
form of tables. A table includes rows and columns, where rows
represent records and columns represent attributes.
Tuple: A single row of a table is called a tuple; it contains a
single record of that relation.
Attributes and Domains
Domain: It includes a set of atomic values that can be adopted by
an attribute.
Attribute: In a specific table, it includes the name of a column.
Every attribute Ai must have a domain, dom(Ai).
Relational instance: A relational instance is a finite set of tuples
of a relation in the relational database system. A relation instance
contains no duplicate tuples.
Relational schema: A relational schema consists of the name of
the relation and the names of all its columns or attributes.
Relational key: A relational key is a set of one or more attributes
that can uniquely identify a row in the relation.
Example:
STUDENT Relation
NAME     ROLL_NO   PHONE_NO     ADDRESS     AGE
Ram      14795     7305758992   Noida       24
Shyam    12839     9026288936   Delhi       35
Laxman   33289     8583287182   Gurugram    20
Mahesh   27857     7086819134   Ghaziabad   27
Ganesh   17282     9028913988   Delhi       40
● In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS,
and AGE are the attributes.
● The instance of schema STUDENT has 5 tuples.
● t3 = <Laxman, 33289, 8583287182, Gurugram, 20>
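The relation, tuple and attribute terms above can be made concrete with a small sqlite3 sketch. The schema and the tuple t3 come from the example; the other rows are taken from the table above.

```python
import sqlite3

# The STUDENT relation from the example as an SQL table; each column is an
# attribute and each row is a tuple.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE STUDENT (
        NAME TEXT, ROLL_NO INTEGER PRIMARY KEY,
        PHONE_NO TEXT, ADDRESS TEXT, AGE INTEGER
    )
""")
rows = [
    ("Ram",    14795, "7305758992", "Noida",     24),
    ("Shyam",  12839, "9026288936", "Delhi",     35),
    ("Laxman", 33289, "8583287182", "Gurugram",  20),
    ("Mahesh", 27857, "7086819134", "Ghaziabad", 27),
]
cur.executemany("INSERT INTO STUDENT VALUES (?,?,?,?,?)", rows)

# Fetch a single tuple of the relation, e.g. t3 from the text:
t3 = cur.execute("SELECT * FROM STUDENT WHERE ROLL_NO = 33289").fetchone()
print(t3)
```

The relational key ROLL_NO is declared as the primary key, which is what lets a single tuple such as t3 be identified uniquely.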
Q2) Write codd’s rule?
A2) After his thorough study of the relational model of database
systems, Dr. Edgar F. Codd came up with twelve rules of his own
that, according to him, must be followed by a database in order to
be called a true relational database.
These rules can be applied to any database system that manages
stored data using only its relational capabilities. This foundation
rule serves as the basis for all the other rules.
1. Information Rule: The data contained in a database must
be the value of a table cell, whether it is user data or
metadata. It is important to store everything in a database
in a table format.
2. Guaranteed Access Rule: Each data element must be
accessible by means of the name of the table, its primary key,
and the name of the attribute whose value is required.
3. Systematic Treatment of NULL values: A systematic and
uniform treatment must be given to the NULL values in a
database. This is a very relevant rule since it is possible to
interpret a NULL as one of the following: information is missing,
information is not known or information is not applicable.
4. Active Online Catalog: The definition of the structure of the
whole database must be stored in an online catalogue, known as
a data dictionary, accessible by registered users. The same query
language can be used by users to access the catalogue that they
use to access the database itself.
5. Comprehensive Data Sublanguage Rule: A database should
be available in a language that is supported for the process of
description, manipulation and transaction management.
6. View Updating Rule: Various views that are generated for
different purposes should be automatically modified by the
framework.
7. High level insert, update and delete rule: High-level addition,
upgrading, and removal must be assisted by a database. This
must not be limited to a single row, which means that union,
intersection and minus operations must also be assisted in order
to generate data record sets.
8. Physical data independence: Changes to the physical storage
of data, such as how or where the data is stored, must not
require any change at the application level.
9. Logical data independence: Any alteration of a table's logical
or conceptual schema should not require modification at the level
of the application. Merging two tables into one, for example,
should not impact the application's access to the data; this rule is
difficult to achieve in practice.
10. Integrity Independence: Changed integrity restrictions at the
database level do not implement changes at the application level.
11. Distribution Independence: For end-users, the distribution of
data over different locations should not be noticeable.
12. Non-Subversion Rule: Low-level access to data should not be
able to circumvent integrity rules to alter the data.
Q3) Explain referential integrity?
A3) Domain constraints
● Domain constraints can be specified as a description of the
valid set of values for an attribute.
● The domain data type can be a string, character, integer, time,
date, currency, etc. The value of the attribute must come from its
corresponding domain.
Fig 1: example of domain constraints
Referential integrity
● Between two relations or tables, the referential integrity
constraints are defined and used to preserve the consistency
between the tuples in two relationships.
● If an attribute of the foreign key of relation R1 has the same
domain(s) as the primary key of relation R2, then the foreign key
of R1 is said to reference the primary key of relation R2.
● Foreign key values in a tuple of relation R1 can either take the
primary key value of some tuple of relation R2, or they can take
NULL values, but they cannot take any other value.
Example
Fig 2: Example of
referential integrity
Enterprise constraints
Enterprise constraints, often referred to as semantic constraints,
are additional rules that users or database administrators define,
and they may be based on several tables.
Some examples:
● There can be a maximum of 30 students for a class.
● A maximum of four classes per semester can be taught by an
instructor.
● An employee is unable to engage in more than five
programmes.
● An employee's compensation cannot exceed the employee's
manager's salary.
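A referential-integrity constraint of the kind described above can be sketched with a foreign key in sqlite3. The DEPT/EMP tables and their data are illustrative, not taken from the figure; note that SQLite only enforces foreign keys when the pragma is enabled.

```python
import sqlite3

# Minimal sketch of referential integrity: EMP.DEPT_ID must either match a
# DEPT.DEPT_ID value or be NULL; any other value is rejected.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only if enabled
cur = conn.cursor()

cur.execute("CREATE TABLE DEPT (DEPT_ID INTEGER PRIMARY KEY, DEPT_NAME TEXT)")
cur.execute("""
    CREATE TABLE EMP (
        EMP_ID  INTEGER PRIMARY KEY,
        DEPT_ID INTEGER REFERENCES DEPT(DEPT_ID)  -- foreign key into DEPT
    )
""")
cur.execute("INSERT INTO DEPT VALUES (10, 'Sales')")

cur.execute("INSERT INTO EMP VALUES (1, 10)")    # OK: 10 exists in DEPT
cur.execute("INSERT INTO EMP VALUES (2, NULL)")  # OK: foreign key may be NULL

error = None
try:
    cur.execute("INSERT INTO EMP VALUES (3, 99)")  # 99 has no DEPT row
except sqlite3.IntegrityError as e:
    error = e
    print("rejected:", e)
```

The third insert fails exactly because its foreign key value matches no primary key value in the referenced relation.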
Q4) Write the features of a good relation database?
A4) As we know, a database contains numerous relations, and
each relation must be identifiable separately; otherwise there
will be a lot of confusion. Below are the features that, if followed,
distinguish a relation in a database.
1. Each relation in the database must have a unique name that
distinguishes it from the other relations.
2. There can't be two attributes with the same name in a relation.
Each attribute must be given a unique name.
3. A relation must not contain duplicate tuples.
4. Each tuple must have precisely one data value for each
attribute. For example, you can see in the first table that we have
enrolled two pupils, Jhoson and Charles, for Roll No. 265; this
would not work. For each Roll No, we must have only one
student.
5. A relation's tuples do not have to be in any particular order
because the relation is not order-sensitive.
6. Similarly, the attributes of a relation do not have to be in any
particular order; the developer can specify how the attributes are
ordered.
Q5) Describe normalization?
A5) Normalization is often executed as a series of different forms.
Each normal form has its own properties. As normalization
proceeds, the relations become progressively more restricted in
format, and also less vulnerable to update anomalies. For the
relational data model, it is important to bring the relation only in
first normal form (1NF) that is critical in creating relations. All the
remaining forms are optional.
A relation R is said to be normalized if it does not create any
anomaly for the three basic operations: insert, delete and update.
Fig 3: Types of normal forms
1NF
A relation is in 1NF if every attribute has an atomic value.
2NF
A relation is in 2NF if it is in 1NF and all non-key attributes are
fully functionally dependent on the primary key.
3NF
A relation is in 3NF if it is in 2NF and there is no transitive
dependency.
4NF
A relation is in 4NF if it is in Boyce-Codd normal form and has no
multivalued dependencies.
5NF
A relation is in 5NF if it is in 4NF, does not contain any join
dependencies, and joining is lossless.
Q6) Write the objectives of normalization?
A6) Objectives of Normalization
● It is used to remove redundant data and anomalies from the
relational tables of the database.
● Normalization helps by analyzing the data stored in a table so
as to reduce redundancy and complexity.
● It divides the large database table into smaller tables and
connects them using relationships.
● It avoids duplicating data in a table and avoids repeating
groups.
● It decreases the probability of anomalies occurring in a
database.
Q7) Explain First Normal Form (1NF)?
A7) A relation r is in 1NF if and only if every tuple contains only
atomic attribute values, that is, exactly one value for each
attribute. As per the rule of first normal form, an attribute of a
table cannot hold multiple values; it should hold only atomic
values.
Example: suppose a company has created an employee table to
store the name, address and mobile number of its employees.
Emp_id   Emp_name   Emp_address   Emp_mobile
101      Rachel     Mumbai        9817312390
102      John       Pune          7812324252, 9890012244
103      Kim        Chennai       9878381212
104      Mary       Bangalore     9895000123, 7723455987
In the above table, two employees, John and Mary, have two
mobile numbers each, so the attribute emp_mobile is a
multivalued attribute. This table is not in 1NF, since the rule says,
“Each attribute of a table must have atomic values”, but the
attribute emp_mobile is not atomic as it contains multiple values.
So, it violates the rule of 1NF.
The following solution brings the employee table into 1NF.
Emp_id   Emp_name   Emp_address   Emp_mobile
101      Rachel     Mumbai        9817312390
102      John       Pune          7812324252
102      John       Pune          9890012244
103      Kim        Chennai       9878381212
104      Mary       Bangalore     9895000123
104      Mary       Bangalore     7723455987
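The 1NF transformation above, splitting each multivalued mobile entry into its own row, can be sketched in a few lines of Python. The data is the employee example from the text.

```python
# Sketch of bringing the employee table into 1NF: each (employee, mobile)
# pair becomes its own row, so every attribute value is atomic.
unnormalized = [
    (101, "Rachel", "Mumbai",    ["9817312390"]),
    (102, "John",   "Pune",      ["7812324252", "9890012244"]),
    (103, "Kim",    "Chennai",   ["9878381212"]),
    (104, "Mary",   "Bangalore", ["9895000123", "7723455987"]),
]

# Flatten: one output row per mobile number.
normalized = [
    (emp_id, name, address, mobile)
    for emp_id, name, address, mobiles in unnormalized
    for mobile in mobiles
]

for row in normalized:
    print(row)
```

The four original rows become six, and no column holds more than one value, which is precisely the 1NF condition.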
Q8) Write about decomposition using functional
dependencies?
A8) A Functional Dependency (FD) is a constraint in a database
management system (DBMS) that specifies the relationship of
one attribute to another. Functional dependencies help to
maintain the quality of data in the database, and they play a
critical role in distinguishing good database design from poor
design.
A functional dependency is denoted by an arrow "→". X → Y
means that Y is functionally dependent on X, i.e., X determines Y.
Rules of functional dependencies
The three most important rules for functional dependencies in a
database are:
● Reflexivity rule: If X is a set of attributes and Y is a subset of
X, then X → Y holds.
● Augmentation rule: If X → Y holds and Z is a set of attributes,
then XZ → YZ also holds. That is, adding attributes does not
change the basic dependency.
● Transitivity rule: If X → Y holds and Y → Z holds, then X → Z
also holds; this is very similar to the transitive rule in algebra.
X → Y is read as "X functionally determines Y".
Types of functional dependency
Fig 4: Types of functional dependency
1. Trivial functional dependency
● A → B is a trivial functional dependency if B is a subset of A.
● Dependencies such as A → A and B → B are also trivial.
Example
Consider a table with two columns Employee_Id and
Employee_Name.
{Employee_id, Employee_Name} → Employee_Id is a trivial
functional dependency as
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name →
Employee_Name are trivial dependencies too.
2. Non-trivial functional dependencies
● A → B is a non-trivial functional dependency if B is not a
subset of A.
● When A ∩ B is empty (A and B share no attributes), A → B is
called completely non-trivial.
Example:
ID → Name,
Name → DOB
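Whether a dependency like ID → DOB follows from the FDs above can be checked mechanically with an attribute-closure routine. This is a standard textbook algorithm, sketched here with single-letter attribute names standing in for ID, Name and DOB.

```python
# Attribute closure: given a set of FDs, closure(X) returns every attribute
# functionally determined by X. Repeatedly apply any FD whose left side is
# already contained in the result until nothing changes.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs from the example above: I=ID, N=Name, D=DOB.
fds = [("I", "N"), ("N", "D")]

# ID determines Name directly and DOB by transitivity.
print(sorted(closure("I", fds)))
```

Because the closure of ID contains DOB, the dependency ID → DOB follows from the given set by the transitivity rule.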
Q9) Write the algorithm for decomposition?
A9) By explicitly generating a schema for each dependency in the
canonical cover, the decomposition procedure for 3NF ensures
that dependencies are preserved. It assures that at least one
schema has a candidate key for the one being decomposed,
ensuring that the decomposition created is a lossless
decomposition.
Decomposition Algorithm
Let Fc be a canonical cover for F;
i = 0;
For each functional dependency α → β in Fc
  i = i + 1;
  Ri = α β;
If none of the schemas Rj, j = 1, 2, ..., i contains a candidate key for R
Then
  i = i + 1;
  Ri = any candidate key for R;
/* Optionally, remove the redundant relations */
Repeat
  If any schema Rj is contained in another schema Rk
  Then
    /* Delete Rj */
    Rj = Ri;
    i = i - 1;
Until no more Rj's can be deleted
Return (R1, R2, ..., Ri)
The supplied relation is R, and the given collection of functional
dependencies is F, for which Fc maintains the canonical cover.
The decomposed portions of the given relation R are R1, R2,..., Ri.
As a result, this technique preserves the dependency while also
generating a lossless decomposition of R.
A 3NF synthesis algorithm is another name for a 3NF algorithm.
It's called so because the regular form works with a dependency
set and adds one schema at a time, rather than repeatedly
dissecting the basic schema.
BCNF
It is important to check if the given relation is in Boyce-Codd
Normal Form before applying the BCNF decomposition technique
on it. If it is discovered that the supplied relation is not in BCNF
after the test, we can decompose it further to produce BCNF
relations.
The following situations necessitate determining whether the
supplied relation schema R follows the BCNF rule:
Case 1: To see whether a nontrivial dependency α → β violates
the BCNF rule, evaluate and compute α+, i.e., the attribute
closure of α, and check whether α+ contains all of the attributes
of the given relation R. If it does, α is a super key of relation R.
Case 2: It is not necessary to test all of the dependencies in F+ if
the given relation R is in BCNF. For the BCNF test, all that is
required is checking the dependencies in the given dependency
set F, because if no dependency in F violates BCNF, then none of
the dependencies in F+ will either.
Decomposition Algorithm
This algorithm is employed when the given relation R is not in
BCNF and must be decomposed into several relations R1, R2,...,
Rn. For each subset α of the attributes of a relation Ri, we must
verify that α+ (the attribute closure of α under F) either includes
all the attributes of Ri or includes no attribute of Ri − α.
Result = {R};
Done = false;
Compute F+;
While (not done) do
  If (there is a schema Ri in result that is not in BCNF)
  Then begin
    Let α → β be a nontrivial functional dependency that holds
    on Ri such that α → Ri is not in F+, and α ∩ β = ∅;
    Result = (result − Ri) ∪ (Ri − β) ∪ (α, β);
  End
  Else done = true;
This procedure is used to break down a given relation R into its
decomposed relations. The approach decomposes the relation R
using dependencies that violate BCNF. As a result, the algorithm
not only produces decomposed relations of R that are in BCNF,
but the decomposition is also lossless: no data is lost when the
relation R is decomposed into R1, R2, and so on.
The time the BCNF decomposition procedure takes to complete is
proportional to the size of the original relation schema R. As a
result, one disadvantage of this technique is that it may break
down the given relation R excessively, i.e., over-normalize it.
The decomposition methods for BCNF and 4NF are nearly
identical, with one exception. The fourth normal form is
concerned with multivalued dependencies, while BCNF is
concerned with functional dependencies. The multivalued
dependencies aid in reducing data repetition, which is difficult to
comprehend in terms of functional relationships.
Q10) Explain second normal forms?
A10) Second Normal Form (2NF)
A table is said to be in 2NF if both of the following conditions are
satisfied:
● Table is in 1 NF.
● No non-prime attribute is dependent on a proper subset of any
candidate key of the table. That is, a non-prime attribute should
be fully functionally dependent on the whole candidate key of the
table, not on part of the key.
An attribute that is not part of any candidate key is known as a
non-prime attribute.
Example:
Suppose a school wants to store data of teachers and the subjects
they teach. Since a teacher can teach more than one subject, the
table can have multiple rows for the same teacher.
Teacher_id   Subject   Teacher_age
111          DSF       28
111          DBMS      28
222          CNT       35
333          OOPL      38
333          FDS       38
For above table:
Candidate Keys: {Teacher_Id, Subject}
Non prime attribute: Teacher_Age
The table is in 1NF because each attribute has atomic values.
However, it is not in 2NF because the non-prime attribute
Teacher_Age is dependent on Teacher_Id alone, which is a proper
subset of the candidate key. This violates the rule for 2NF, which
says “no non-prime attribute is dependent on a proper subset of
any candidate key of the table”.
To bring the above table into 2NF we can break it into two tables
(Teacher_Details and Teacher_Subject) like this:
Teacher_Details table:
Teacher_id   Teacher_age
111          28
222          35
333          38
Teacher_Subject table:
Teacher_id   Subject
111          DSF
111          DBMS
222          CNT
333          OOPL
333          FDS
Now these two tables are in 2NF.
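That the split is lossless can be verified directly: a natural join of Teacher_Details and Teacher_Subject on Teacher_id reconstructs the original table. This is a sketch in sqlite3 using the data from the example.

```python
import sqlite3

# The two 2NF tables from the example; joining them back on Teacher_id
# should reproduce the original (Teacher_id, Subject, Teacher_age) rows.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Teacher_Details "
            "(Teacher_id INTEGER PRIMARY KEY, Teacher_age INTEGER)")
cur.execute("CREATE TABLE Teacher_Subject "
            "(Teacher_id INTEGER, Subject TEXT, "
            " PRIMARY KEY (Teacher_id, Subject))")
cur.executemany("INSERT INTO Teacher_Details VALUES (?,?)",
                [(111, 28), (222, 35), (333, 38)])
cur.executemany("INSERT INTO Teacher_Subject VALUES (?,?)",
                [(111, 'DSF'), (111, 'DBMS'), (222, 'CNT'),
                 (333, 'OOPL'), (333, 'FDS')])

# Natural join on the common key column.
rejoined = cur.execute("""
    SELECT s.Teacher_id, s.Subject, d.Teacher_age
    FROM Teacher_Subject s JOIN Teacher_Details d USING (Teacher_id)
    ORDER BY s.Teacher_id, s.Subject
""").fetchall()
for row in rejoined:
    print(row)
```

The join yields exactly the five rows of the original table, and the age for each teacher is now stored only once.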
Q11) Describe Third Normal Form (3NF)?
A11) A table design is said to be in 3NF if both the following
conditions hold:
● Table must be in 2NF.
● Transitive functional dependency from the relation must be
removed.
So it can be stated that, a table is in 3NF if it is in 2NF and for
each functional dependency
P->Q at least one of the following conditions hold:
● P is a super key of table
● Q is a prime attribute of table
An attribute that is a part of one of the candidate keys is known
as a prime attribute.
Transitive functional dependency:
A functional dependency is said to be transitive if it is indirectly
formed by two functional dependencies.
For example:
P->R is a transitive dependency if the following three functional
dependencies hold true:
1) P->Q and
2) Q->R
Example: suppose a company wants to store information about
employees. Then the table Employee_Details looks like this:
Emp_id   Emp_name   Manager_id   Mgr_Dept   Mgr_Name
E1       Hary       M1           IT         William
E2       John       M1           IT         William
E3       Nil        M2           SALES      Stephen
E4       Mery       M3           HR         Johnson
E5       Steve      M2           SALES      Stephen
Super keys: {emp_id}, {emp_id, emp_name}, {emp_id,
emp_name, Manager_id}
Candidate Key: {emp_id}
Non-prime attributes: All attributes except emp_id are non-prime
as they are not part of any candidate key.
Here, Mgr_Dept, Mgr_Name depend on Manager_id.
And, Manager_id is dependent on emp_id that makes non-prime
attributes (Mgr_Dept, Mgr_Name) transitively dependent on super
key (emp_id). This violates the rule of 3NF.
To bring this table in 3NF we have to break into two tables to
remove transitive dependency.
Employee_Details table:
Emp_id   Emp_name   Manager_id
E1       Hary       M1
E2       John       M1
E3       Nil        M2
E4       Mery       M3
E5       Steve      M2
Manager_Details table:
Manager_id   Mgr_Dept   Mgr_Name
M1           IT         William
M2           SALES      Stephen
M3           HR         Johnson
Q12) Explain Fourth Normal Form (4NF) and BCNF?
A12) A relation is in 4NF if it is in BCNF and has no multivalued
dependencies.
That is, relation R is in 4NF if and only if, whenever there exist
subsets A and B of the attributes of R such that the multivalued
dependency A →→ B is satisfied, all attributes of R are also
functionally dependent on A.
Multi-Valued Dependency
● The multivalued dependency X →→ Y holds in a relation R if,
whenever we have two tuples of R that agree in all the attributes
of X, we can swap their Y components and obtain two new tuples
that are also in R.
● Suppose a student can have more than one subject and more
than one activity.
Boyce Codd normal form (BCNF)
It is an advanced version of 3NF. BCNF is stricter than 3NF. A
table complies with BCNF if it is in 3NF and for every functional
dependency X->Y, X should be the super key of the table.
Example: Suppose there is a company wherein employees work in
more than one department. They store the data like this:
Emp_id   Emp_nationality   Emp_dept            Dept_type   Dept_no_of_emp
101      Indian            Planning            D01         100
101      Indian            Accounting          D01         50
102      Japanese          Technical support   D14         300
102      Japanese          Sales               D14         100
Functional dependencies in the table above:
Emp_id ->emp_nationality
Emp_dept -> {dept_type, dept_no_of_emp}
Candidate key: {emp_id, emp_dept}
The table is not in BCNF as neither emp_id nor emp_dept alone
are keys. To bring this table in BCNF we can break this table in
three tables like:
Emp_nationality table:
Emp_id   Emp_nationality
101      Indian
102      Japanese
Emp_dept table:
Emp_dept            Dept_type   Dept_no_of_emp
Planning            D01         100
Accounting          D01         50
Technical support   D14         300
Sales               D14         100
Emp_dept_mapping table:
Emp_id   Emp_dept
101      Planning
101      Accounting
102      Technical support
102      Sales
Functional dependencies:
Emp_id -> emp_nationality
Emp_dept -> {dept_type, dept_no_of_emp}
Candidate keys:
For first table: emp_id
For second table: emp_dept
For third table: {emp_id, emp_dept}
This is now in BCNF as in both functional dependencies the left
side is a key.
Q13) Write the difference between 3NF and BCNF?
A13) Difference between 3NF and BCNF
1. 3NF: There should be no transitive dependency, that is, no
   non-prime attribute should be transitively dependent on the
   candidate key.
   BCNF: For any dependency A -> B that holds, A should be a
   super key of the relation.
2. 3NF: It is less strong than BCNF.
   BCNF: It is comparatively stronger than 3NF.
3. 3NF: The functional dependencies are already in 1NF and 2NF.
   BCNF: The functional dependencies are already in 1NF, 2NF
   and 3NF.
4. 3NF: The redundancy is high.
   BCNF: The redundancy is comparatively low.
5. 3NF: All functional dependencies are preserved.
   BCNF: All functional dependencies may or may not be
   preserved.
6. 3NF: It is comparatively easier to achieve.
   BCNF: It is difficult to achieve.
7. 3NF: Lossless decomposition can be achieved.
   BCNF: Lossless decomposition is hard to achieve.
Q14) Given a relation R( A, B, C, D) and Functional
Dependency set FD = { AB → CD, B → C }, determine
whether the given R is in 2NF? If not, convert it into 2 NF.
A14) Let's draw an arrow diagram on R using the FDs to find the
candidate key.
From the arrow diagram on R, we can see that the attributes AB
are not determined by any of the given FDs, hence AB will be an
integral component of the candidate key, i.e., no matter what the
candidate key is or how many there are, they will all include the
attributes AB.
Let's see how we can figure out the closure of AB.
AB+ = ABCD
Since the closure of AB contains all the attributes of R, hence AB
is the Candidate Key.
The definition of Candidate Key can be found here (Candidate Key
is a Super Key whose no proper subset is a Super key).
Because AB is a fundamental part of all keys, and we've
established that AB is a Candidate Key, any superset of AB will be
Super Key but not Candidate Key.
As a result, there will only be one candidate key AB.
Definition of 2NF: No non-prime attribute should be partially
dependent on Candidate Key
Because R contains four attributes, A, B, C, and D, and the
candidate key is AB, the prime attributes (parts of the candidate
key) are A and B, whereas the non-prime attributes are C and D.
a) FD: AB → CD satisfies the definition of 2NF: the non-prime
attributes (C and D) are fully dependent on the candidate key AB.
b) FD: B → C does not satisfy the definition of 2NF: the non-prime
attribute C is partially dependent on the candidate key AB, since
it depends on B, which is only part of the key.
Because of the FD B → C, the above table R(A, B, C, D) is not in 2NF.
Convert the table R(A, B, C, D) in 2NF:
Since FD: B → C, our table was not in 2NF, let's decompose the
table
R1(B, C)
Since the key is AB, and from FD AB → CD, we can create R2(A, B,
C, D) but this will again have a problem of partial dependency B
→ C, hence R2(A, B, D).
Finally, the decomposed table which is in 2NF
a) R1( B, C)
b) R2(A, B, D)
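The two closure computations used in this working, AB+ covering all of R and B alone already determining C, can be checked with the standard attribute-closure routine, sketched here for this FD set.

```python
# Verifying the working above by attribute closure: for R(A, B, C, D) with
# FDs AB -> CD and B -> C, AB+ covers all of R (so AB is the candidate key),
# while B alone already determines C (the partial dependency).
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("AB", "CD"), ("B", "C")]

print(sorted(closure("AB", fds)))  # AB+ = {A, B, C, D}: AB is a candidate key
print(sorted(closure("B", fds)))   # B+  = {B, C}: C depends on part of the key
```

Since B+ contains C but B is only a proper subset of the candidate key AB, the partial dependency that breaks 2NF is confirmed mechanically.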
Q15) Write the advantages of relational databases?
A15) The main advantages of relational databases are that they
enable users to easily categorize and store data that can later be
queried and filtered to extract specific information for reports.
Relational databases are also easy to extend and aren't reliant on
the physical organization. After the original database creation, a
new data category can be added without all existing applications
being modified.
● Accurate − Data is stored just once, which eliminates data
duplication.
● Flexible − Complex queries are easy for users to carry out.
● Collaborative − Multiple users can access the same
database.
● Trusted − Relational database models are mature and well-
understood.
● Secure − Data in tables within relational database
management systems (RDBMS) can be limited to allow access by
only particular users.
Q16) Given a relation R( P, Q, R, S, T, U, V, W ) and
Functional Dependency set FD = { PQ → R, P → ST, Q → U,
and U → VW }, determine given R is in which normal form?
A16) Let us construct an arrow diagram on R using FD to
calculate the candidate key.
We can see from the above arrow diagram on R that the
attributes PQ are not determined by any of the supplied FDs,
hence PQ will be an integral element of the candidate key, i.e., no
matter what the candidate key is or how many there are, they will
all include the attributes PQ.
Let us calculate the closure of PQ
PQ + = P Q R S T U V W (from the closure method we studied
earlier)
Since the closure of PQ contains all the attributes of R, hence PQ
is Candidate Key
From the definition of Candidate Key (Candidate Key is a Super
Key whose no proper subset is a Super key)
Since every key will have PQ as an integral part, and we have
proved that PQ is a candidate key, any superset of PQ will be a
super key but not a candidate key.
Hence there will be only one candidate key, PQ.
Since R has 8 attributes, P, Q, R, S, T, U, V, W, and the candidate
key is PQ, the prime attributes (parts of the candidate key) are P
and Q, while the non-prime attributes are R, S, T, U, V, W.
Given FD are { PQ → R, P → ST, Q → U, and U → VW } and Super
Key / Candidate Key is PQ
a. FD: PQ → R satisfies the definition of BCNF, as PQ is a super
key; there is no need to check it against further normal forms, as
it satisfies the highest one. Now we check the other dependencies
in a reverse-engineering manner.
b. FD: P → ST does not satisfy the definition of BCNF, as P is not a
super key, hence the table is not in BCNF (because if one
dependency fails, all fail). Now we check the same FD for 3NF.
c. FD: P → ST does not satisfy the definition of 3NF either, as P is
not a super key and S, T are not prime attributes, hence the table
is not in 3NF either (because if one dependency fails, all fail).
Now we check the same FD for 2NF.
d. FD: P → ST does not satisfy the definition of 2NF either, as P is
not a super key and S, T, which are not prime attributes, depend
on part of the key (a partial dependency), hence the table is not
in 2NF either (because if one dependency fails, all fail).
Hence from the above three statements b, c, and d we can say
that table R(P, Q, R, S, T, U, V, W) is in 1NF only.
Q17) Given a relation R( P, Q, R, S, T, U ) and Functional
Dependency set FD = { PQ → R, SR→ PT, T → U },
determine given R is in which normal form?
A17) Let us construct an arrow diagram on R using FD to
calculate the candidate key.
From the above arrow diagram on R, we can see that the
attributes QS are not determined by any of the given FDs, hence
QS will be an integral part of the candidate key, i.e., no matter
what the candidate keys are or how many there are, they will all
include the attributes QS.
Let us calculate the closure of QS
QS + = QS (from the closure method we studied earlier)
Since closure QS does not contain all the attributes of R, hence
QS is not the Candidate key.
On making a combination of QS with another attribute, we found
that PQS and RQS determine all the attributes of R, hence PQS
and RQS are candidate keys of R.
Since R has 6 attributes, P, Q, R, S, T, U, and the candidate keys
are PQS and RQS, the prime attributes (parts of candidate keys)
are P, Q, R and S, while the non-prime attributes are T and U.
Given FD are { PQ → R, SR→ PT, T → U } and Super Key /
Candidate Key is PQS and RQS
a. FD: PQ → R does not satisfy the definition of BCNF, as PQ is
not a super key, hence the table is not in BCNF (because if one
dependency fails, all fail). Now we check the same FD for 3NF.
b. FD: PQ → R satisfies the definition of 3NF: even though PQ is
not a super key, R is a prime attribute, so this FD passes the 3NF
test. Now we check the other FDs using the reverse-engineering
process.
c. FD: SR → PT does not satisfy the definition of 3NF, as SR is not
a super key and P, T are not both prime attributes (P is prime, but
T is not), hence the table is not in 3NF (because if one
dependency fails, all fail). Now we check the same FD for 2NF.
d. FD: SR → PT does not satisfy the definition of 2NF, as SR is not
a super key and P, T, which are not both prime attributes, depend
on part of the key (a partial dependency), hence the table is not
in 2NF either (because if one dependency fails, all fail).
Hence from the above two statements c and d, we can say that
table R(P, Q, R, S, T, U) is in 1NF only.
DBM
Unit - 3
Basics of SQL
Q1) What is DDL?
A1) DDL, the Data Definition Language, consists of the SQL
commands that can be used to define the schema of the
database. It deals with database schema definitions and is used
to construct and change the structure of database objects in the
database.
Some of the commands that fall under DDL are as follows:
● CREATE
● ALTER
● DROP
● TRUNCATE
CREATE
It is used to create a new table in the database.
Syntax
CREATE TABLE TABLE_NAME (COLUMN_NAME DATATYPES[,....]);
Example
CREATE TABLE STUDENT(Name VARCHAR2(20), Address
VARCHAR2(100), DOB DATE);
ALTER
It is used to modify the database structure, either by changing
the characteristics of an existing attribute or by adding a new
attribute.
Syntax
In order to add a new column to the table:
ALTER TABLE table_name ADD column_name COLUMN-
definition;
To change an existing table column:
ALTER TABLE table_name MODIFY (COLUMN DEFINITION....);
Example
ALTER TABLE STU_DETAILS ADD(ADDRESS VARCHAR2(20));
ALTER TABLE STU_DETAILS MODIFY (NAME VARCHAR2(20));
Q2) Write about DML (Data Manipulation Language)?
A2) DML queries are used to manipulate the data stored in
tables. Data Manipulation Language commands are used to
insert, retrieve and modify the data contained within them.
S.No.   Command & description
SELECT
1. Retrieves certain records from one or more tables.
Syntax
SELECT * FROM tablename;
OR
SELECT columnname,columnname,…..
FROM tablename ;
INSERT
2. Creates a record in tables.
Example 1: Inserting a single row of data into a table
Syntax
INSERT INTO table_name
[(columnname,columnname)]
VALUES (expression,expression);
To add a new customer to the customer table:
Example
INSERT INTO customer VALUES (1, 'Ram', 'Pune', 3333444488);
Example 2 : Inserting data into a table from another table
Syntax
INSERT INTO tablename
SELECT columnname,columnname FROM tablename
UPDATE
3. Modifies record stored in table.
The UPDATE command can be used to modify information
contained within a table.
Syntax
UPDATE tablename SET columnname = value WHERE search condition;
Example
UPDATE customer SET cust_Address='Mumbai' WHERE cust_id=1;
DELETE
4. Deletes records stored in table.
The DELETE command can be used to delete information
contained within a table.
Syntax
DELETE FROM tablename
WHERE search_condition;
The DELETE command with a WHERE clause can be used to
remove a specific record from
the customer table:
Example
DELETE FROM customer
WHERE cust_id=12;
The following command deletes all the rows from the table
Example DELETE FROM customer;
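The four DML commands above can be tried end to end using SQLite through Python's sqlite3 module. This is a sketch; the customer table and its columns are illustrative, loosely following the examples above.

```python
import sqlite3

# In-memory database; the customer table mirrors the examples above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customer (cust_id INTEGER, cust_name TEXT, "
            "cust_address TEXT, phone TEXT)")

# INSERT: create records
cur.execute("INSERT INTO customer VALUES (1, 'Ram', 'Pune', '3333444488')")
cur.execute("INSERT INTO customer VALUES (12, 'Shyam', 'Delhi', '9999000011')")

# UPDATE: modify a stored record
cur.execute("UPDATE customer SET cust_address = 'mumbai' WHERE cust_id = 1")

# DELETE: remove one record by its key
cur.execute("DELETE FROM customer WHERE cust_id = 12")

# SELECT: retrieve what remains
rows = cur.execute("SELECT cust_id, cust_name, cust_address FROM customer").fetchall()
print(rows)  # [(1, 'Ram', 'mumbai')]
```

The same four statements would run unchanged against MySQL or Oracle; only the connection setup differs.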
Q3) Describe DCL (Data Control Language)?
A3) GRANT and REVOKE are commands in the DCL (Data Control
Language) that are used to grant and withdraw rights and
permissions on database objects.
DCL commands include the following:
1. Grant
2. Revoke
Grant
This command is used to grant a user database access
capabilities.
Syntax
GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER,
ANOTHER_USER;
Example
GRANT SELECT ON Users TO 'Tom'@'localhost';
Revoke
It withdraws access privileges that were previously granted to a user.
Syntax
REVOKE privilege_name ON object_name FROM {user_name |
PUBLIC | role_name};
Example
REVOKE SELECT, UPDATE ON student FROM Btech, Mtech;
Q4) Explain defining constraints?
A4) Constraints are rules applied to the data in a table. That is,
we may use constraints to limit the type of data that can be
stored in a specific column of a table.
Constraints set restrictions on how much and what kind of data
can be inserted, modified, and deleted from a table. Constraints
are used to ensure data integrity during an update, removal, or
insert into a table.
Primary key
The primary key is a field that uniquely identifies each table row.
If a column in a table is designated as a primary key, it cannot
include NULL values, and all rows must have unique values for
this field. To put it another way, this is a combination of NOT NULL
and UNIQUE constraints.
The ROLL_NO field is marked as the primary key in the example
below, which means it cannot contain duplicate or NULL entries.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
Foreign key
The columns of a table that point to the primary key of another
table are known as foreign keys. They serve as a table-to-table
cross-reference.
A foreign key is a table field that refers to the primary key of a
different table, which uniquely identifies each row there. This
establishes a relationship between the tables.
Consider the following two tables:
Orders
O_ID ORDER_NO C_ID
1 2253 3
2 3325 3
3 4521 2
4 8532 1
Customers
C_ID NAME ADDRESS
1 RAMESH DELHI
2 SURESH NOIDA
3 DHARMESH GURGAON
The field C_ID in the Orders table refers to C_ID, the primary key
of the Customers table, which uniquely identifies each row in
Customers. C_ID in the Orders table is therefore a foreign key.
Syntax
CREATE TABLE Orders (
O_ID int NOT NULL,
ORDER_NO int NOT NULL,
C_ID int,
PRIMARY KEY (O_ID),
FOREIGN KEY (C_ID) REFERENCES Customers(C_ID)
);
Unique key
UNIQUE constraint forces the values of a column or set of columns
to be unique. If a column has a unique constraint, it signifies that
no two values in the table can be the same.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
Not null
The NOT NULL constraint ensures that no NULL values are stored
in a column. When we don't specify a value for a field while
inserting a record into a table, the column defaults to NULL. We
can ensure that a certain column cannot have NULL values by
specifying a NOT NULL constraint.
Example
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (235),
PRIMARY KEY (ROLL_NO)
);
Check
This constraint is used to specify a table's range of values for a
certain column. When this constraint is applied to a column, it
assures that the value of the specified column must be inside the
provided range.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL CHECK(ROLL_NO >1000) ,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35) ,
PRIMARY KEY (ROLL_NO)
);
The check constraint on the ROLL_NO column of the STUDENT
table was set in the preceding example. The value of the
ROLL_NO field must now be greater than 1000.
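A small sketch in SQLite shows the PRIMARY KEY, CHECK, and NOT NULL constraints from above rejecting invalid rows; the student names are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE STUDENT(
    ROLL_NO INTEGER NOT NULL CHECK(ROLL_NO > 1000),
    STU_NAME TEXT NOT NULL,
    PRIMARY KEY (ROLL_NO))""")

cur.execute("INSERT INTO STUDENT VALUES (1001, 'Asha')")  # satisfies every constraint

violations = []
for row in [(1001, 'Ravi'),   # duplicate primary key
            (999, 'Meena'),   # fails CHECK(ROLL_NO > 1000)
            (1002, None)]:    # NULL in a NOT NULL column
    try:
        cur.execute("INSERT INTO STUDENT VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        violations.append(row)  # the engine rejected the row

print(len(violations))  # 3
```

Each invalid insert raises an IntegrityError rather than silently storing bad data, which is exactly the data-integrity role constraints play.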
Q5) Describe aggregate function?
A5) These functions are used to perform operations on the
column's values and return a single value.
● AVG()
● COUNT()
● FIRST()
● LAST()
● MAX()
● MIN()
● SUM()
AVG() -
It returns the average of the values in a numeric column.
Syntax
SELECT AVG(column_name) FROM table_name;
Queries -
1. Calculating students' average grades.
SELECT AVG(MARKS) AS AvgMarks FROM Students;
Output
AvgMarks
80
2. Calculating the average student age.
SELECT AVG(AGE) AS AvgAge FROM Students;
Output
AvgAge
19.4
COUNT() -
It is used to count how many rows a SELECT command returns.
(The COUNT(DISTINCT ...) form is not supported in MS Access.)
Syntax
SELECT COUNT(column_name) FROM table_name;
Queries
1. The total number of students is calculated.
SELECT COUNT(*) AS NumStudents FROM Students;
Output:
NumStudents
5
2. Counting the number of distinct ages among the students.
SELECT COUNT(DISTINCT AGE) AS NumStudents FROM Students;
Output
NumStudents
4
FIRST() -
The FIRST() function returns the selected column's first value. It
can only be used in MS Access.
Syntax
SELECT FIRST(column_name) FROM table_name;
Queries
1. Taking the first student's marks from the Students table.
SELECT FIRST(MARKS) AS MarksFirst FROM Students;
Output:
MarksFirst
90
2. Retrieving the first student's age from the Students table.
SELECT FIRST(AGE) AS AgeFirst FROM Students;
Output:
AgeFirst
19
LAST() -
The LAST() function returns the chosen column's last value. It can
only be used in MS ACCESS.
Syntax
SELECT LAST(column_name) FROM table_name;
Queries
1. Taking the final student's grades from the Students table.
SELECT LAST(MARKS) AS MarksLast FROM Students;
Output:
MarksLast
82
2. Obtaining the age of the last student from the Students table.
SELECT LAST(AGE) AS AgeLast FROM Students;
Output:
AgeLast
18
MAX() -
The MAX() function returns the selected column's maximum value.
Syntax
SELECT MAX(column_name) FROM table_name;
Queries
1. Obtaining the highest possible grade among students from
the Students table.
SELECT MAX(MARKS) AS MaxMarks FROM Students;
Output:
MaxMarks
95
2. Retrieving the maximum age among students from the
Students table.
SELECT MAX(AGE) AS MaxAge FROM Students;
Output:
MaxAge
21
MIN() -
The MIN() function returns the selected column's minimum value.
Syntax
SELECT MIN(column_name) FROM table_name;
Queries
1. Obtaining the lowest possible grade among students from
the Students table.
SELECT MIN(MARKS) AS MinMarks FROM Students;
Output:
MinMarks
50
2. Obtaining the minimum student age from the Students
table.
SELECT MIN(AGE) AS MinAge FROM Students;
Output:
MinAge
18
SUM() -
SUM() returns the total of all the values in the chosen column.
Syntax
SELECT SUM(column_name) FROM table_name;
Queries
1. Obtaining the overall score of all students from the
Students table.
SELECT SUM(MARKS) AS TotalMarks FROM Students;
Output:
TotalMarks
400
2. Obtaining the total age of all students from the Students
table.
SELECT SUM(AGE) AS TotalAge FROM Students;
Output:
TotalAge
97
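These aggregate functions (apart from FIRST() and LAST(), which are MS Access specific) can be checked in SQLite. The sample names, marks, and ages below are invented to reproduce the outputs shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Students (NAME TEXT, AGE INTEGER, MARKS INTEGER)")
cur.executemany("INSERT INTO Students VALUES (?, ?, ?)", [
    ("Amit", 19, 90), ("Bina", 20, 95), ("Chetan", 21, 50),
    ("Deepa", 19, 83), ("Esha", 18, 82),
])

# AVG, SUM, MAX, MIN over the numeric MARKS column
avg_marks, total_marks, max_marks, min_marks = cur.execute(
    "SELECT AVG(MARKS), SUM(MARKS), MAX(MARKS), MIN(MARKS) FROM Students").fetchone()

# COUNT(*) counts rows; COUNT(DISTINCT AGE) counts distinct ages
num_students, num_ages = cur.execute(
    "SELECT COUNT(*), COUNT(DISTINCT AGE) FROM Students").fetchone()

print(avg_marks, total_marks, max_marks, min_marks)  # 80.0 400 95 50
print(num_students, num_ages)                        # 5 4
```

Note that each query collapses the whole table into a single row, which is the defining behaviour of an aggregate function.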
Q6) What is built - in function?
A6) SQL data types are built in to the database engine and are
used to describe the values that a column can hold.
Each column is needed in the database table to have a name and
data type.
Fig 1: sql data types
1. Numeric data types
The subtypes are given
below:
Data type   From         To          Description
Float       -1.79E+308   1.79E+308   Used to specify a floating-point value
Real        -3.40E+38    3.40E+38    Specifies a single-precision floating-point number
2. Character String data types
Data type   Description
Char        Fixed-length (max 8,000 characters)
Varchar     Variable-length (max 8,000 characters)
Text        Variable-length (max 2,147,483,647 characters)
3. Date and Time data types
Data type   Description
Date        Stores the year, month, and day values.
Time        Stores the hour, minute, and second values.
Timestamp   Stores the year, month, day, hour, minute, and second values.
Q7) Write about set operations?
A7) Set operations combine the rows of two compatible result
sets. MySQL supports UNION directly; intersection and difference
can be emulated with joins and subqueries, as the session below
shows.
mysql> create database set1;
Query OK, 1 row affected (0.00 sec)
mysql> use set1;
Database changed
CREATING FIRST TABLE:
mysql> create table A (x int(2), y varchar(2));
Query OK, 0 rows affected (0.19 sec)
mysql> insert into A values (1,'a'),(2,'b'),(3,'c'),(4,'d');
mysql> select * from A;
+------+------+
|x|y|
+------+------+
|1|a|
|2|b|
|3|c|
|4|d|
+------+------+
4 rows in set (0.00 sec)
CREATING SECOND TABLE:
mysql> create table B (x int(2), y varchar(2));
Query OK, 0 rows affected (0.22 sec)
mysql> insert into B values (1,'a'),(3,'c');
mysql> select * from B;
+------+------+
|x|y|
+------+------+
|1|a|
|3|c|
+------+------+
2 rows in set (0.00 sec)
● INTERSECTION OPERATOR:
mysql> select A.x, A.y from A join B using (x,y);
+------+------+
|x|y|
+------+------+
|1|a|
|3|c|
+------+------+
mysql> select * from A where (x,y) in (select * from B);
+------+------+
|x|y|
+------+------+
|1|a|
|3|c|
+------+------+
2 rows in set (0.00 sec)
● UNION OPERATOR:
mysql> select * from A union (select * from B);
+------+------+
|x|y|
+------+------+
|1|a|
|2|b|
|3|c|
|4|d|
+------+------+
4 rows in set (0.00 sec)
mysql> select * from A union all (select * from B);
+------+------+
|x|y|
+------+------+
|1|a|
|2|b|
|3|c|
|4|d|
|1|a|
|3|c|
+------+------+
6 rows in set (0.00 sec)
● DIFFERENCE OPERATOR:
mysql> select * from A where not exists (select * from B where
A.x = B.x and A.y = B.y);
+------+------+
|x|y|
+------+------+
|2|b|
|4|d|
+------+------+
2 rows in set (0.11 sec)
● SYMMETRIC DIFFERENCE OPERATOR:
mysql> (select * from A where not exists (select * from B where
A.x = B.x and A.y = B.y))
union (select * from B where not exists (select * from A where
A.x = B.x and A.y = B.y));
+------+------+
|x|y|
+------+------+
|2|b|
|4|d|
+------+------+
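Where the engine provides them, the standard set operators give the same results directly. SQLite, for instance, supports UNION, INTERSECT, and EXCEPT; a sketch via Python's sqlite3 module, using the same A and B tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE A (x INTEGER, y TEXT);
    CREATE TABLE B (x INTEGER, y TEXT);
    INSERT INTO A VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d');
    INSERT INTO B VALUES (1,'a'),(3,'c');
""")

# UNION removes duplicates; ORDER BY makes the output deterministic
union = cur.execute(
    "SELECT * FROM A UNION SELECT * FROM B ORDER BY x").fetchall()
# INTERSECT keeps only rows present in both tables
intersection = cur.execute(
    "SELECT * FROM A INTERSECT SELECT * FROM B ORDER BY x").fetchall()
# EXCEPT keeps rows of A that are absent from B (the difference A - B)
difference = cur.execute(
    "SELECT * FROM A EXCEPT SELECT * FROM B ORDER BY x").fetchall()

print(union)         # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
print(intersection)  # [(1, 'a'), (3, 'c')]
print(difference)    # [(2, 'b'), (4, 'd')]
```

The results match the join and NOT EXISTS emulations shown in the MySQL session above.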
Q8) What do you mean by sub queries ?
A8) A subquery, also known as an inner query or nested query, is
a query that is placed within another SQL query's WHERE clause.
A subquery is used to return data that will be utilized as a
condition in the main query to further limit the data that may be
retrieved.
Subqueries can be used with SELECT, INSERT, UPDATE, and
DELETE statements, as well as with operators such as =, <, >,
>=, <=, IN, BETWEEN, and so on.
Subqueries must adhere to a set of guidelines.
● Parentheses must be used to surround subqueries.
● A subquery can have only one column in the SELECT clause,
unless multiple columns are present in the main query for the
subquery to compare against.
● An ORDER BY command cannot be used in a subquery,
although it can be used in the main query. In a subquery, the
GROUP BY command can accomplish the same function as the
ORDER BY command.
● Subqueries that return more than one row can only be used
with multiple-value operators, such as the IN operator.
● There can't be any references to values that evaluate to a
BLOB, ARRAY, CLOB, or NCLOB in the SELECT list.
● A subquery cannot be immediately enclosed in a set function.
● A subquery cannot be used with the BETWEEN operator.
Within the subquery, however, the BETWEEN operator can be
utilized.
Subqueries are most frequently used with the SELECT statement,
however you can use them within a INSERT, UPDATE, or DELETE
statement as well, or inside another subquery.
● Subqueries with the SELECT Statement
The following statement returns the details of only those
customers whose order value in the orders table is more than
5000 dollars. Note that the keyword DISTINCT is used in the
subquery to eliminate duplicate cust_id values from the result
set.
SELECT * FROM customers WHERE cust_id IN (SELECT
DISTINCT cust_id
FROM orders WHERE order_value > 5000);
● Subqueries with the INSERT Statement
Subqueries can also be used with INSERT statements. Here’s an
example:
INSERT INTO premium_customers
SELECT * FROM customers
WHERE cust_id IN (SELECT DISTINCT cust_id FROM orders
WHERE order_value > 5000);
The above statement inserts the records of premium customers
into a table called premium_customers, using the data returned
by the subquery. Here the premium customers are those who
have placed orders worth more than 5000 dollars.
● Subqueries with the UPDATE Statement
You can also use subqueries in conjunction with the UPDATE
statement to update single or multiple columns in a table, as
follows:
UPDATE orders
SET order_value = order_value + 10
WHERE cust_id IN (SELECT cust_id FROM customers
WHERE postal_code = 75016);
The above statement updates the order value in the orders
table for customers who live in the area with postal code 75016,
increasing each current order value by 10 dollars.
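A runnable sketch of the SELECT and UPDATE subqueries above, using SQLite; the customer and order rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE customers (cust_id INTEGER, name TEXT, postal_code INTEGER);
    CREATE TABLE orders (order_id INTEGER, cust_id INTEGER, order_value INTEGER);
    INSERT INTO customers VALUES (1,'Ana',75016),(2,'Ben',10001),(3,'Cruz',75016);
    INSERT INTO orders VALUES (10,1,6000),(11,2,4000),(12,3,7000),(13,2,3000);
""")

# Subquery with SELECT: customers having at least one order over 5000
premium = cur.execute("""
    SELECT name FROM customers
    WHERE cust_id IN (SELECT DISTINCT cust_id FROM orders
                      WHERE order_value > 5000)
    ORDER BY name""").fetchall()

# Subquery with UPDATE: +10 on every order placed from postal code 75016
cur.execute("""
    UPDATE orders SET order_value = order_value + 10
    WHERE cust_id IN (SELECT cust_id FROM customers
                      WHERE postal_code = 75016)""")
updated = cur.execute("SELECT order_value FROM orders ORDER BY order_id").fetchall()

print(premium)  # [('Ana',), ('Cruz',)]
print(updated)  # [(6010,), (4000,), (7010,), (3000,)]
```

In both statements the inner SELECT runs first and feeds its result set into the outer query's IN condition.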
Q9) How to use a group by?
A9) To organize identical data into groups, the SQL GROUP BY
clause is used in conjunction with the SELECT command. In a
SELECT statement, the GROUP BY clause comes after the WHERE
clause and before the ORDER BY clause.
Syntax
The following code block illustrates the basic syntax of a GROUP
BY clause. The GROUP BY clause must come after the WHERE
clause's conditions and, if one is used, before the ORDER BY
clause.
SELECT column1, column2
FROM table_name
WHERE [ conditions ]
GROUP BY column1, column2
ORDER BY column1, column2
Example
Consider the following records in the CUSTOMERS table:
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
If you want to know the total salary of each customer, the
GROUP BY query would be as follows.
SQL> SELECT NAME, SUM(SALARY) FROM CUSTOMERS
GROUP BY NAME;
This would result in the following:
+----------+-------------+
| NAME | SUM(SALARY) |
+----------+-------------+
| Chaitali | 6500.00 |
| Hardik | 8500.00 |
| kaushik | 2000.00 |
| Khilan | 1500.00 |
| Komal | 4500.00 |
| Muffy | 10000.00 |
| Ramesh | 2000.00 |
+----------+-------------+
Q10) What is having clause?
A10) The HAVING clause lets you set conditions that determine
which group results appear in the final output.
The WHERE clause applies conditions to the columns that have
been chosen, but the HAVING clause applies conditions to the
groups that have been generated by the GROUP BY clause.
Syntax
The HAVING Clause is shown in the following code block in a
query.
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
In a query, the HAVING clause must come after the GROUP BY
clause and, if used, before the ORDER BY clause. The syntax of
the SELECT statement, including the HAVING clause, is shown in
the following code block.
SELECT column1, column2
FROM table1, table2
WHERE [ conditions ]
GROUP BY column1, column2
HAVING [ conditions ]
ORDER BY column1, column2
Example
Consider the following records in the CUSTOMERS table:
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
The following example returns a record for each age that occurs
two or more times in the table.
SQL > SELECT ID, NAME, AGE, ADDRESS, SALARY
FROM CUSTOMERS
GROUP BY age
HAVING COUNT(age) >= 2;
This would result in the following:
+----+--------+-----+---------+---------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+--------+-----+---------+---------+
| 2 | Khilan | 25 | Delhi | 1500.00 |
+----+--------+-----+---------+---------+
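The same CUSTOMERS data can be loaded into SQLite to check both the GROUP BY and HAVING queries; a sketch using Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE CUSTOMERS (ID INTEGER, NAME TEXT, AGE INTEGER, SALARY REAL)")
cur.executemany("INSERT INTO CUSTOMERS VALUES (?,?,?,?)", [
    (1, 'Ramesh', 32, 2000.0), (2, 'Khilan', 25, 1500.0),
    (3, 'kaushik', 23, 2000.0), (4, 'Chaitali', 25, 6500.0),
    (5, 'Hardik', 27, 8500.0), (6, 'Komal', 22, 4500.0),
    (7, 'Muffy', 24, 10000.0),
])

# GROUP BY: one row per name with the summed salary
totals = cur.execute(
    "SELECT NAME, SUM(SALARY) FROM CUSTOMERS GROUP BY NAME").fetchall()

# HAVING: keep only the age groups with two or more members
repeated_ages = cur.execute(
    "SELECT AGE, COUNT(AGE) FROM CUSTOMERS "
    "GROUP BY AGE HAVING COUNT(AGE) >= 2").fetchall()

print(len(totals))    # 7 — every name forms its own group
print(repeated_ages)  # [(25, 2)] — only age 25 appears twice
```

The WHERE clause would have filtered individual rows before grouping; HAVING filters whole groups after GROUP BY has formed them.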
Q11)How to use order by?
A11) The SQL ORDER BY clause is used to sort data by one or
more columns in ascending or descending order. By default, the
ORDER BY clause sorts query results in ascending order.
Syntax
The basic syntax of the ORDER BY clause is as follows:
SELECT column-list
FROM table_name
[WHERE condition]
[ORDER BY column1, column2, .. ColumnN] [ASC | DESC];
In the ORDER BY clause, you can utilize more than one column.
Make sure that whichever column you're using to sort is included
in the column-list.
Example
Consider the following records in the CUSTOMERS table:
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
The following code block shows an example of sorting the results
by NAME and SALARY in ascending order.
SQL> SELECT * FROM CUSTOMERS
ORDER BY NAME, SALARY;
This would result in the following:
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
+----+----------+-----+-----------+----------+
Q12) Explain join and its types?
A12) SQL JOINs are used to retrieve data from multiple tables.
A SQL JOIN is performed whenever two or more tables are listed
in a SQL statement.
SQL joins can be divided into four categories:
● SQL INNER JOIN (sometimes called simple join)
● SQL LEFT OUTER JOIN (sometimes called LEFT JOIN)
● SQL RIGHT OUTER JOIN (sometimes called RIGHT JOIN)
● SQL FULL OUTER JOIN (sometimes called FULL JOIN)
INNER JOIN
As long as the condition is met, the INNER JOIN keyword selects
all rows from both tables. This keyword will generate a result-set
by combining all rows from both tables that satisfy the
requirement, i.e. the common field's value will be the same.
Syntax
SELECT columns
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;
Fig 2: inner join
LEFT OUTER JOIN
This join retrieves all rows from the table on the left side of the
join, as well as matching rows from the table on the right. The
result-set will include null for the rows for which there is no
matching row on the right side. LEFT OUTER JOIN is another name
for LEFT JOIN.
Syntax
SELECT columns
FROM table1
LEFT [OUTER] JOIN table2
ON table1.column_name = table2.column_name;
Fig 3: left outer join
RIGHT OUTER JOIN
The RIGHT JOIN function is analogous to the LEFT JOIN function.
This join retrieves all rows from the table on the right side of the
join, as well as matching rows from the table on the left. The
result-set will include null for the rows for which there is no
matching row on the left side. RIGHT OUTER JOIN is another name
for RIGHT JOIN.
Syntax
SELECT columns
FROM table1
RIGHT [OUTER] JOIN table2
ON table1.column_name = table2.column_name;
Fig 4: right outer join
FULL OUTER JOIN
The result-set of FULL JOIN is created by combining the results of
both LEFT JOIN and RIGHT JOIN. All of the rows from both tables
will be included in the result-set. The result-set will contain NULL
values for the rows for which there is no match.
Syntax
SELECT columns
FROM table1
FULL [OUTER] JOIN table2
ON table1.column_name = table2.column_name;
Fig 5: full outer join
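SQLite can demonstrate the first two join types (RIGHT and FULL OUTER JOIN are only available in recent SQLite releases, so this sketch sticks to INNER and LEFT); the tables loosely mirror the Orders/Customers example from Q4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE Customers (C_ID INTEGER, NAME TEXT);
    CREATE TABLE Orders (O_ID INTEGER, C_ID INTEGER);
    INSERT INTO Customers VALUES (1,'RAMESH'),(2,'SURESH'),(3,'DHARMESH');
    INSERT INTO Orders VALUES (1,3),(2,3),(3,2);
""")

# INNER JOIN: only customers that have at least one matching order
inner = cur.execute("""
    SELECT Customers.NAME, Orders.O_ID FROM Customers
    INNER JOIN Orders ON Customers.C_ID = Orders.C_ID
    ORDER BY Orders.O_ID""").fetchall()

# LEFT OUTER JOIN: every customer; O_ID is NULL where no order matches
left = cur.execute("""
    SELECT Customers.NAME, Orders.O_ID FROM Customers
    LEFT JOIN Orders ON Customers.C_ID = Orders.C_ID
    ORDER BY Customers.C_ID, Orders.O_ID""").fetchall()

print(inner)  # [('DHARMESH', 1), ('DHARMESH', 2), ('SURESH', 3)]
print(left)   # [('RAMESH', None), ('SURESH', 3), ('DHARMESH', 1), ('DHARMESH', 2)]
```

RAMESH has no orders, so the inner join drops him while the left join keeps him with a NULL O_ID, which is exactly the difference the figures above illustrate.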
Q13) Describe views?
A13) In SQL, a view is a virtual table based on the SQL statement
result-set.
A view, much like a real table, includes rows and columns. Fields
in a database view are fields from one or more individual
database tables.
You can add SQL, WHERE, and JOIN statements to a view and
show the details as if the data came from a single table.
Create a view
Syntax
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
A view always shows up-to-date data: every time a user queries
a view, the database engine re-runs the view's SQL statement to
rebuild the result.
Create a view example
A view showing all customers from India is provided by the
following SQL.
CREATE VIEW [India Customers] AS
SELECT CustomerName, ContactName
FROM Customers
WHERE Country = 'India';
Updating a view
A view can be changed with the CREATE OR REPLACE VIEW
command.
Create or replace view syntax
CREATE OR REPLACE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
The following SQL adds the "City" column to the "India
Customers" view:
CREATE OR REPLACE VIEW [India Customers] AS
SELECT CustomerName, ContactName, City
FROM Customers
WHERE Country = 'India';
Dropping a view
With the DROP VIEW instruction, a view is removed.
Drop view syntax
DROP VIEW view_name;
"The following SQL drops the view of "India Customers":
DROP VIEW [India Customers];
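A quick sketch in SQLite (which supports CREATE VIEW and DROP VIEW, though not CREATE OR REPLACE VIEW) shows that a view always reflects the live table data; the customer rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE Customers (CustomerName TEXT, ContactName TEXT, Country TEXT);
    INSERT INTO Customers VALUES ('Acme','Ravi','India'),
                                 ('Globex','Lena','Norway'),
                                 ('Initech','Priya','India');
    CREATE VIEW IndiaCustomers AS
        SELECT CustomerName, ContactName FROM Customers
        WHERE Country = 'India';
""")

# Query the view like an ordinary table
via_view = cur.execute(
    "SELECT CustomerName FROM IndiaCustomers ORDER BY CustomerName").fetchall()
print(via_view)  # [('Acme',), ('Initech',)]

# The view reflects live data: add a base-table row, query again
cur.execute("INSERT INTO Customers VALUES ('Zen','Anil','India')")
n_after = len(cur.execute("SELECT * FROM IndiaCustomers").fetchall())
print(n_after)  # 3

cur.execute("DROP VIEW IndiaCustomers")  # removes the view, not the data
```

The newly inserted customer appears in the view without it being recreated, because the view stores only its defining SELECT, not the rows.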
Q14) Define transaction control commands?
A14) TCL commands, or transaction control language, deal with
database transactions.
Commit
All transactions are saved to the database with this command.
The COMMIT command is used to save any transaction into the
database permanently.
When we use a DML command like INSERT, UPDATE, or DELETE,
the changes we make are temporary and can be rolled back until
the current session is closed.
To avoid this, we use the COMMIT command to permanently
indicate the modifications.
Syntax
COMMIT;
Example
DELETE FROM Students
WHERE RollNo =25;
COMMIT;
Rollback
You can use the rollback command to undo transactions that
haven't yet been stored to the database.
This command returns the database to the state it was in when it
was last committed. It can also be used in conjunction with the
SAVEPOINT command to go to a specific savepoint in a running
transaction.
If we used the UPDATE command to make changes to the
database that we later realized were unnecessary, we can use the
ROLLBACK command to undo those changes if they were not
committed using the COMMIT command.
Syntax
ROLLBACK;
Example
DELETE FROM Students
WHERE RollNo =25;
ROLLBACK;
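SQLite, through Python's sqlite3 module, follows the same COMMIT/ROLLBACK model; a sketch with an invented Students table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (RollNo INTEGER, Name TEXT)")
conn.execute("INSERT INTO Students VALUES (25, 'Kiran'), (26, 'Lata')")
conn.commit()  # COMMIT: the two rows are now permanent

conn.execute("DELETE FROM Students WHERE RollNo = 25")
conn.rollback()  # ROLLBACK: undo the uncommitted delete

survivors = conn.execute("SELECT RollNo FROM Students ORDER BY RollNo").fetchall()
print(survivors)  # [(25,), (26,)]
```

Because the DELETE was never committed, the rollback restores the database to the state of the last COMMIT and both rows survive.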
Q15) What are cursors?
A15) Oracle creates a memory area called the context area when
a SQL statement is processed. A cursor is a pointer to this
context area, which contains all of the information required to
process the statement. In PL/SQL, the cursor controls the
context area and keeps track of the rows of data accessed by a
SELECT statement.
A cursor fetches and processes each row returned by a SQL
statement, one at a time. Cursors come in two varieties:
● Implicit Cursors
● Explicit Cursors
Implicit cursors
When a SQL statement is executed and there is no explicit cursor
for the statement, Oracle automatically creates implicit cursors.
Programmers have no control over implicit cursors or the data
they contain.
When a DML statement (INSERT, UPDATE, or DELETE) is issued, it
is accompanied by an implicit cursor. The cursor holds the data
that needs to be inserted for INSERT operations. The cursor
identifies the rows that will be affected by UPDATE and DELETE
actions.
The most recent implicit cursor is known as the SQL cursor in
PL/SQL, and it always has attributes such as %FOUND, %ISOPEN,
%NOTFOUND, and %ROWCOUNT. For use with the FORALL
statement, the SQL cursor has two additional attributes:
%BULK_ROWCOUNT and %BULK_EXCEPTIONS. The most commonly
used attributes are described in the table below.
Attribute    Description
%FOUND       Returns TRUE if a DML statement (INSERT, DELETE, UPDATE) affected at least one row, or if a SELECT INTO statement returned one or more rows; otherwise FALSE.
%NOTFOUND    Returns TRUE if a DML statement affected no rows, or if a SELECT INTO statement returned no rows; otherwise FALSE. It is the opposite of %FOUND.
%ISOPEN      Always returns FALSE for implicit cursors, because the SQL cursor is closed automatically after its associated SQL statement is processed.
%ROWCOUNT    Gives the number of rows affected by an INSERT, DELETE, or UPDATE statement, or the number of rows returned by a SELECT INTO statement.
Example
Create a customers table with the following records:
ID  NAME     AGE  ADDRESS    SALARY
1   Ramesh   23   Allahabad  20000
2   Suresh   22   Kanpur     22000
3   Mahesh   24   Ghaziabad  24000
4   Chandan  25   Noida      26000
5   Alex     21   Paris      28000
6   Sunita   20   Delhi      30000
To update the table and increase each customer's salary by
5000, run the following program. The SQL%ROWCOUNT attribute
is used to determine the number of rows affected:
Create procedure:
DECLARE
  total_rows number(2);
BEGIN
  UPDATE customers
  SET salary = salary + 5000;
  IF sql%notfound THEN
    dbms_output.put_line('no customers updated');
  ELSIF sql%found THEN
    total_rows := sql%rowcount;
    dbms_output.put_line(total_rows || ' customers updated');
  END IF;
END;
Output:
6 customers updated
PL/SQL procedure successfully completed.
If you look at the records in the customer table now, you'll notice
that the rows have been changed.
SELECT * FROM customers;
ID  NAME     AGE  ADDRESS    SALARY
1   Ramesh   23   Allahabad  25000
2   Suresh   22   Kanpur     27000
3   Mahesh   24   Ghaziabad  29000
4   Chandan  25   Noida      31000
5   Alex     21   Paris      33000
6   Sunita   20   Delhi      35000
Explicit cursors
Explicit cursors are cursors that have been programmed to give
the user more control over the context area. In the PL/SQL Block's
declaration section, an explicit cursor should be defined. It's
based on a SELECT statement that returns multiple rows.
Creating an explicit cursor has the following syntax.
CURSOR cursor_name IS select_statement;
The steps for working with an explicit cursor are as follows:
● Declaring the cursor for memory initialization.
● Opening the cursor for allocating the memory.
● Fetching the cursor for retrieving the data.
● Closing the cursor to release the allocated memory.
Declaring the cursors
Declaring the cursor gives it a name and the SELECT statement
that goes with it. For instance,
CURSOR c_customers IS
SELECT id, name, address FROM customers;
Opening the cursors
The cursor's memory is allocated when it is opened, and it is
ready to receive the rows produced by the SQL statement. For
instance, let's open the above-mentioned cursor as follows:
OPEN c_customers;
Fetching the cursors
The cursor is retrieved by accessing one row at a time. For
instance, we can get rows from the above-opened cursor by doing
the following:
FETCH c_customers INTO c_id, c_name, c_addr;
Closing the cursors
The allocated memory is released when the cursor is closed. For
example, the above-opened cursor will be closed as follows:
CLOSE c_customers;
Example
Programmers define explicit cursors to have additional control
over the context area. It's defined in the PL/SQL block's
declaration section. It's based on a SELECT statement that returns
multiple rows.
Let's look at an example of how to use explicit cursor. In this
example, we'll use the CUSTOMERS table, which has already been
established.
Create a customers table with the following records:
ID  NAME     AGE  ADDRESS    SALARY
1   Ramesh   23   Allahabad  20000
2   Suresh   22   Kanpur     22000
3   Mahesh   24   Ghaziabad  24000
4   Chandan  25   Noida      26000
5   Alex     21   Paris      28000
6   Sunita   20   Delhi      30000
Create procedure:
To get each customer's name and address, run the following
program.
DECLARE
  c_id customers.id%type;
  c_name customers.name%type;
  c_addr customers.address%type;
  CURSOR c_customers IS
    SELECT id, name, address FROM customers;
BEGIN
  OPEN c_customers;
  LOOP
    FETCH c_customers INTO c_id, c_name, c_addr;
    EXIT WHEN c_customers%notfound;
    dbms_output.put_line(c_id || ' ' || c_name || ' ' || c_addr);
  END LOOP;
  CLOSE c_customers;
END;
Output
1 Ramesh Allahabad
2 Suresh Kanpur
3 Mahesh Ghaziabad
4 Chandan Noida
5 Alex Paris
6 Sunita Delhi
PL/SQL procedure successfully completed.
Q16) Describe stored procedures?
A16) In PL/SQL, a Procedure is a subprogram unit made up of a
collection of PL/SQL statements that can be invoked by name.
Each procedure in PL/SQL has its own distinct name that can be
used to refer to and invoke it. The Oracle database stores this
subprogram unit as a database object.
A subprogram is nothing more than a process that must be
manually constructed according to the requirements. They will be
saved as database objects once they have been generated.
In PL/SQL, the following are the features of the Procedure
subprogram unit:
● Procedures are individual software blocks that can be saved
in a database.
● To execute the PL/SQL statements, call these PL/SQL
procedures by referring to their names.
● In PL/SQL, it's mostly used to run a process.
● It can be defined and nested inside other blocks or packages,
or it can have nested blocks.
● It consists of three parts: declaration (optional), execution,
and exception handling (optional).
● The values can be supplied into or retrieved from an Oracle
process using parameters.
● The calling statement should include these parameters.
● In SQL, a procedure can have a RETURN statement to return
control to the caller block, but the RETURN statement cannot
return any values.
● SELECT statements cannot directly invoke procedures. They
can be accessed via the EXEC keyword or from another block.
Syntax
CREATE OR REPLACE PROCEDURE <procedure_name>
(<parameter1> IN/OUT <datatype>, ...)
[ IS | AS ]
<declaration_part>
BEGIN
<execution_part>
EXCEPTION
<exception_handling_part>
END;
● CREATE PROCEDURE tells the compiler to create a new
Oracle procedure. The keyword 'OR REPLACE' tells the compiler
to replace any existing procedure of the same name.
● The name of the procedure should be unique.
● When a stored procedure in Oracle is nested inside other
blocks, the keyword 'IS' is used; 'AS' is used if the procedure is
standalone. Apart from this coding convention, both have the
same meaning.
Q17) What do you mean by stored function?
A17) A function is a standalone PL/SQL subprogram. Like PL/SQL
procedures, functions have a unique name by which they can be
identified, and they are saved as database objects in PL/SQL.
Some of the qualities of functions are listed below.
● Functions are a type of standalone block that is mostly used
to do calculations.
● The value is returned using the RETURN keyword, and the
datatype is defined at the time of creation.
● Return is required in functions since they must either return a
value or raise an exception.
● Functions that do not require DML statements can be called
directly from SELECT queries, whereas functions that require DML
operations can only be invoked from other PL/SQL blocks.
● It can be defined and nested inside other blocks or packages,
or it can have nested blocks.
● It consists of three parts: declaration (optional), execution,
and exception handling (optional).
● The parameters can be used to pass values into the function
or to retrieve values from the process.
● The calling statement should include these parameters.
● In addition to utilizing RETURN, a PLSQL function can return
the value using OUT parameters.
● Because it always returns the value, it is always used in
conjunction with the assignment operator to populate the
variables in the calling statement.
Syntax
CREATE OR REPLACE FUNCTION <function_name>
(<parameter1> IN/OUT <datatype>, ...)
RETURN <datatype>
[ IS | AS ]
<declaration_part>
BEGIN
<execution_part>
EXCEPTION
<exception_handling_part>
END;
● The command CREATE FUNCTION tells the compiler to
construct a new function. The keyword 'OR REPLACE' tells the
compiler to replace any existing functions with the new one.
● The name of the function should be unique.
● It's important to mention the RETURN datatype.
● When the function is nested inside other blocks, the keyword
'IS' is used; 'AS' is used if the function is standalone. Apart from
this coding convention, both have the same meaning.
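Oracle stored functions as such have no SQLite counterpart, but the core idea, a named unit that SELECT statements can invoke directly, can be illustrated with sqlite3's create_function. The BONUS function and emp table here are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Register a Python callable under a SQL name, so SQL can invoke it
# by name, much as a stored function is invoked from a SELECT.
def bonus(salary):
    return salary + 5000  # the "RETURN" value of the function

conn.create_function("BONUS", 1, bonus)  # name, argument count, callable

conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.execute("INSERT INTO emp VALUES ('Ramesh', 20000), ('Suresh', 22000)")

# The function is called directly from the SELECT list
result = conn.execute(
    "SELECT name, BONUS(salary) FROM emp ORDER BY name").fetchall()
print(result)  # [('Ramesh', 25000), ('Suresh', 27000)]
```

As with a PL/SQL function, the caller receives a value per invocation; no DML happens inside the function itself, which is why it can be used from a SELECT.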
Q18) Describe database trigger?
A18) When a defined event occurs, the Oracle engine
immediately calls the trigger. Triggers are stored in databases
and are called repeatedly when certain conditions are met.
Triggers are stored programs that are executed or fired
automatically when a certain event happens.
Triggers can be written to respond to any of the events listed
below.
● A database manipulation (DML) statement (DELETE, INSERT,
or UPDATE).
● A database definition (DDL) statement (CREATE, ALTER, or
DROP).
● A database operation (SERVERERROR, LOGON, LOGOFF,
STARTUP, or SHUTDOWN).
Triggers can be set on the table, view, schema, or database
connected with the event.
Advantages
The following are some of the benefits of Triggers:
Triggers can automatically generate some derived column values.
Referential integrity is enforced.
Information on table access can be logged in an event log.
Auditing.
Tables can be replicated synchronously.
Security authorizations can be enforced.
Invalid transactions can be prevented.
Creating a trigger
Syntax for creating trigger:
CREATE [OR REPLACE ] TRIGGER trigger_name
{BEFORE | AFTER | INSTEAD OF }
{INSERT [OR] | UPDATE [OR] | DELETE}
[OF col_name]
ON table_name
[REFERENCING OLD AS o NEW AS n]
[FOR EACH ROW]
WHEN (condition)
DECLARE
Declaration-statements
BEGIN
Executable-statements
EXCEPTION
Exception-handling-statements
END;
Where,
CREATE [OR REPLACE] TRIGGER trigger_name: With the trigger
name, it creates or replaces an existing trigger.
{BEFORE | AFTER | INSTEAD OF} : This determines when the
trigger will be activated.
The INSTEAD OF clause is used to make a view trigger.
{INSERT [OR] | UPDATE [OR] | DELETE}: The DML operation is
specified here.
[OF col_name]: This is the name of the column that will be
modified.
[ON table_name]: The name of the table linked with the trigger is
specified here.
[REFERENCING OLD AS o NEW AS n]: This allows you to refer to
new and old values for INSERT, UPDATE, and DELETE DML
statements.
[FOR EACH ROW]: This is a row-level trigger, which means it will
be executed for each row that is affected. Otherwise, the trigger,
which is known as a table level trigger, will only activate once
when the SQL query is executed.
WHEN (condition): This provides a condition for rows for which the
trigger would fire. This clause is valid only for row level triggers.
Example
We'll begin by looking at the CUSTOMERS table -
Select * from customers;
+----+----------+-----+-----------+---------+
| ID | NAME     | AGE | ADDRESS   | SALARY  |
+----+----------+-----+-----------+---------+
|  1 | Ramesh   |  32 | Ahmedabad | 2000.00 |
|  2 | Khilan   |  25 | Delhi     | 1500.00 |
|  3 | kaushik  |  23 | Kota      | 2000.00 |
|  4 | Chaitali |  25 | Mumbai    | 6500.00 |
|  5 | Hardik   |  27 | Bhopal    | 8500.00 |
|  6 | Komal    |  22 | MP        | 4500.00 |
+----+----------+-----+-----------+---------+
The following application sets a row-level trigger for the
customers table that fires when the CUSTOMERS table is
INSERTED, UPDATED, or DELETED. This trigger will show the
difference in salary between the old and new values.
CREATE OR REPLACE TRIGGER display_salary_changes
BEFORE DELETE OR INSERT OR UPDATE ON customers
FOR EACH ROW
WHEN (NEW.ID > 0)
DECLARE
   sal_diff number;
BEGIN
   sal_diff := :NEW.salary - :OLD.salary;
   dbms_output.put_line('Old salary: ' || :OLD.salary);
   dbms_output.put_line('New salary: ' || :NEW.salary);
   dbms_output.put_line('Salary difference: ' || sal_diff);
END;
/
When the preceding code is run at the SQL prompt, the following
is the result:
Trigger created.
The following considerations must be made in this case:
● For table-level (statement-level) triggers, OLD and NEW references are not available; row-level triggers can use them.
● If you want to query the table within the same trigger, use
the AFTER keyword, because triggers can only query or alter the
table after the original modifications have been implemented and
the table has returned to a consistent state.
● The above trigger will fire before every DELETE, INSERT, or
UPDATE action on the table, but you may create your trigger to
fire before a single or many operations, such as BEFORE DELETE,
which will fire whenever a record is deleted on the table using the
DELETE operation.
Triggering a Trigger
Let's use the CUSTOMERS table to conduct some DML operations.
Here's an example of an INSERT statement that will add a new
record to the table:
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (7, 'Kriti', 22, 'HP', 7500.00 );
When a record is created in the CUSTOMERS table, the above trigger, display_salary_changes, fires and the following result is displayed:
Old salary:
New salary: 7500
Salary difference:
Because this is a new record, the previous salary is unavailable, so the old salary and the salary difference print as null. Let's do another DML operation on
the CUSTOMERS table now. The UPDATE statement will update a
table record that already exists.
UPDATE customers
SET salary = salary + 500
WHERE id = 2;
When a record in the CUSTOMERS table is updated, the trigger display_salary_changes fires and the following result is displayed:
Old salary: 1500
New salary: 2000
Salary difference: 500
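The same row-level OLD/NEW mechanics can be sketched in SQLite, which also supports CREATE TRIGGER; since SQLite has no DBMS_OUTPUT, this sketch records the salary difference in an invented audit table instead:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
CREATE TABLE salary_audit (id INTEGER, old_salary REAL, new_salary REAL, diff REAL);

-- Row-level trigger: OLD and NEW refer to the row before and after the UPDATE.
CREATE TRIGGER display_salary_changes
AFTER UPDATE OF salary ON customers
FOR EACH ROW
BEGIN
    INSERT INTO salary_audit
    VALUES (OLD.id, OLD.salary, NEW.salary, NEW.salary - OLD.salary);
END;
""")
conn.execute("INSERT INTO customers VALUES (2, 'Khilan', 1500.0)")
conn.execute("UPDATE customers SET salary = salary + 500 WHERE id = 2")
print(conn.execute("SELECT * FROM salary_audit").fetchone())
# (2, 1500.0, 2000.0, 500.0)
```

As in the Oracle example, the trigger fires once per affected row, and the old and new salary values are both visible inside the trigger body.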
DBM
Unit - 4
Database Transactions Management
Q1) What is a transaction?
A1) A transaction is a logical unit of data processing that consists
of a collection of database actions. One or more database
operations, such as insert, remove, update, or retrieve data, are
executed in a transaction. It is an atomic process that is either performed in its entirety or not performed at all. A read-only transaction is one that involves only data retrieval and no data updating.
A transaction is a group of operations that are logically related. It
consists of a number of tasks. An activity or series of acts is
referred to as a transaction. It is carried out by a single user to
perform operations on the database's contents.
Each high-level operation can be broken down into several lower-
level jobs or operations. A data update operation, for example,
can be separated into three tasks:
Read_item() - retrieves a data item from storage and stores it in main memory.
Modify_item() - changes the item's value in main memory.
Write_item() - writes the changed value from main memory back to storage.
Only the read item() and write item() methods have access to the
database. Similarly, read and write are the fundamental database
operations for all transactions.
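A minimal sketch of the three tasks, modelling storage (disk) and the main-memory buffer as Python dictionaries; the names mirror the text, and the structures are invented for illustration:

```python
# Stable storage (disk) and the main-memory buffer, modeled as dictionaries.
storage = {"A": 500}
buffer = {}

def read_item(name):
    """Retrieve a data item from storage into the main-memory buffer."""
    buffer[name] = storage[name]

def modify_item(name, delta):
    """Change the item's value in main memory only."""
    buffer[name] += delta

def write_item(name):
    """Write the changed value back from main memory to storage."""
    storage[name] = buffer[name]

# A data update operation decomposed into its three lower-level tasks.
read_item("A")
modify_item("A", 100)
write_item("A")
print(storage["A"])  # 600
```

Only read_item() and write_item() touch storage, which is why read and write are treated as the fundamental database operations for all transactions.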
Q2) Write about transaction management?
A2) A transaction is a group of operations that are logically
related. If you were to transfer money from your bank account to
a friend's account, the following actions would be performed:
Simple transaction example
1. Check the balance of your account.
2. Subtract the amount from your account.
3. Update your account with the new balance.
4. Check the balance of your friend's account.
5. Add the amount to his account.
6. Update his account with the new balance.
A transaction refers to the entire set of operations. Although I
demonstrated read, write, and update actions in the above
example, the transaction can also include read, write, insert,
update, and delete operations.
Transaction failure in between operations
The major issue that can arise during a transaction is that it may
fail before all of the actions in the set have been completed. This
can happen as a result of a power outage, a system crash, or
other factors. This is a major issue that could result in database
inconsistency. If the transaction fails after the third operation (as
in the example above), the money will be debited from your
account, but it will not be sent to your buddy.
The following two operations can be used to fix this problem.
Commit - If all of the activities in a transaction are successful,
the changes are permanently committed to the database.
Rollback - If any of the operations fails, undo all of the prior
operations' changes.
Although these operations help us prevent a variety of problems that can arise during a transaction, they are insufficient when two transactions run at the same time.
To solve these issues, we need to know about database ACID
properties.
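The commit-on-success, rollback-on-failure behaviour described above can be sketched with SQLite; the table, account owners, and amounts are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (owner TEXT PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO accounts VALUES ('you', 1000.0), ('friend', 200.0)")
conn.commit()

def transfer(amount, fail_midway=False):
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? "
                     "WHERE owner = 'you'", (amount,))
        if fail_midway:                # simulate a crash between the two updates
            raise RuntimeError("power outage")
        conn.execute("UPDATE accounts SET balance = balance + ? "
                     "WHERE owner = 'friend'", (amount,))
        conn.commit()                  # all operations succeeded: make permanent
    except Exception:
        conn.rollback()                # any failure: undo all prior changes

transfer(300.0, fail_midway=True)      # failed transfer leaves balances unchanged
transfer(300.0)                        # successful transfer commits both updates
print(conn.execute(
    "SELECT balance FROM accounts ORDER BY owner").fetchall())
# [(500.0,), (700.0,)]
```

The failed call debits nothing overall: the rollback undoes the first UPDATE, so the money is never lost in between the two steps.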
Uses of transaction management
● The DBMS is used to plan concurrent data access. It means
the user can access many pieces of data from the database
without interfering with one another. Concurrency is managed
through transactions.
● It's also utilized to meet ACID requirements.
● It's a tool for resolving Read/Write Conflicts.
● Recoverability, Serializability, and Cascading are all
implemented using it.
● Concurrency Control Protocols and data locking are also
handled by Transaction Management.
Q3) What are the properties of transaction?
A3) A transaction in a database system must maintain Atomicity,
Consistency, Isolation, and Durability − commonly known as ACID
properties − in order to ensure accuracy,
Completeness, and data integrity.
● Atomicity
● Consistency
● Isolation
● Durability.
Atomicity: -
● A transaction is an atomic unit of processing.
● It is either performed in its entirety or not performed at all.
● It means either all operations are reflected in the database or
none are.
● It is the responsibility of the transaction recovery subsystem to ensure atomicity.
● If a transaction fails to complete for some reason, the recovery technique must undo any effects of the transaction on the database.
Consistency: -
● A transaction is consistency-preserving if its complete execution takes the database from one consistent state to another.
● It is the responsibility of the programmer who writes database programs.
● Database programs should be written so that, if the database is in a consistent state before a transaction executes, it will be in a consistent state after the transaction executes.
Eg. Transaction: transfer $50 from account A to account B.
If account A initially holds $100 and account B holds $200, then after execution of the transaction, A must have $50 and B must have $250.
Isolation: -
● If multiple transactions execute concurrently, they should produce the same result as if they had executed serially.
● It is enforced by concurrency control subsystem.
● To avoid the problem of concurrent execution, transaction
should be executed in isolation.
Durability: -
● The changes applied to the database, by a committed
transaction must persist in a database.
● The changes must not be lost because of any failure.
● It is the responsibility of the recovery subsystem of the DBMS.
Q4) Explain schedule?
A4) A schedule S of n transactions T1, T2, ..., Tn is an ordering of the operations of the transactions, subject to the constraint that, for each transaction Ti appearing in S, the operations of Ti occur in S in the same order in which they occur in Ti. So an execution sequence that represents the chronological order in which instructions are executed in the system is called a schedule.
E.g.
T1                      T2
Read(A)
A := A + 100;
Write(A);
                        Read(A)
                        A := A - 200;
                        Write(A);
* Schedule S *
Schedule S is serial because T1 & T2 execute serially, T1 followed by T2.
If the initial value of A = 500, then after successful execution of T1 & T2, A should have the value 400.
Now, consider above transactions T1 & T2 are running
concurrently like: -
A = 500
T1                      T2
Read(A)    (A=500)
                        Read(A)    (A=500)
A := A + 100;  (A=600)
                        A := A - 200;  (A=300)
Write(A);  (A=600)
                        Write(A);  (A=300)
*concurrent schedule S*
Here, instead of 400, schedule produces value of A=300. So, in
this case database is in inconsistent state.
Conflict Operations: -
Two operations are said to conflict if:
i. They are from different transactions,
ii. They access the same data item, and
iii. At least one of the operations is a write operation.
Eg:
T1              T2
1. R(A)
2. W(A)
                3. R(A)
                4. W(A)
5. R(A)
6. R(B)
                7. R(A)
                8. W(B)
9. R(A)
Here, for example, instruction pairs 2 & 3, 4 & 5, 6 & 8, and 4 & 9 are conflicting.
Instructions 8 & 9 do not conflict because they operate on different data items.
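A small sketch that applies these three conditions to the nine numbered operations above; note that it enumerates every conflicting pair, including some (such as 1 & 4) beyond those listed in the text:

```python
# Each operation: (transaction, action, data item), numbered 1..9 as in the text.
ops = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
       ("T2", "W", "A"), ("T1", "R", "A"), ("T1", "R", "B"),
       ("T2", "R", "A"), ("T2", "W", "B"), ("T1", "R", "A")]

def conflicting_pairs(ops):
    """Pairs from different transactions, on the same item, with at least one write."""
    pairs = []
    for i in range(len(ops)):
        for j in range(i + 1, len(ops)):
            (t1, a1, d1), (t2, a2, d2) = ops[i], ops[j]
            if t1 != t2 and d1 == d2 and "W" in (a1, a2):
                pairs.append((i + 1, j + 1))   # 1-indexed, to match the text
    return pairs

print(conflicting_pairs(ops))
# [(1, 4), (2, 3), (2, 4), (2, 7), (4, 5), (4, 9), (6, 8)]
```

Pair (8, 9) is correctly absent: the operations act on different data items.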
Q5) What is a serial schedule?
A5) In a serial schedule, one transaction executes to completion before another transaction begins. In other words, a transaction does not start executing until the currently running transaction has finished. This form of transaction execution is also known as non-interleaved execution.
Take one example:
R here refers to the operation of reading and W to the operation
of writing. In this case, the execution of transaction T2 does not
start until the completion of transaction T1.
T1              T2
----            ----
R(A)
R(B)
W(A)
Commit
                R(B)
                R(A)
                W(B)
                Commit
Q6) Explain conflict serializability?
A6) If a schedule can be transformed into a serial schedule by swapping non-conflicting operations, it is conflict serializable.
If the schedule is conflict equivalent to a serial schedule, it will be
conflict serializable.
When two transactions in a schedule perform non-compatible operations on the same data, a conflict arises. Two operations are said to be in conflict when all three of the following conditions hold at the same time:
● The two operations belong to different transactions.
● Both operations refer to the same data item.
● At least one of the operations is a write_item() operation, which attempts to modify the data item.
Example
Two operations may be swapped only if they are logically equivalent. If S1 = S2 after the swap, the operations do not conflict; if S1 ≠ S2, the operations conflict.
Conflict Equivalent
In conflict equivalence, one schedule can be turned into another by swapping non-conflicting operations. In the example, S2 is conflict equivalent to S1 (S1 can be converted to S2 by swapping non-conflicting operations).
Two schedules are said to be conflict equivalent if and only if:
● They contain the same set of transactions and operations.
● Each pair of conflicting operations is ordered the same way in both.
Example
Schedule S2 is a serial schedule since it completes all T1 operations before beginning any T2 operations. By exchanging non-conflicting operations from S1, schedule S1 can be turned into a serial schedule.
After exchanging non-conflicting operations, schedule S1 becomes:
T1              T2
Read(A)
Write(A)
Read(B)
Write(B)
                Read(A)
                Write(A)
                Read(B)
                Write(B)
Hence, S1 is conflict serializable.
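Conflict serializability is commonly tested with a precedence graph: add an edge Ti → Tj whenever a conflicting operation of Ti precedes one of Tj; the schedule is conflict serializable iff the graph is acyclic. A sketch, assuming schedules are given as (transaction, action, item) triples:

```python
def is_conflict_serializable(schedule):
    """schedule: list of (transaction, action, item) in execution order."""
    edges = set()
    for i, (ti, ai, di) in enumerate(schedule):
        for tj, aj, dj in schedule[i + 1:]:
            if ti != tj and di == dj and "W" in (ai, aj):
                edges.add((ti, tj))          # Ti's conflicting op precedes Tj's

    # Detect a cycle in the precedence graph with depth-first search.
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)

    def has_cycle(node, path):
        if node in path:
            return True
        return any(has_cycle(n, path | {node}) for n in graph.get(node, ()))

    return not any(has_cycle(n, set()) for n in graph)

# S1 from this question: conflict serializable (equivalent to serial <T1, T2>).
s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
      ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
# The inconsistent concurrent schedule from Q4 is not conflict serializable.
s2 = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
print(is_conflict_serializable(s1), is_conflict_serializable(s2))
# True False
```

For s2 the graph contains both T1 → T2 and T2 → T1, the cycle that corresponds to the lost-update anomaly computed in Q4.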
Q7) Describe View Serializability?
A7) A schedule is view serializable if it is view equivalent to a serial schedule.
If a schedule is conflict serializable, it is also view serializable.
A view serializable schedule that is not conflict serializable contains blind writes.
View Equivalent
If two schedules S1 and S2 meet the following criteria, they are
said to be view equivalent:
1. Initial Read
The initial read of each data item must be the same in both schedules. Assume there are two schedules, S1 and S2. If a transaction T1 in schedule S1 performs the initial read of data item A, then transaction T1 in schedule S2 should likewise perform the initial read of A.
Because T1 performs the initial read operation in both S1 and S2, the two schedules are considered equivalent.
2. Updated Read
If Ti is reading A, which is updated by Tj in schedule S1, Ti should
also read A, which is updated by Tj in schedule S2.
Because T3 in S1 reads A updated by T2, while T3 in S2 reads A updated by T1, the two schedules are not equivalent.
3. Final Write
The final write for both schedules must be the same. If a transaction T1 performs the last write on A in schedule S1, then T1 should also perform the final write in schedule S2.
Because T3 performs the final write operation in both S1 and S2, the two schedules are considered equivalent.
Example
Schedule S
With three transactions, the total number of possible serial schedules is 3! = 6.
S1 = <T1 T2 T3>
S2 = <T1 T3 T2>
S3 = <T2 T3 T1>
S4 = <T2 T1 T3>
S5 = <T3 T1 T2>
S6 = <T3 T2 T1>
Taking first schedule S1:
Schedule S1:
Step 1 - Updated Read
There is no read other than the initial read in either schedule S or S1, so this condition is satisfied trivially.
Step 2 - Initial Read
T1 performs the initial read operation in S, and T1 also performs it
in S1.
Step 3 - Final Write
T3 performs the final write operation in S, and T3 also performs it
in S1. As a result, S and S1 are viewed as equivalent.
We don't need to verify another schedule because the first one,
S1, meets all three criteria.
As a result, the comparable serial schedule is as follows:
T1 → T2 → T3
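The three view-equivalence checks can be written down directly. A simplified sketch (it records only the first initial read per item and treats schedules as (transaction, action, item) lists):

```python
def view_profile(schedule):
    """Initial reader, reads-from pairs, and final writer for each data item."""
    initial_read, reads_from = {}, set()
    last_writer = {}
    for txn, action, item in schedule:
        if action == "R":
            if item not in last_writer and item not in initial_read:
                initial_read[item] = txn                 # reads the initial value
            elif item in last_writer:
                reads_from.add((txn, last_writer[item], item))
        else:  # "W"
            last_writer[item] = txn
    return initial_read, reads_from, dict(last_writer)   # last_writer = final writes

def view_equivalent(s1, s2):
    """True when initial reads, reads-from pairs, and final writes all match."""
    return view_profile(s1) == view_profile(s2)

s  = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
s1 = [("T2", "R", "A"), ("T2", "W", "A"), ("T1", "R", "A"), ("T1", "W", "A")]
print(view_equivalent(s, s))   # True
print(view_equivalent(s, s1))  # False: initial reader and reads-from differ
```

To test view serializability with this sketch, one would compare a schedule against each of the n! serial orderings, exactly as the 3! = 6 candidates are enumerated above.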
Q8) What is Recoverable schedules?
A8) Due to a software fault, system crash, or hardware
malfunction, a transaction may not finish completely. The
unsuccessful transaction must be rolled back in this scenario.
However, the value generated by the unsuccessful transaction
may have been utilised by another transaction. As a result, we
must also rollback those transactions. Table displays a schedule
with two transactions.
T1 reads and writes the value of A, which is then read and written by T2. T2 commits, but T1 fails later. Because of the failure, we must roll back T1. T2 should also be rolled back because it reads the value written by T1, but T2 cannot be rolled back because it has already committed.
Irrecoverable schedule: If Tj reads the value updated by Ti and Tj commits before Ti commits, the schedule is irrecoverable.
A schedule with two transactions is shown in Table 2 above. Transaction T1 reads and writes A, and transaction T2 reads and writes that value. T1 eventually fails, so T1 must be rolled back. Because T2 has read the value written by T1, T2 should also be rolled back. Since T2 has not committed before T1, we can roll back transaction T2 as well. As a result, the schedule can be recovered with cascading rollback.
Recoverable with cascading rollback: If Tj reads the modified value of Ti, the schedule can be recovered with cascading rollback, provided Tj's commit is delayed until after Ti commits.
A schedule with two transactions is shown in Table 3 above. T1 commits after reading and writing A, and only then does T2 read and write that value. As a result, this is a cascadeless recoverable schedule.
Q9) What are non-recoverable schedules?
A9) Non-recoverable schedules
T1              T2
R(A)
W(A)
                R(A)
                Commit
R(B)
● Here, T2 commits immediately after R(A).
● Suppose that, after T2 commits, T1 fails before it commits.
● T1 must be rolled back. As T2 has read the value of A written by T1, T2 must also be rolled back to preserve atomicity.
● But as T2 has already committed, it is not possible to abort & restart it.
● So, it is impossible to recover from the failure of T1.
So, the schedule is not recoverable.
E.g.
T1              T2
r(x)
w(x)
                r(x)
r(y)
w(y)
                Commit;
Commit;
The schedule is non-recoverable because T2 reads x, which is modified by T1, but T2 commits before T1 commits.
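Recoverability can be checked from the reads-from relationships and the commit order: every Tj that reads a value written by Ti may commit only after Ti commits. A sketch:

```python
def is_recoverable(schedule):
    """schedule: list of (transaction, action, item-or-None), action in R/W/C."""
    last_writer, reads_from, committed = {}, set(), []
    for txn, action, item in schedule:
        if action == "W":
            last_writer[item] = txn
        elif action == "R" and item in last_writer and last_writer[item] != txn:
            reads_from.add((txn, last_writer[item]))
        elif action == "C":
            # Tj may commit only after every Ti it read from has committed.
            if any(ti not in committed for tj, ti in reads_from if tj == txn):
                return False
            committed.append(txn)
    return True

# Non-recoverable: T2 reads x written by T1 but commits before T1.
bad = [("T1", "R", "x"), ("T1", "W", "x"), ("T2", "R", "x"),
       ("T2", "C", None), ("T1", "C", None)]
# Recoverable: T2's commit is delayed until after T1 commits.
good = [("T1", "R", "x"), ("T1", "W", "x"), ("T2", "R", "x"),
        ("T1", "C", None), ("T2", "C", None)]
print(is_recoverable(bad), is_recoverable(good))
# False True
```

The bad schedule matches the example above: swapping the two commits is all it takes to make it recoverable.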
Q10) Explain Locking methods?
A10) Locking methods
i. A lock is a variable associated with a data item that describes the status of the item with respect to the possible operations that can be applied to it.
ii. Generally, there is one lock for each data item in the database.
iii. Locks are used as a means of synchronizing access by concurrent transactions to the database items.
Locking Models: -
There are two models in which a data item may be locked: -
a. Shared (Read) Lock Mode &
b. Exclusive (write) Lock Mode
a. Shared-Lock Mode: -
● If a transaction Ti has obtained a shared mode lock on item Q,
then Ti can read Q but cannot modify/write Q
● Shared lock is also called as a Read-lock because other
transactions are allowed to read the item.
● It is denoted by S.
Eg.
T1              T2
Lock-S(A);
                Lock-S(B);
T1 has locked A in shared mode & T2 has also locked B in shared mode.
b. Exclusive-Lock Mode: -
● If a transaction Ti has obtained an exclusive-mode lock on item Q, then Ti can both read & write Q.
● An exclusive lock is also called a write-lock because a single transaction exclusively holds the lock on the item.
● It is denoted by X.
T1              T2
Lock-X(A);
                Lock-X(B);
● T1 has locked A in exclusive mode & T2 has locked B in exclusive mode.
Q11) What is deadlock handling?
A11) A deadlock occurs when two or more transactions are stuck
waiting for one another to release locks indefinitely. Deadlock is
regarded to be one of the most dreaded DBMS difficulties because
no work is ever completed and remains in a waiting state
indefinitely.
For example, transaction T1 has a lock on some entries in the
student table and needs to update some rows in the grade table.
At the same time, transaction T2 is holding locks on some rows in
the grade table and needs to update the rows in the Student table
that Transaction T1 is holding.
The primary issue now arises. T1 is currently waiting for T2 to
release its lock, and T2 is currently waiting for T1 to release its
lock. All actions come to a complete halt and remain in this state.
It will remain at a halt until the database management system
(DBMS) recognizes the deadlock and aborts one of the
transactions.
Fig: Deadlock
Deadlock Avoidance
Rather than letting the database become stuck in a deadlock and then aborting or restarting transactions, it is preferable to avoid the deadlock altogether, since aborting and restarting wastes both time and resources.
Any deadlock condition is detected in advance using a deadlock
avoidance strategy. For detecting deadlocks, a method such as
"wait for graph" is employed, although this method is only
effective for smaller databases. A deadlock prevention strategy
can be utilized for larger databases.
Deadlock Detection
When a transaction in a database waits endlessly for a lock, the database management system (DBMS) should determine whether the transaction is involved in a deadlock. The lock manager maintains a wait-for graph to detect deadlock cycles in the database.
Wait for Graph
This is a suitable method for detecting deadlock. It builds a graph based on the transactions and the locks they hold and request: there is a deadlock if the constructed graph contains a cycle. The system maintains a wait-for graph entry for each transaction that is waiting for data held by another transaction, and keeps checking the graph for cycles.
Fig: Wait for a graph
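Deadlock detection on the wait-for graph reduces to cycle detection. A sketch, with the graph given as a dictionary of waiting transactions:

```python
def has_deadlock(wait_for):
    """wait_for: dict mapping a transaction to the transactions it waits on."""
    def reachable(start, node, seen):
        for nxt in wait_for.get(node, ()):
            if nxt == start:
                return True              # found a cycle back to the start
            if nxt not in seen:
                seen.add(nxt)
                if reachable(start, nxt, seen):
                    return True
        return False
    return any(reachable(t, t, set()) for t in wait_for)

# T1 waits for T2 (grade-table locks) and T2 waits for T1 (student-table locks).
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))  # True
print(has_deadlock({"T1": ["T2"], "T2": []}))      # False
```

The first call reproduces the student/grade example above: each transaction waits on the other, so the graph contains a cycle and the DBMS must abort one of them.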
Deadlock Prevention
A huge database can benefit
from a deadlock prevention strategy. Deadlock can be avoided if
resources are distributed in such a way that deadlock never
arises.
The database management system examines the transaction's
operations to see if they are likely to cause a deadlock. If they do,
the database management system (DBMS) never allows that
transaction to be completed.
Q12) Describe Time stamping Methods?
A12) Time stamping methods
● Time stamp is a unique identifier created by the DBMS to
identify a transaction.
● These values are assigned, in the order in which the
transactions are submitted to the system, so timestamp can be
thought of as the transaction start time.
● It is denoted as Ts(T).
● Note: - Concurrency control techniques based on timestamp
ordering do not use locks, here deadlock cannot occur.
Generation of timestamp: -
Timestamps can be generated in several ways like: -
● Use a counter that is incremented each time its value is assigned to a transaction. Transaction timestamps are numbered 1, 2, 3, ... in this scheme. The system must periodically reset the counter to zero when no transactions are executing for some short period of time.
● Another way is to use the current date/time of the system clock.
● Two timestamp values are associated with each database
item X: -
1) Read-Ts(X): -
i. The read timestamp of X.
ii. Read-Ts(X) = Ts(T), where T is the youngest transaction that has read X successfully.
2) Write-Ts(X): -
i. The write timestamp of X.
ii. Write-Ts(X) = Ts(T), where T is the youngest transaction that has written X successfully.
These timestamps are updated whenever a new read(X) or
write(X) instruction is executed.
Timestamp Ordering Algorithm: -
a. Whenever some transaction T issues a Read(X) or Write(X)
operation, the algorithm compares Ts(T) with Read-Ts(X) &
Write-Ts(X) to ensure that the timestamp order of execution is not
violated.
b. If violate, transaction is aborted & resubmitted to system as a
new transaction with new timestamp.
c. If T is aborted & rolled back, any transaction T1 that may
have used a value written by T must also be rolled back.
d. Similarly, any transaction T2 that may have used a value
written by T1 must also be rolled back & so on.
So, the cascading rollback problem is associated with this algorithm.
Algorithm: -
1) If transaction T issues a write(X) operation: -
a) If Read-Ts(X) > Ts(T)
OR
Write-Ts(X) > Ts(T)
Then abort & rollback T & reject the operation.
This is done because some younger transaction with a timestamp
greater than Ts(T) has already read/written the value of X before
T had a chance to write(X), thus violating timestamp ordering.
b) If the condition in (a) does not occur then execute write (X)
operation of T & set write-Ts(X) to Ts(T).
2) If a transaction T issues a read(X) operation.
a. If write-Ts(X) > Ts(T), then abort & Rollback T &
reject the operation.
This should be done because some younger transaction with
timestamp greater than Ts(T) has already written the value of X
before T had a chance to read X.
b. If Write-Ts(X) ≤ Ts(T), then execute the read(X) operation of T
& set read-Ts(X) to the larger of Ts(T) & current read-
Ts(X).
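The rules above can be sketched as a single check function; timestamps are integers, the per-item timestamps live in dictionaries, and an abort is reported by returning False:

```python
read_ts, write_ts = {}, {}   # Read-Ts(X) and Write-Ts(X) per data item

def issue(ts, action, item):
    """Apply the basic timestamp-ordering rules; return False if T must abort."""
    if action == "W":
        # Rule 1a: a younger transaction already read or wrote X -> abort T.
        if read_ts.get(item, 0) > ts or write_ts.get(item, 0) > ts:
            return False
        write_ts[item] = ts                             # Rule 1b
    else:  # "R"
        # Rule 2a: a younger transaction already wrote X -> abort T.
        if write_ts.get(item, 0) > ts:
            return False
        read_ts[item] = max(read_ts.get(item, 0), ts)   # Rule 2b
    return True

print(issue(1, "R", "X"))   # True: T with Ts=1 reads X
print(issue(2, "W", "X"))   # True: T with Ts=2 writes X
print(issue(1, "W", "X"))   # False: older T violates timestamp order
print(issue(1, "R", "X"))   # False: X was already written by a younger T
```

The two False results correspond to the abort-and-restart cases of rules 1a and 2a; a restarted transaction would receive a fresh, larger timestamp.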
Q13) Write the advantages and disadvantages of time
stamping?
A13) Advantages: -
1. The timestamp ordering protocol ensures conflict serializability, because conflicting operations are processed in timestamp order.
2. The protocol ensures freedom from deadlock, since no
transaction ever waits.
Disadvantages: -
1. There is a possibility of starvation of long transactions if a sequence of conflicting short transactions causes repeated restarting of the long transaction.
2. The protocol can generate schedules that are not recoverable.
Q14) A two-phase locking protocol variant that requires that all locks be held until the transaction commits is called
1. Lock-point two-phase locking protocol
2. Deadlock two-phase locking protocol
3. Strict two-phase locking protocol
4. Rigorous two-phase locking protocol
A14) The correct answer is 4, the rigorous two-phase locking protocol.
Q15) How to recover from transaction failure?
A15) When a transaction fails to execute or reaches a point
where it can't go any further, it is called a transaction failure.
Transaction failure occurs when a number of transactions or
processes are harmed.
Logical errors: When a transaction cannot complete owing to a
coding error or an internal error situation, a logical error occurs.
System error: When a database management system (DBMS) terminates an active transaction because the database system is unable to execute it, this is known as a system error. For example, in the event of a deadlock or resource unavailability, the system aborts an active transaction.
When a system crashes, numerous transactions may be running
and various files may be open, allowing them to modify data
items. Transactions are made up of a variety of operations that
are all atomic. However, atomicity of transactions as a whole
must be preserved, which means that either all or none of the
operations must be done, depending on the ACID features of
DBMS.
When a database management system (DBMS) recovers after a
crash -
● It should check the statuses of all the transactions that were
in progress.
● A transaction may be in the middle of an operation; in this
situation, the DBMS must ensure the transaction's atomicity.
● It should determine if the transaction can be completed at
this time or whether it has to be turned back.
● There would be no transactions that would leave the DBMS in
an inconsistent state.
Maintaining the logs of each transaction and writing them onto
some stable storage before actually altering the database are two
strategies that can assist a DBMS in recovering and maintaining
the atomicity of a transaction.
There are two sorts of approaches that can assist a DBMS in
recovering and sustaining transaction atomicity.
● Maintaining shadow paging, in which updates are made in
volatile memory and the actual database is updated afterwards.
● Before actually altering the database, keep track of each
transaction's logs and save them somewhere safe.
Q16) How are transactions implemented in the database?
A16) Even the simple update of one record contains these substeps within the system:
1. Locate the record to be updated in secondary storage.
2. Transfer the disk block into a memory buffer.
3. Make the update to the tuple in the buffer.
4. Write the modified block back out to disk.
5. Make an entry in the log.
More complicated transactions may involve several separate SQL updates.
Between each step there is an opportunity for another user's or another transaction's step to occur.
Also, the modified buffer block may not be written to disk immediately after the transaction terminates. We must assume there is a delay before the actual write completes, e.g., the block remains in the disk cache.
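That delay before the actual write is the reason the log entry is made first. A minimal write-ahead sketch with invented structures (a list for the log, dictionaries for disk and buffer):

```python
log, disk, buffer = [], {"salary": 1500.0}, {}

def update(txn, item, new_value):
    """Buffer the change, and append a log record before any disk write."""
    buffer[item] = new_value
    log.append((txn, item, disk[item], new_value))   # (txn, item, old, new)

def flush(item):
    """The delayed physical write: the block may sit in the cache until now."""
    disk[item] = buffer[item]

update("T1", "salary", 2000.0)
assert disk["salary"] == 1500.0   # not yet on disk: the write is delayed
flush("salary")
print(disk["salary"], log)
# 2000.0 [('T1', 'salary', 1500.0, 2000.0)]
```

If a crash occurred between update() and flush(), the log record holding both the old and new values is what would let recovery redo or undo the change.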
Q17) Explain lock compatibility matrix?
A17) Lock-Compatibility Matrix: -
        S       X
S       ✓       ✗
X       ✗       ✗
● If a transaction Ti has locked Q in shared mode, then another transaction Tj can lock Q in shared mode only. This is called compatible mode.
● If a transaction Ti has locked Q in S mode, another
transaction Tj can’t lock Q in X mode. This is called incompatible
mode.
● Similarly, if a transaction Ti has locked Q in X mode, then
other transaction Tj can’t lock Q in either S or X mode.
● To access a data item, transaction Ti must first lock that item.
● If the data item is already locked by another transaction in an incompatible mode, the concurrency-control manager will not grant the lock until the incompatible lock has been released.
● Thus, Ti is made to wait until all incompatible locks have been released.
● A transaction can unlock a data item with the unlock(Q) instruction.
Eg. 1. Transaction T1 displays the total amount of money in accounts A & B.
Without locking: -
T1
R(A)
R(B)
Display(A+B)
With locking: -
T1
Lock-S(A)
R(A)
Unlock(A)
Lock-S(B)
R(B)
Unlock(B)
Display(A+B)
T1 locks A & B in shared mode because T1 uses A & B only for
reading purposes.
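The matrix can be written down directly and used by a tiny lock manager that grants a request only when it is compatible with every lock already held; a sketch (a real manager would queue the waiting transaction rather than simply refuse):

```python
# True means the requested mode is compatible with an already-held mode.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

held = {}   # data item -> list of (transaction, mode) currently holding a lock

def request_lock(txn, item, mode):
    """Grant the lock if compatible with all holders; otherwise the txn waits."""
    for holder, held_mode in held.get(item, []):
        if holder != txn and not COMPATIBLE[(held_mode, mode)]:
            return False                       # incompatible: must wait
    held.setdefault(item, []).append((txn, mode))
    return True

print(request_lock("T1", "Q", "S"))  # True: first lock on Q
print(request_lock("T2", "Q", "S"))  # True: S is compatible with S
print(request_lock("T3", "Q", "X"))  # False: X is incompatible with held S
```

The three calls walk the matrix row by row: only the S/S cell is compatible, so the exclusive request is the one made to wait.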
Q18) Describe two phase locking protocols?
A18) Two-Phase Locking Protocol: -
● Each transaction in the system should follow a set of rules,
called locking protocol.
● A transaction is said to follow the two-phase locking protocol if all locking operations (shared or exclusive) precede the first unlock operation.
● Such a transaction can be divided into two phases: a growing phase & a shrinking phase.
● During the expanding/growing phase, new locks on items can be acquired but none can be released.
● During the shrinking phase, existing locks can be released but no new locks can be acquired.
● If every transaction in a schedule follows the two-phase locking protocol, the schedule is guaranteed to be serializable, obviating the need to test for serializability.
● Consider, following schedules S11, S12. S11 does not follow
two-phase locking &S12 follows two-phase locking protocol.
T1
Lock-S(Y)
R(Y)
Unlock(Y)
Lock-X(Q)
R(Q)
Q := Q + Y;
W(Q);
Unlock(Q);
This schedule does not follow two-phase locking because the Lock-X(Q) instruction appears after the Unlock(Y) instruction.
T1
Lock-S(Y)
R(Y)
Lock-X(Q)
R(Q)
Q := Q + Y;
W(Q);
Unlock(Q);
Unlock(Y);
This schedule follows two-phase locking because all lock instructions appear before the first unlock instruction.
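Whether a sequence of lock operations obeys two-phase locking can be checked by a single scan: no lock may follow the first unlock. A sketch using the two schedules above:

```python
def follows_two_phase_locking(operations):
    """operations: list like ('lock', item) or ('unlock', item) in issue order."""
    unlocked = False
    for action, _item in operations:
        if action == "unlock":
            unlocked = True                # the shrinking phase has begun
        elif action == "lock" and unlocked:
            return False                   # a lock after the first unlock: violation
    return True

# S11 from the text: Lock-X(Q) appears after Unlock(Y), so it is not 2PL.
s11 = [("lock", "Y"), ("unlock", "Y"), ("lock", "Q"), ("unlock", "Q")]
# S12: all locks precede the first unlock, so it is 2PL.
s12 = [("lock", "Y"), ("lock", "Q"), ("unlock", "Q"), ("unlock", "Y")]
print(follows_two_phase_locking(s11), follows_two_phase_locking(s12))
# False True
```

The check deliberately ignores lock modes: growing versus shrinking depends only on the position of lock operations relative to the first unlock.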
DBM
Unit - 5
Parallel Databases
Q1) Explain multi user database architecture?
A1) The following are the most typical architectures for
implementing multi-user database management systems:
● Teleprocessing
● File-Server
● Client-Server
Teleprocessing
There is a single computer with a single CPU and several
terminals.
Within the same physical computer, processing takes place. User
terminals are often "dumb," unable to function alone, and are
connected to a central computer.
Fig 1: Teleprocessing
File server
● Several workstations are connected to a file server through a
network.
● The database is stored on a file server.
● On each workstation, a database management system
(DBMS) and applications are installed.
● The following are some disadvantages:
○ There is a lot of network traffic.
○ Each workstation has a copy of the DBMS.
○ More difficult to control concurrency, recovery, and integrity.
File server architecture
The processing in a file-server system is distributed over the
network, which is usually a local area network (LAN).
Fig 2: File server
Traditional Two-tier Client-Server Architecture
The client (tier 1) is in charge of the user interface and the
execution of applications.
The database and database management system (DBMS) are
stored on the server (tier 2).
Benefits include:
● Wider access to existing databases;
● Enhanced performance;
● Probable cost savings on hardware;
● Cost savings on communication;
● Increased consistency.
Fig 3: Traditional two-tier client server architecture
Q2) Write a case study on oracle architecture?
A2) At least one database instance and one database make up an
Oracle Database. Memory and processes are managed by the
database instance. The database can be a non-container
database or a multitenant container database, and it is made up
of physical files called data files. During the operation of an
Oracle Database, various database system files are used.
One database instance and one database make up a single-
instance database design. The database and the database
instance have a one-to-one relationship. On the same server
system, many single-instance databases can be installed. For
each database, there are different database instances. This
configuration is handy for running multiple Oracle Database
versions on the same machine.
Oracle Real Application Clusters (Oracle RAC) databases are made
up of many instances that run on different server machines, all of
which have access to the same database. To end users and
applications, the cluster of server machines appears as a single
server. This setup is optimized for high availability, scalability,
and overall performance.
The listener is a database server process. It receives client
connection requests, establishes the database connection, and
then hands the client off to a server process. The listener can run
locally on the database server or remotely; in Oracle RAC
environments it is typically run remotely.
Fig 4: This interactive diagram shows the Oracle Database 18c
technical architecture
In the early days of computers, technology simply improved the
efficiency of manual procedures. New innovations enabled new
capabilities and procedures in the company that were driven by IT
as technology progressed. IT gradually altered the business,
although not always in a way that was in line with the company's
strategy. This misalignment resulted in considerable resource
waste and missed opportunities, as well as putting the company
at a competitive disadvantage in the marketplace. Enterprise
Architecture is a novel strategy to manage IT that has been
designed to connect business and IT strategies.
Enterprise architecture
Enterprise Architecture (EA) is a method and organizational
principle for aligning functional business goals and strategies with
an IT strategy and implementation plan. Enterprise Architecture
serves as a roadmap for organizations' technological progress and
change. As a result, IT becomes a more strategic asset for
achieving a modern corporate strategy.
Typical outcomes from an Enterprise Architecture are:
● Enterprise Architecture Model in Its Current State
● The reference model for Future State Enterprise Architecture
is required to carry out the proposed business strategy.
● Gap analysis that highlights the present state's shortcomings
in terms of its capacity to support the business Architecture
Roadmap's objectives and strategies, as well as the activities
required to migrate from the current to the future state.
An EA ensures that the corporate goals and objectives are
handled holistically across all IT initiatives by taking an
enterprise-wide perspective spanning business services, business
processes, information, applications, and technology.
Enterprise architecture is about continual communication
between business and IT leaders as much as it is about
technological advancements and architectural decisions.
The Oracle Enterprise Architecture Framework
Oracle built a hybrid EA framework influenced by TOGAF, FEA,
and Gartner in order to deliver an efficient, business-driven
framework to help our customers align their IT and business
strategy. The Oracle Enterprise Architecture Framework (OEAF) is
simple yet useful and prescriptive. With unambiguous mappings
to TOGAF and FEA, the OEAF complements other EA frameworks,
allowing clients to utilize the EA framework of their choice. The
OEAF was created with the goal of combining the benefits of
several industry frameworks with Oracle's experience in
producing enterprise solutions.
The Oracle Enterprise Application Framework's fundamental
premise is to give "just enough" structure that may be
constructed "just in time" to satisfy the organization's business
requirements. Furthermore, the OEAF provides a well-known
architectural platform for sharing Oracle's considerable
intellectual capital surrounding enterprise IT solutions with
customers and partners, increasing Oracle's strategic business
value proposition.
There are nine essential values in the Oracle Enterprise
Application Framework.
● Business strategy is at the heart of all we do.
● The technical architecture is standardized and simplified.
● Modeling that is "just enough" for enterprise solution
architecture efforts.
● Recycles industrial and commercial suppliers' best-practice
business models and reference architectures.
● For high-level guidance, the first focus is on speed of delivery.
● Created in collaboration with business owners, stakeholders,
and expert architects.
● For breadth and depth, it is developed repeatedly and evolves
through time.
● It can be enforced.
● It is technology agnostic, however it does make use of
Oracle's knowledge and intellectual property.
These concepts serve as the foundation for mapping business
needs to IT execution in agile enterprise architecture.
Components of Oracle Enterprise Architecture Framework
There are seven main components that make up the Oracle
Enterprise Architecture Framework.
Business architecture
Business Architecture should be the starting point for every
architectural debate. The Business Architecture links an
organization's operating model, plans, and objectives with IT, as
well as creating a business case for IT reforms and providing a
functional view of the enterprise.
Fig 5: Oracle Enterprise Architecture Framework Components
Application architecture
In accordance with the application strategy, the Application
Architecture provides an application- and services-centric view of
an organization that relates business operations and services to
application processes and services to application components.
The scope, strategy, and standards of the Application Architecture
are all influenced by the Business Architecture.
Information architecture
The Information Architecture outlines all of the moving
components involved in managing information across the
enterprise and sharing it with the right people at the right time in
order to achieve the business goals outlined in the business
architecture.
Technology Architecture
The Technology Architecture specifies the organization of the
infrastructure that supports the business, application, and
information architectures.
People, Process, and Tools
The people, methods, and technologies utilized to define
enterprise architectures and architecture solutions are identified
in this section of the framework.
People - Enterprise architecture teams and individuals are
charged with a variety of activities, including architectural
creation, management, implementation, and governance.
Process - A set of architectural processes chosen and followed to
drive an architecture engagement down a path that increases the
likelihood of a successful implementation while minimizing
resource expenditure.
Tools - A combination of tools and technologies that let you create
and manage enterprise architectures faster. The majority of these
products fall into the modeling (for example, ARIS IT Architect,
Oracle BPA Suite), portfolio management, and architecture asset
repository (for example, Oracle Enterprise Repository) categories.
Enterprise Architecture Governance
The framework and mechanisms for achieving an organization's
business strategy and objectives through an Enterprise
Architecture are provided by Enterprise Architecture governance.
During IT transformations and solution deployments, an EA
governance body is utilized to guide each project and assure
alignment with the EA.
Q3) Write the performance parameter for a parallel
database?
A3) The following are some criteria for evaluating the
performance of parallel databases:
1. Response time - The amount of time it takes to
complete a single task.
2. Speed up in the parallel database - The process of
raising the degree of parallelism (adding resources) in order
to accomplish a running job in less time is known as
speed-up.
The time it takes to complete a task is inversely related to the
quantity of resources available.
Formula:
Speed up = TS / TL
Where,
TS = time required to execute the task on the smaller system
TL = time required to execute the same task on the larger system
with N times the resources
● Speed-up is linear if it equals N (the number of resources).
● Speed-up is sub-linear if it is less than N.
Fig 6: Speed up in parallel database
3. Scale up in the parallel database - Scale-up refers to
the ability to maintain performance when the number of
processes and resources rises correspondingly.
Formula:
Let Q be a task and QN a task N times larger than Q.
TS = execution time of task Q on the smaller machine MS
TL = execution time of task QN on the larger machine ML
Scale Up = TS / TL
● Scale-up is linear if it equals 1.
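The two formulas can be illustrated with a short sketch; all timing numbers below are invented for the example, and the function names are not from any library.

```python
# Sketch: computing speed-up and scale-up from measured execution
# times. All timing values here are hypothetical.

def speed_up(ts, tl):
    """ts: time for a task on the smaller system;
    tl: time for the SAME task on the larger system (N times resources)."""
    return ts / tl

def scale_up(ts, tl):
    """ts: time for task Q on the smaller machine MS;
    tl: time for task QN (N times larger) on the larger machine ML."""
    return ts / tl

N = 4  # resources grew fourfold
print(speed_up(100.0, 25.0))      # 4.0 -> linear speed-up (equals N)
print(speed_up(100.0, 40.0) < N)  # True -> sub-linear speed-up
print(scale_up(100.0, 100.0))     # 1.0 -> linear scale-up
```

Both metrics are ratios of execution times; speed-up fixes the task and grows the resources, while scale-up grows the task and the resources together.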
Fig 7: Scale up in parallel database
Q4) Write the types of parallel database architecture?
A4) A parallel database management system (DBMS) is a
database management system that runs on several processors or
CPUs and is primarily designed to execute query operations in
parallel whenever possible. The parallel DBMS connects several
smaller machines together to obtain the same throughput as a
single large system.
There are three architectural ideas for parallel DBMS in Parallel
Databases. The following are the details:
● Shared Memory Architecture
● Shared Disk Architecture
● Shared Nothing Architecture
Q5) What is shared memory architecture?
A5) Multiple processors are connected to a global shared memory
via an intercommunication channel or communication bus in a
shared memory system.
Each processor in a shared memory system has a considerable
quantity of cache memory, which reduces the number of
references to the shared memory.
When a processor performs a write to a memory location, the
cached copies of that location should be updated or invalidated.
Fig 8: Shared memory system in parallel database
Advantages
● Any processor can simply access data.
● One processor can efficiently convey messages to another.
Disadvantages
● As the number of processors increases, contention for the
shared memory increases, which slows processing.
● The shared communication bus has limited bandwidth.
Q6) Describe shared disk architecture?
A6) A shared disk system has many processors, each of which
has local memory and is connected to numerous disks via an
intercommunication channel.
Because each processor has its own memory, data sharing is
quick.
Clusters are the systems that are constructed around this system.
Fig 9: Shared disk system in parallel database
Advantages
● A shared disk system is used to achieve fault tolerance.
● If one CPU or its memory fails, the task can be completed by
another processor. Fault tolerance is the term for this.
Disadvantages
● Because a substantial amount of data flows across the
interconnection channel, the scalability of a shared disk system is
constrained.
● Existing processors are delayed as more processors are
added.
Application
Digital Equipment Corporation (DEC): DEC clusters running
relational databases (such as Rdb, now owned by Oracle) used
the shared disk system.
Q7) What is shared nothing architecture?
A7) In the shared nothing system, each CPU has its own local
memory and disk.
Intercommunication channels allow processors to communicate
with one another.
Any CPU can be used as a server to serve data from a local disk.
Fig 10: Shared nothing system in parallel database
Advantages
● In a shared nothing system, the number of processors and
disks can be increased as needed.
● The shared nothing system can handle several processors,
making it more scalable.
Disadvantages
● In a shared nothing system, data partitioning is required.
● The cost of communication for accessing a non-local disk is
substantially higher.
Application
● Database machine with terabytes of data.
● The research prototypes Grace and Gamma.
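Because data must be partitioned in a shared nothing system, each row has to be assigned to exactly one node. A common approach is hash partitioning on a key; the node count and row format below are assumptions made for this illustration.

```python
# Sketch: hash-partitioning rows across the nodes of a shared nothing
# system. The node count and the example rows are hypothetical.

NUM_NODES = 4

def node_for(key):
    # Python's built-in hash() is fine for an illustration; real systems
    # use a stable hash so the mapping survives restarts.
    return hash(key) % NUM_NODES

nodes = {i: [] for i in range(NUM_NODES)}
rows = [("emp1", "Sales"), ("emp2", "HR"), ("emp3", "IT"), ("emp4", "Sales")]

for key, dept in rows:
    # Each row is stored on exactly one node's local disk.
    nodes[node_for(key)].append((key, dept))

total = sum(len(stored) for stored in nodes.values())
print(total)  # 4 -> every row assigned to exactly one node
```

A query for a given key can then be routed directly to `node_for(key)`, so only that node's local disk is touched, which is what makes the architecture scale.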
Q8) Evaluate parallel query in parallel database?
A8) The following are the two methods for evaluating queries:
1. Inter query parallelism
This method allows you to run many queries on several
processors at the same time.
Inter-query parallelism increases the system's throughput.
For instance, if there are six queries, each will take three seconds
to evaluate. As a result, the entire evaluation procedure took 18
seconds to complete. This task takes only 3 seconds thanks to
inter query parallelism.
Inter query parallelism, on the other hand, is tough to achieve
every time.
2. Intra Query Parallelism
This approach divides a question into sub queries that can run on
many processors at the same time, reducing query evaluation
time.
Intra-query parallelism enhances the system's response time.
For instance, if we have 6 queries, each of which takes 3 seconds
to evaluate, the total time to finish the evaluation process serially
is 18 seconds. However, because each query is divided into sub-
queries that run in parallel, each query can be completed in
under 3 seconds using intra-query evaluation.
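The intra-query idea of splitting one query into sub-queries over data partitions can be sketched with a thread pool; the partitioned data and the aggregate query below are invented for this example.

```python
# Sketch: intra-query parallelism - one aggregate query is split into
# sub-queries, one per horizontal partition, run concurrently, and the
# partial results are combined. The data is hypothetical.
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal partitions of a "salaries" relation.
partitions = [[100, 200], [300, 400], [500]]

def sub_query(partition):
    # Each sub-query computes a partial SUM over its own partition.
    return sum(partition)

# Run all sub-queries at the same time, one worker per partition.
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    partials = list(pool.map(sub_query, partitions))

print(sum(partials))  # 1500 -> same answer as evaluating the query serially
```

The combine step (summing the partials) is what a parallel query executor does after the longest sub-query finishes, so the elapsed time approaches that of the largest partition rather than the whole relation.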
Optimization of Parallel Query
● Selecting the most effective query evaluation strategy is what
parallel query optimization is all about.
● Parallel query optimization is critical in the development of
systems that reduce the cost of query evaluation.
Two factors play a very important role in parallel query
optimization:
a) the total amount of time it takes to discover the optimal plan;
b) the length of time it takes to carry out the plan.
Query Optimization is done with an aim to:
● Find the queries that can produce the fastest results on
execution to speed up the inquiries.
● Increase the system's performance.
● Choose the optimal query evaluation strategy.
● Avoid the unwelcomed strategy.
Q9) Describe Virtualization on multicore processors?
A9) Virtualization on multicore processors
● By adding more CPUs, the computer's processing power is
boosted. Virtualization is the name given to this method.
● Multiple CPUs are added to the host machine, which improves
the system's performance.
● There are no set guidelines for increasing the number of
CPUs.
● Multicore processors are capable of solving complex
processing problems.
● This method is employed in the processing of big loads.
The diagram below shows a virtualization on a multicore
processor.
Fig 11: Virtualization on multicore processor
Q10) Write the advantages of parallel databases?
A10) The benefits of the parallel database are explained below −
Speed
Speed is the main advantage of parallel databases. The server
breaks a user's database request into parts and sends each part
to a separate computer.
The computers work on their parts in parallel, and the results are
combined and returned to the user. This speeds up most requests
for data, so even large databases can be accessed more quickly.
Capacity
As more users request access to the database, the network
administrators are adding more machines to the parallel server,
increasing their overall capacity.
For example, a parallel database enables a large online store to
serve information to thousands of users at the same time. With a
single server, this level of performance is not feasible.
Reliability
Despite the failure of any computer in the cluster, a properly
configured parallel database will continue to work. The database
server senses that there is no response from a single computer
and redirects its function to the other computers.
Many companies, such as online retailers, want their database to
be accessible as fast as possible. This is where a parallel database
stands good.
This approach also lets technicians conduct scheduled
maintenance one computer at a time. They send the server a
command to take the affected machine offline, then perform the
required maintenance and updates.
Q11) Write about parallel database and need also?
A11) Nowadays, businesses must manage a large amount of data
at a rapid transfer rate. The client-server or centralized solution is
ineffective for such needs. The concept of a parallel database
comes into play when the system's efficiency needs to be
improved. A parallel database system uses the parallelizing
principle to improve the system's performance.
A parallel database is one in which multiple processors work in
parallel on the database to provide its services. A parallel
database system seeks to improve performance through the
parallelization of various operations such as loading data, building
indexes, and evaluating queries. Parallel systems improve
processing and I/O speeds by using multiple CPUs and disks in
parallel.
Needs
The usage of several resources, such as CPUs and disks, is done
in parallel. In contrast to serial processing, the processes are
carried out simultaneously. A parallel server allows users on
several workstations to access a single database.
Q12) Write the difference between shared nothing and
shared disk architecture?
A12) Difference between shared nothing and shared disk
architecture
Shared Nothing Architecture | Shared Disk Architecture
The nodes do not share memory or storage. | The nodes share memory as well as storage.
It has fixed load balancing. | It has dynamic load balancing.
The disks have individual nodes which cannot be shared. | The disks have active nodes which are shared in case of failures.
Its advantage is that it has high availability. | Its advantage is that it has unlimited scalability.
Its hardware is cheaper as compared to shared disk architecture. | The hardware in shared disk is comparatively expensive.
The data is strictly partitioned. | The data is not partitioned.
Q13) What are the advantages and disadvantages of shared
memory architecture?
A13) Advantages
● Any processor can simply access data.
● One processor can efficiently convey messages to another.
Disadvantages
● As the number of processors increases, contention for the
shared memory increases, which slows processing.
● The shared communication bus has limited bandwidth.
Q14) Write the advantages and disadvantages of shared
disk architecture?
A14) Advantages
● A shared disk system is used to achieve fault tolerance.
● If one CPU or its memory fails, the task can be completed by
another processor. Fault tolerance is the term for this.
Disadvantages
● Because a substantial amount of data flows across the
interconnection channel, the scalability of a shared disk system is
constrained.
● Existing processors are delayed as more processors are
added.
Q15) Write the advantages and disadvantages of shared
nothing architecture?
A15) Advantages
● In a shared nothing system, the number of processors and
disks can be increased as needed.
● The shared nothing system can handle several processors,
making it more scalable.
Disadvantages
● In a shared nothing system, data partitioning is required.
● The cost of communication for accessing a non-local disk is
substantially higher.
DBM
Unit - 6
Distributed Databases
Q1) What is a distributed database system?
A1) A distributed database is a database that is not restricted to
a single system and is dispersed across numerous places, such as
multiple computers or a network of computers. A distributed
database system is made up of multiple sites with no physical
components in common. This may be necessary if a database
needs to be viewed by a large number of people all over the
world. It must be administered in such a way that it appears to
users as a single database.
A distributed database system (DDBS) is a database that does not
have all of its storage devices connected to the same CPU. It
might be stored on numerous computers in the same physical
place, or it could be spread throughout a network of connected
computers. Simply said, it is a logically centralized yet physically
dispersed database system. It's a database system and a
computer network all rolled into one. Storage and query
processing in distributed database systems remain among the
most serious challenges in today's database systems.
Fig 1: Distributed database
A distributed database management system (DDBMS) is a
centralized software system that manages a distributed database
as if it were all kept in a single location.
Features
● It's used to create, retrieve, update, and destroy databases
that are distributed.
● It synchronizes the database on a regular basis and provides
access mechanisms, making the distribution transparent to the
users.
● It ensures that data modified on any site is updated across
the board.
● It's utilized in applications where a lot of data is processed
and accessible by a lot of people at the same time.
● It's made to work with a variety of database platforms.
● It protects the databases' confidentiality and data integrity.
Q2) What are the factors that encourage DDBMS?
A2) The following factors support switching to a DDBMS:
● Distributed Nature of Organizational Units - In today's
world, most businesses are separated into many parts that are
physically dispersed across the globe. Each unit necessitates its
own collection of local information. As a result, the organization's
whole database is scattered.
● Need for Sharing of Data - Various organizational units
must frequently communicate with one another and share data
and resources. This necessitates the usage of shared databases
or replicated databases that are synchronized.
● Support for Both OLTP and OLAP - Online Transaction
Processing (OLTP) and Online Analytical Processing (OLAP) are
two systems that operate together to process data. By providing
synchronized data, distributed database systems enhance both of
these processes.
● Database Recovery - Data replication over several sites is
one of the most used DDBMS approaches. If a database on any
site is corrupted, data replication automatically aids in data
recovery. While the broken site is being rebuilt, users can access
data from other sites. As a result, database failure may become
almost imperceptible to users.
● Support for Multiple Application Software - The majority
of businesses employ a range of application software, each with
its own database support. DDBMSs provide standardized
capability for sharing data across platforms.
Q3) Write the advantages of distributed database?
A3) The advantages of distributed databases versus centralized
databases are as follows.
● Modular Development − In centralized database systems, if
the system needs to be expanded to additional locations or units,
the activity necessitates significant effort and disruption of
current operations. In distributed databases, on the other hand,
the process is merely moving new computers and local data to
the new site and then connecting them to the distributed system,
with no interruption in present operations.
● More Reliable − When a database fails, the entire
centralized database system comes to a halt. When a component
fails in a distributed system, however, the system may continue
to function but at a lower level of performance.
● Better Response − If data is delivered efficiently, user
requests can be fulfilled from local data, resulting in a speedier
response. In centralized systems, on the other hand, all inquiries
must transit through the central computer for processing,
lengthening the response time.
● Lower Communication Cost − When data is stored locally
where it is most frequently utilized in distributed database
systems, communication costs for data manipulation can be
reduced. In centralized systems, this is not possible.
Q4) Describe a homogenous distributed database?
A4) Homogeneous Database:
In a homogeneous distributed database, all sites store the
database identically. The operating system, database
management system, and data structures used are the same at
all locations. They are therefore easy to manage.
Example: Consider three departments that all use Oracle-9i as
the DBMS. If a change is made in one department's data, the
other departments are updated as well.
Fig 2: Homogeneous distributed system
Types of Homogenous Distributed Database
A homogeneous distributed database can be divided into two
forms.
● Autonomous - Each database is independent and self-
contained. A controlling application integrates them and uses
message passing to share data updates.
● Non - Autonomous - Data is dispersed throughout the
homogenous nodes, and data changes are coordinated across the
sites by a central or master DBMS.
Q5) What is a heterogeneous distributed database?
A5) Heterogeneous Database:
In a heterogeneous distributed database, different sites may use
different schemas and software, which can lead to problems in
query processing and transactions. A particular site may also be
completely unaware of the other sites. Different computers may
use a different operating system and different database
applications, and they may even use different data models for
the database. Therefore, translations are required for different
sites to communicate.
Example: In the following diagram, ODBC and JDBC are used to
render different DBMS applications available to each other.
Fig 3: Heterogeneous distributed system
Types of Heterogeneous Distributed Database
● Federated − The heterogeneous database systems are
independent in nature, but they can be linked together so that
they function as a single database system.
● Un-federated − The database systems are accessed through a
central coordinating module.
Q6) Explain the architecture of distributed database?
A6) The following are some of the most common architectural
models:
Client - server Architecture
In a client-server architecture, a network is made up of a number
of clients and a few servers. One of the servers receives a query
from a client. The first accessible server solves the problem and
responds. Because of the centralized server system, a client-
server architecture is straightforward to develop and execute.
The functionality is divided into servers and clients in this two-
level design. Data management, query processing, optimization,
and transaction management are the main server functions. User
interface is one of the most important client tasks. They do,
however, have some capabilities, such as consistency checking
and transaction management.
There are two distinct client-server architectures.
● Single Server Multiple Client
● Multiple Server Multiple Client
Fig 4: Client - server architecture
Collaborating server architecture
● A collaborative server architecture is one that allows a single
query to be executed across numerous servers.
● The result is delivered to the client after the server breaks
down a single query into many tiny requests.
● A set of database servers makes up a collaborative server
architecture. Each server has the ability to execute current
transactions across databases.
Fig 5: Collaborating server architecture
Middleware architecture
● Middleware designs are built so that a single query can be
processed on numerous servers.
● Only one server is required for this system, and it must be
capable of coordinating requests and transactions from numerous
servers.
● Local servers are used in middleware architecture to handle
local queries and transactions.
● This type of software is known as middleware, and it is used
to execute queries and transactions across one or more separate
database servers.
Q7) Describe replication?
A7) Replication
The complete relation is stored redundantly at two or more sites
in this manner. It is a fully redundant database if the entire
database is available at all sites. As a result, in replication,
systems keep duplicates of data.
This is useful since it increases data availability across several
places. Query queries can now be executed in parallel as well.
It does, however, have some downsides. Data must be updated
on a regular basis. Any modification made at one site must be
recorded at every site where that relationship is saved, or else
inconsistency would result. This is a significant amount of
overhead. Concurrent access must now be checked across
multiple sites, making concurrency control much more difficult.
Advantages of replication
● Reliability - In the event that one of the sites fails, the
database system continues to function because a copy is
available at another site or sites.
● Quicker response - Short query processing and, as a result,
quick response time are ensured by the availability of local copies
of data.
● Reduction in network load - Due to the availability of local
copies of data, query processing can be done with less network
usage, especially during peak hours. Data updates can be
completed during non-peak hours.
Types
A database can be either fully replicated, partially replicated or
unreplicated.
● Full replication - Multiple copies of each database fragment
are stored at different locations. Due to the amount of overhead
put on the system, fully replicated databases may be
impracticable.
● Partial replication - Multiple copies of some database
fragments are stored at various locations. Most DDBMS are
capable of handling this form of replication.
● No replication - Each database fragment is stored at a
single location. There is no duplicate.
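The trade-off described above (every update must reach every copy, while reads are served locally) can be sketched as follows. The class and method names here are purely illustrative and not from any real DDBMS.

```python
# Sketch: full replication - every write is propagated to every site,
# so reads can be answered from any local copy. The class/method names
# are assumptions made for this illustration.

class ReplicatedRelation:
    def __init__(self, sites):
        # One full copy of the relation per site.
        self.copies = {site: {} for site in sites}

    def write(self, key, value):
        # The replication overhead: the write is applied at EVERY site,
        # otherwise the copies would become inconsistent.
        for copy in self.copies.values():
            copy[key] = value

    def read(self, site, key):
        # A read is answered from the local copy -> quick response.
        return self.copies[site][key]

rel = ReplicatedRelation(["site_A", "site_B", "site_C"])
rel.write("emp1", "Sales")
print(rel.read("site_B", "emp1"))  # Sales -> the same value at every site
```

The loop inside `write` is exactly the overhead the text warns about: one logical update costs one physical write per replica, and concurrency control must span all of those sites.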
Q8) Write about fragmentation?
A8) Fragmentation
The relations are broken (i.e., divided into smaller portions) with
this manner, and each of the fragments is kept in multiple
locations as needed. It must be ensured that the fragments can
be utilized to recreate the original relationship (i.e. that no data is
lost).
Fragmentation is useful since it avoids the creation of duplicate
data, and consistency is not an issue.
Advantages of fragmentation
● The database system's efficiency is improved since data is
stored near to the point of use.
● Because data is available locally, local query optimization
techniques are sufficient for most queries.
● The database system's security and privacy can be
maintained because irrelevant data is not available at the sites.
Types
Relationships can be fragmented in the following ways:
Horizontal fragmentation – Splitting by rows – The relation is
broken into groups of tuples, and each tuple is assigned to at
least one fragment.
Vertical fragmentation – Splitting by columns – The
relation's schema is broken into smaller schemas. To ensure a
lossless join, each fragment must have a common candidate key.
Mixed fragmentation (Hybrid) - This is a two-step process of
fragmentation. Horizontal fragmentation is performed first to
obtain the required rows, followed by vertical fragmentation to
divide the attributes among the rows.
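Horizontal and vertical fragmentation can be illustrated on a small relation. The employee data and the fragmentation predicates below are made up for the example; the key point is that vertical fragments keep the candidate key so the join back is lossless.

```python
# Sketch: horizontal and vertical fragmentation of a relation stored as
# a list of dicts. The relation and the split predicates are examples.

employees = [
    {"id": 1, "name": "Asha", "dept": "Sales"},
    {"id": 2, "name": "Ravi", "dept": "HR"},
]

# Horizontal fragmentation: split by rows, here on the dept attribute.
sales_frag = [r for r in employees if r["dept"] == "Sales"]
hr_frag    = [r for r in employees if r["dept"] == "HR"]

# Vertical fragmentation: split by columns; both fragments keep the
# candidate key "id" so the original relation can be rebuilt losslessly.
frag1 = [{"id": r["id"], "name": r["name"]} for r in employees]
frag2 = [{"id": r["id"], "dept": r["dept"]} for r in employees]

# Lossless join: recombine the vertical fragments on "id".
rebuilt = [{**a, **b} for a in frag1 for b in frag2 if a["id"] == b["id"]]
print(rebuilt == employees)  # True -> no data was lost
```

Mixed (hybrid) fragmentation would simply apply the row split first and then the column split to each horizontal fragment.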
1. Data Allocation
The process of determining where to store data is known as data
allocation. It also necessitates a determination of which data
should be stored where. Data can be allocated centrally,
partitioned, or replicated.
● Centralised - The complete database is kept on a single
server. There is no dispersion.
● Partitioned - The database is partitioned into multiple
fragments, each of which is stored at a different location.
● Replicated - Several locations maintain copies of one or
more database fragments.
Q9) What is distributed data storage?
A9) A distributed data store is a computer network in which data
is stored, and typically replicated, across multiple nodes. The
term can refer either to a distributed database or to a network in
which users store information on a number of peer nodes.
Distributed databases of this kind are typically non-relational
and allow rapid data access across a large number of nodes.
Some distributed databases allow for sophisticated querying,
while others are restricted to key-value store semantics. On the
other hand, peer networks let users reciprocate by allowing other
users to use their computers as storage nodes as well. Depending
on the network's design, information may or may
not be visible to other users. Some peer-to-peer networks lack
distributed data stores, which means that a user's data is only
accessible while their node is connected to the network.
Distributed data storage becomes even more important when
complex tasks are involved, because complicated tasks require
complex networks and take a long time to run and implement.
The goal of distributed data storage is to avoid concentrating all
resources on a single job; instead, resources are distributed
evenly across all channels. In practice, the distributed data
storage approach has proven more powerful and resourceful than
stand-alone systems.
A typical gaming system network is the best illustration of a
distributed data store. A central set of servers serves as the
game's backbone, with the rest of the workstations doing
additional tasks.
Thanks to these advantages, there is virtually no procedure that
cannot be run on a distributed database system. A distributed
data store can include any type of device, from a simple cell
phone to a smartwatch. This demonstrates the enormous
potential of cloud distributed database services, as well as the
scope for developing ever more powerful distributed data stores.
Q10) What do you mean by distributed transaction?
A10) A distributed transaction is a set of data operations that
spans two or more data repositories (especially databases). It is
typically coordinated across separate nodes connected by a
network, but may also span multiple databases on a single server.
There are two possible outcomes: 1) all operations successfully
complete, or 2) none of the operations are performed at all due to
a failure somewhere in the system. In the latter case, if some
work was completed prior to the failure, that work will be
reversed to ensure no net work was done. This operation adheres
to the “ACID” (atomicity, consistency, isolation, and durability)
database principles, which ensure data integrity. ACID is most
frequently associated with single-database-server transactions,
whereas distributed transactions extend that guarantee to several
databases.
Distributed transactions are usually implemented with an
operation known as a "two-phase commit" (2PC). The XA
protocol, one implementation of the two-phase commit process,
is used in "XA transactions."
Fig 6: A distributed transaction spans multiple databases and
guarantees data integrity
How distributed transactions work
The processing requirements for distributed transactions are the
same as for regular database transactions, but they must be
managed across multiple resources, making them more difficult
to implement for database developers. The multiple resources
add more points of failure, such as the distinct software systems
that run the resources (e.g., the database software), the extra
hardware servers, and network issues. As a result, distributed
transactions are vulnerable to failures, necessitating the
implementation of safeguards to maintain data integrity.
Transaction managers organize the resources in order for a
distributed transaction to take place (either multiple databases or
multiple nodes of a single database). The transaction manager
can be one of the data repositories that will be changed as part of
the transaction, or it can be a completely different resource
responsible just for coordination. The transaction manager
determines whether a successful transaction should be
committed or a failed transaction should be rolled back, with the
latter leaving the database unchanged.
An application first sends a distributed-transaction request to the
transaction manager. The transaction manager then branches to
each resource, each of which will have
its own "resource manager" to aid in distributed transaction
participation. To protect against incomplete updates that may
occur when a failure occurs, distributed transactions are
frequently done in two phases. The first phase, known as the
"prepare-to-commit" phase, entails acknowledging an intention to
commit. After all resources have acknowledged, the transaction is
finished by asking them to run a final commit.
Q11) Explain commit protocol?
A11) In a local database system, the transaction manager just
needs to inform the recovery manager of the decision to commit
a transaction. In a distributed system, however, the transaction
manager should communicate and uniformly enforce the decision
to commit to all servers in the various sites where the transaction
is being executed. When a site finishes its processing, it enters
the partially committed state and waits for all other sites to
reach the same state. Once it receives the message that all sites
are ready, it begins to commit. In a distributed system, either all
sites commit or none of them do.
The various distributed commit protocols are as follows:
● One-phase commit
● Two-phase commit
● Three-phase commit
One phase commit
The simplest commit technique is distributed one-phase commit.
Consider a scenario in which the transaction is carried out on a
master site and a number of slave sites. The stages involved in a
distributed commit are as follows:
● Each slave sends a "DONE" notification to the controlling site
once it has completed its transaction locally.
● The slaves wait for the controlling site to send a "Commit" or
"Abort" message. This period of waiting is referred to as the
"window of vulnerability."
● The controlling site decides whether to commit or abort once
every slave has sent a "DONE" message; this is called the
commit point. The decision is then sent to all of the slaves.
● A slave either commits or aborts after receiving this message,
and then sends an acknowledgement message to the controlling
site.
Two phase commit
The vulnerability of one-phase commit protocols is reduced by
distributed two-phase commits. The steps performed in the two
phases are as follows −
Phase 1: Prepare Phase
● Each slave sends a "DONE" notification to the controlling site
once it has completed its transaction locally. When all slaves have
sent a "DONE" message to the controlling site, it sends a
"Prepare" message to the slaves.
● The slaves vote on whether they still want to commit or not. If
a slave wants to commit, it sends a “Ready” message.
● A slave that refuses to commit sends a "Not Ready" message.
This can happen when the slave has conflicting concurrent
transactions or a timeout occurs.
Phase 2: Commit/Abort Phase
After all of the slaves have sent "Ready" messages to the
controlling site,
● The slaves receive a "Global Commit" message from the
controlling site.
● The slaves complete the transaction and send the controlling
site a "Commit ACK" message.
● The transaction is considered committed when the controlling
site receives "Commit ACK" messages from all slaves.
After any slave sends the first "Not Ready" notification to the
controlling site,
● The slaves receive a "Global Abort" notification from the
controlling site.
● The slaves abort the transaction and send the controlling site
an "Abort ACK" message.
● The transaction is considered aborted when the controlling
site receives "Abort ACK" messages from all slaves.
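The two-phase message flow above can be simulated in memory (a minimal Python sketch; the Coordinator and Slave classes are illustrative assumptions, and real systems would also log each step and exchange the messages over a network):

```python
# Minimal in-memory simulation of distributed two-phase commit.

class Slave:
    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit  # this slave's vote
        self.state = "ACTIVE"

    def prepare(self):
        # Phase 1: vote "Ready" or "Not Ready".
        return "Ready" if self.will_commit else "Not Ready"

    def finish(self, decision):
        # Phase 2: apply the global decision and acknowledge it.
        self.state = decision  # "COMMITTED" or "ABORTED"
        return "Commit ACK" if decision == "COMMITTED" else "Abort ACK"

class Coordinator:
    def run(self, slaves):
        # Phase 1 (prepare): collect votes from all slaves.
        votes = [s.prepare() for s in slaves]
        # Phase 2 (commit/abort): commit only if every vote is "Ready".
        decision = ("COMMITTED" if all(v == "Ready" for v in votes)
                    else "ABORTED")
        acks = [s.finish(decision) for s in slaves]
        assert len(acks) == len(slaves)  # wait for every ACK
        return decision

assert Coordinator().run([Slave("s1"), Slave("s2")]) == "COMMITTED"
assert Coordinator().run(
    [Slave("s1"), Slave("s2", will_commit=False)]) == "ABORTED"
```

A single "Not Ready" vote is enough to abort the whole transaction, which is exactly the all-or-nothing property the protocol exists to guarantee.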
Three phase commit
The following are the steps in a distributed three-phase commit:
Phase 1: Prepare Phase
The methods are identical to those for a distributed two-phase
commit.
Phase 2: Prepare to Commit Phase
The controlling site broadcasts an "Enter Prepared State"
message. In response, the slave sites vote "OK."
Phase 3: Commit / Abort Phase
The methods are identical to those for a two-phase commit, with
the exception that no “Commit ACK”/ “Abort ACK” message is
necessary.
Q12) Describe concurrency control in a distributed
database?
A12) In this section, we'll look at how the concepts mentioned
above are implemented in a distributed database system.
Distributed Two phase Locking Algorithm
The underlying principle of distributed two-phase locking is
identical to that of traditional two-phase locking. In a distributed
system, however, designated sites act as lock managers. A lock
manager handles lock acquisition requests from transaction
managers. To ensure that lock managers at
different sites work together, at least one site is given the
authority to view all transactions and detect lock conflicts.
There are three types of distributed two-phase locking
approaches, depending on the number of locations that can
identify lock conflicts.
● Centralized two-phase locking - One site is designated as
the central lock manager in this method. The central lock
manager's location is known by all sites in the environment, and it
is used to obtain locks during transactions.
● Primary copy two-phase locking - A number of locations
are designated as lock control centers in this strategy. Each of
these locations is responsible for a specific set of locks. All of the
sites are aware of which lock control center is in charge of
whatever data table/fragment item's lock.
● Distributed two-phase locking - There are several lock
managers in this technique, each of which controls locks on data
items stored at its local site. The lock manager's position is
determined by data dissemination and replication.
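The centralized variant can be sketched as a single lock-manager object through which every site's requests pass (a minimal Python sketch; waiting queues, lock upgrades across transactions, and deadlock detection are deliberately omitted):

```python
# Minimal sketch of a centralized two-phase-locking lock manager.
# Every site sends its lock requests to this single manager.

class LockManager:
    def __init__(self):
        self.locks = {}  # item -> (mode, set of holder txn ids)

    def request(self, txn, item, mode):
        """Grant (True) or deny (False) a lock; mode is 'S' or 'X'."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        # Shared locks are compatible only with other shared locks.
        if mode == "S" and held_mode == "S":
            holders.add(txn)
            return True
        # A txn holding the item alone may re-request (or upgrade) it.
        if holders == {txn}:
            self.locks[item] = (max(mode, held_mode, key="SX".index),
                                holders)
            return True
        return False  # conflict: caller must wait or abort

    def release_all(self, txn):
        # Shrinking phase: the txn releases all its locks at once.
        for item in list(self.locks):
            _mode, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]

lm = LockManager()
assert lm.request("T1", "x", "S")
assert lm.request("T2", "x", "S")      # shared locks coexist
assert not lm.request("T3", "x", "X")  # exclusive lock conflicts
lm.release_all("T1"); lm.release_all("T2")
assert lm.request("T3", "x", "X")      # granted once holders release
```

In the primary-copy and fully distributed variants, the same `request`/`release_all` interface would simply be served by several such managers, each responsible for its own set of data items.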
Distributed Timestamp Concurrency Control
The physical clock reading determines the timestamp of any
transaction in a centralized system. However, because local
physical/logical clock readings are not globally unique, they
cannot be used as global timestamps in a distributed system. A
timestamp is made up of the site ID and the clock reading for that
site.
Each site has a scheduler that keeps a separate queue for each
transaction manager in order to apply timestamp ordering
techniques. A transaction manager sends a lock request to the
site's scheduler during the transaction. The scheduler assigns the
request to the appropriate queue in order of increasing
timestamp. Requests are processed from the front of the queues
in the order of their timestamps, i.e. the oldest first.
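Globally unique timestamps of this kind can be built as (clock reading, site ID) pairs, compared lexicographically (a minimal Python sketch; the generator interface is an illustrative assumption):

```python
import itertools

# A globally unique timestamp combines the local clock reading with
# the site ID: ties between sites are broken by the site ID, so no
# two transactions anywhere ever receive the same timestamp.

def make_generator(site_id):
    clock = itertools.count(1)  # logical clock local to this site
    def generate():
        return (next(clock), site_id)  # compared lexicographically
    return generate

site1, site2 = make_generator(1), make_generator(2)
t1 = site1()   # (1, 1)
t2 = site2()   # (1, 2) -- same clock reading, different site
t3 = site1()   # (2, 1)

assert t1 != t2      # globally unique despite equal clock readings
assert t1 < t2 < t3  # a total order exists: oldest first
# A scheduler serves queued requests in increasing timestamp order:
queue = sorted([t3, t2, t1])
assert queue[0] == t1
```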
Conflict Graphs
Creating conflict graphs is another option. Transaction classes
have been defined for this purpose. The read set and write set are
two sets of data elements in a transaction class. If the
transaction's read set is a subset of the class's read set and the
transaction's write set is a subset of the class's write set, the
transaction belongs to that class. Each transaction issues read
requests for the data items in its read set during the read phase.
Each transaction issues its own write requests during the write
phase.
A conflict graph is produced for the classes that active
transactions belong to. The graph contains vertical, horizontal,
and diagonal edges. A vertical edge connects two nodes within
the same class and indicates an intra-class conflict. A horizontal
edge joins two nodes from different classes and indicates a
write-write conflict between them. A diagonal edge joins two
nodes from different classes and indicates a write-read or
read-write conflict between them.
The conflict graphs are examined to see if two transactions from
the same class or from two distinct classes can be executed
simultaneously.
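The edge classification can be sketched as a small function over transaction classes represented by their read and write sets (a minimal Python sketch; the class representation is an illustrative assumption):

```python
# Classify the conflict-graph edge between two transaction classes.
# Each class is (name, read_set, write_set).

def edge_type(c1, c2):
    name1, r1, w1 = c1
    name2, r2, w2 = c2
    if name1 == name2:
        return "vertical"    # conflict within the same class
    if w1 & w2:
        return "horizontal"  # write-write conflict across classes
    if (r1 & w2) or (w1 & r2):
        return "diagonal"    # read-write or write-read conflict
    return None              # no conflict: may run simultaneously

c_a = ("A", {"x", "y"}, {"x"})
c_b = ("B", {"z"}, {"x"})   # also writes x -> write-write with A
c_c = ("C", {"x"}, {"q"})   # reads x that A writes -> diagonal

assert edge_type(c_a, c_a) == "vertical"
assert edge_type(c_a, c_b) == "horizontal"
assert edge_type(c_a, c_c) == "diagonal"
```

Pairs of classes for which `edge_type` returns `None` are exactly the ones whose transactions can be executed simultaneously.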
Q13) Write the features of a distributed database?
A13) Features
● It's used to create, retrieve, update, and destroy databases
that are distributed.
● It synchronizes the database on a regular basis and provides
access mechanisms, making the distribution transparent to the
users.
● It ensures that data modified on any site is updated across
the board.
● It's utilized in applications where a lot of data is processed
and accessible by a lot of people at the same time.
● It's made to work with a variety of database platforms.
● It protects the databases' confidentiality and data integrity.
Q14) Write the advantages of data replication?
A14) Advantages of data replication
Data replication is generally performed to:
● Provide a consistent copy of data across all the database
nodes.
● Increase the availability of data.
● Increase the reliability of data.
● Support multiple users and give high performance.
● Remove data redundancy: the databases are merged, and
slave databases holding outdated or incomplete data are
updated.
● Reduce data movement: since replicas are created, the data
a transaction needs is often found at the site where the
transaction is executing.
● Perform faster execution of queries.
Q15) Write the disadvantages of data replication?
A15) Disadvantages of data replication
● More storage space is needed, as storing replicas of the same
data at different sites consumes more space.
● Data Replication becomes expensive when the replicas at all
different sites need to be updated.
● Maintaining Data consistency at all different sites involves
complex measures.
Q16) Write the disadvantages of fragmentations?
A16) Disadvantages of fragmentation
● When data from different fragments are required, the access
speeds may be very low.
● In case of recursive fragmentations, the job of reconstruction
will need expensive techniques.
● Lack of back-up copies of data in different sites may render
the database ineffective in case of failure of a site.