
Database Design Guidelines and Anomalies

The document outlines informal design guidelines for database schema, emphasizing the importance of semantics, redundancy reduction, null value management, and avoiding spurious tuples. It explains update anomalies, including insertion, deletion, and modification issues that arise from poor schema design. Additionally, it covers functional dependencies, normalization processes, and the definitions of normal forms to ensure efficient database design and integrity.

Uploaded by

sunidhinaik02

DATABASE MANAGEMENT SYSTEM BCS403

Explain the informal design guidelines of a database(10M)


There are four informal measures of quality for relation schema design as follows:
➢ Semantics of the relation attributes
➢ Reducing the redundant values in tuples
➢ Reducing the null values in tuples
➢ Disallowing the possibility of generating spurious tuples (unwanted)

1. Semantics of the relation attributes:

• The semantics of a relation refers to its meaning resulting from the interpretation of attribute values in a
tuple
• Whenever we group attributes to form a relation schema, we assume that attributes belonging to one
relation have certain real-world meaning and a proper interpretation associated with them
• The easier it is to explain the semantics of the relation, the better the relation schema design will be.
Guideline 1
Design a relation schema such that it is easy to explain its meaning. Do not combine attributes from multiple entity
types and relationship types into a single relation.

• Both the EMP_DEPT and EMP_PROJ relation schemas have clear semantics.

• A tuple in the EMP_DEPT relation schema represents a single employee but also includes additional
information: the name (Dname) of the department for which the employee works and the Social
Security number (Dmgr_ssn) of the department manager.
• A tuple in EMP_PROJ relates an employee to a project but also includes the employee name
(Ename), project name (Pname), and project location (Plocation).
• Both schemas are logically correct, but they violate Guideline 1 by mixing attributes from distinct
real-world entities:
• EMP_DEPT mixes attributes of employees and departments
• EMP_PROJ mixes attributes of employees and projects and the WORKS_ON relationship
2. Redundant Information in Tuples and Update Anomalies
• The goals of schema design are to reduce redundant values in tuples, minimize the storage space used
by the base relations, and avoid update anomalies.
• Grouping attributes into relation schemas has a significant effect on storage space
• For example, compare the space used by the two base relations EMPLOYEE and DEPARTMENT in
Figure A with that for an EMP_DEPT base relation in Figure B.
• In EMP_DEPT, the attribute values pertaining to a particular department (Dnumber, Dname,
Dmgr_ssn) are repeated for every employee who works for that department.

Smithashree K, Asst Professor, Dept of CSE, MITK 1



EMPLOYEE
SSN   NAME     ADDRESS   BDATE   DNUMBER
111   John                       1
222   Ram                        1
333   Sita                       2
444   Kishan                     2
555   Mary                       3

DEPARTMENT
DNAME      DNUMBER   MGR_SSN
Research   1         101
Admin      2         104
Headqtrs   3         105
Figure A (ADDRESS and BDATE values are not shown)

EMP_DEPT
SSN   NAME     ADDRESS   DNUMBER   DNAME      MGR_SSN
111   John               1         Research   101
222   Ram                1         Research   101
333   Sita               2         Admin      104
444   Kishan             2         Admin      104
555   Mary               3         Headqtrs   105
Figure B (ADDRESS values are not shown)
In Figure A, each department's information appears only once, in the DEPARTMENT relation. Only Dnumber
is repeated in the EMPLOYEE relation, as a foreign key, for each employee who works in that department.
Another serious problem with using the relation in Figure B as a base relation is update anomalies.
Explain the different update anomalies of tables? (5 marks)
Update anomalies can be classified into:
❖ Insertion anomalies
❖ Deletion anomalies
❖ Modification anomalies
Insertion anomalies
Insertion anomalies are of two types:
1. To insert a new employee into EMP_DEPT, we must include either the attribute values for the
department that the employee works for, or nulls (if the employee does not yet work for a department).
o Example: To insert a new tuple for an employee who works in department number 5, the
attribute values of department 5 must be entered correctly so that they are consistent with the
values for department 5 in other tuples in EMP_DEPT.
o In Figure A we need not worry about this, because we enter only the department number in the
employee tuple; the other attribute values of department 5 are recorded only once in the database,
as a single tuple in the DEPARTMENT relation.
2. It is difficult to insert a new department that has no employees as yet into the EMP_DEPT relation.
The only way to do this is to place null values in the attributes for the employee. This causes a problem
because SSN is the primary key of EMP_DEPT and cannot be null, and each tuple is supposed to represent
an employee entity.
o This problem does not occur in the design of Figure A because a department is entered in the
DEPARTMENT relation whether or not any employees work for it, and whenever an employee
is assigned to that department, a corresponding tuple is inserted in EMPLOYEE.
Deletion Anomalies
o If we delete from EMP_DEPT an employee tuple that happens to represent the last employee working
for a particular department, the information concerning that department is lost from the database
o This problem does not occur in the database of Figure A because DEPARTMENT tuples are stored
separately.
Modification anomalies
o In EMP_DEPT, if we change the value of one of the attributes of a particular department (for example, its
manager), we must update the tuples of all employees who work in that department; otherwise, the database
will become inconsistent.


o If we fail to update some tuples, the same department will be shown to have two different values, which
would be wrong.
Guideline 2:
o Design a base relation schema such that no insertion, deletion, or modification anomalies are present
in the relations. If any anomalies are present, note them and make sure that the programs that update
the database will operate correctly.
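The modification and deletion anomalies can be reproduced on the EMP_DEPT rows of Figure B. The following is a minimal Python sketch, not part of the original notes; the helper `managers_of` and the replacement manager number 999 are made up for illustration.

```python
# Rows of EMP_DEPT as in Figure B: (ssn, name, dnumber, dname, mgr_ssn).
emp_dept = [
    (111, "John", 1, "Research", 101),
    (222, "Ram", 1, "Research", 101),
    (333, "Sita", 2, "Admin", 104),
    (444, "Kishan", 2, "Admin", 104),
    (555, "Mary", 3, "Headqtrs", 105),
]

def managers_of(rows, dnumber):
    """Return the set of manager SSNs recorded for one department."""
    return {mgr for (_, _, dno, _, mgr) in rows if dno == dnumber}

# Modification anomaly: department 1 gets a new manager (999, hypothetical),
# but only John's tuple is updated.
partially_updated = [
    (ssn, name, dno, dname, 999 if ssn == 111 else mgr)
    for (ssn, name, dno, dname, mgr) in emp_dept
]
inconsistent = managers_of(partially_updated, 1)  # two managers for one department

# Deletion anomaly: deleting Mary, the last employee of department 3,
# removes every trace of that department.
after_delete = [row for row in emp_dept if row[0] != 555]
departments_left = {row[2] for row in after_delete}
```

After the partial update, department 1 is recorded with two different managers, and after the deletion department 3 no longer appears anywhere in the relation.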
3. Null values in tuples
o If many of the attributes do not apply to all tuples in the relation, we end up with many NULLs in
those tuples. This
- can waste space at the storage level
- may lead to problems with understanding the meaning of the attributes
- may also lead to problems with specifying JOIN operations
- raises the question of how to account for NULLs when aggregate operations such as COUNT or SUM are
applied
o Nulls can have multiple interpretations, such as the following:
• The attribute does not apply to this tuple.
• The attribute value for this tuple is unknown.
• The value is known but absent; that is, it has not been recorded yet.
Guideline #3:
As far as possible, avoid placing attributes in a base relation whose values may frequently be null. If nulls
are unavoidable, make sure that they apply in exceptional cases only and do not apply to a majority of tuples
in the relation.
4. Generation of spurious (fake) tuples
Guideline #4: Design relation schemas so that they can be joined with equality conditions on attributes
that are either primary keys or foreign keys in a way that guarantees that no spurious tuples are generated.

As shown in Figure 1, suppose we use EMP_PROJ1 and EMP_LOCS as the base relations instead of EMP_PROJ.

Figure 1
If we attempt a NATURAL JOIN operation on EMP_PROJ1 and EMP_LOCS, the result contains many more tuples
than the original set of tuples in EMP_PROJ. The additional tuples that were not in EMP_PROJ are called spurious
tuples because they represent information that is not valid.
➢ Decomposing EMP_PROJ into EMP_PROJ1 and EMP_LOCS is undesirable because, when we join them
back using natural join, we do not get the correct original information.
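A small Python sketch of how such spurious tuples arise is given here. Figure 1 is not reproduced in these notes, so the two EMP_PROJ rows are invented for illustration, and the attribute layout (Ssn, Pnumber, Hours, Ename, Pname, Plocation) with EMP_LOCS(Ename, Plocation) is an assumption.

```python
# Hypothetical EMP_PROJ state: (ssn, pnumber, hours, ename, pname, plocation).
emp_proj = {
    (111, 1, 20, "John", "ProductX", "Bellaire"),
    (222, 2, 10, "Ram", "ProductY", "Bellaire"),
}

# The two projections used as base relations.
emp_locs = {(ename, ploc) for (_, _, _, ename, _, ploc) in emp_proj}
emp_proj1 = {(ssn, pno, hrs, pname, ploc)
             for (ssn, pno, hrs, _, pname, ploc) in emp_proj}

# NATURAL JOIN on the only common attribute, Plocation.
joined = {
    (ssn, pno, hrs, ename, pname, ploc)
    for (ssn, pno, hrs, pname, ploc) in emp_proj1
    for (ename, loc) in emp_locs
    if loc == ploc
}

# Tuples produced by the join that were never in EMP_PROJ.
spurious = joined - emp_proj
```

Because Plocation is neither a key nor a foreign key of either projection, every employee at a location is paired with every project at that location, so the join contains the original tuples plus invalid ones.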


What is a Functional Dependency?


A Functional Dependency (FD) is a constraint between two sets of attributes from the database. FDs are used to
specify formal measures of the "goodness" of relational designs. FDs are derived from the real-world constraints
on the attributes.
Definition:
A functional dependency, denoted by X → Y, between two sets of attributes X and Y that are subsets of
R specifies a constraint on the possible tuples that can form a relation state r of R.
The constraint is that for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must also have t1[Y]
= t2[Y].

This means that the values of the Y component of a tuple in r depend on, or are determined by, the values of the X
component.
The values of the X component of a tuple uniquely determine the values of the Y component; we therefore say that
there is a functional dependency from X to Y, or that Y is functionally dependent on X.

An FD X → Y is a full functional dependency if removal of any attribute from X means that the dependency does
not hold any more; otherwise, it is a partial functional dependency.
Example:
a) fd1: SSN → Ename
b) fd2: Pnumber → {Pname, Plocation}
c) fd3: {SSN, Pnumber} → Hours
These functional dependencies specify that
(a) The value of an employee's SSN uniquely determines the employee name (Ename).
(b) The value of a project's Pnumber uniquely determines the Pname and Plocation.
(c) A combination of SSN and Pnumber values uniquely determines the number of hours the employee currently
works on the project per week.
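The definition can be turned directly into a check on a relation state. The sketch below is illustrative only: `fd_holds` is a hypothetical helper and the three rows are made-up sample data. Note that a single state can only refute an FD; an FD is a property of the schema and must hold in every legal state.

```python
def fd_holds(rows, lhs, rhs):
    """Check X -> Y on one relation state given as a list of dicts.

    Two tuples that agree on the attributes in lhs must agree on rhs.
    """
    seen = {}
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        # Remember the first Y value seen for this X value and compare later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Made-up sample state for illustration.
works = [
    {"Ssn": 111, "Pnumber": 1, "Ename": "John", "Hours": 20},
    {"Ssn": 111, "Pnumber": 2, "Ename": "John", "Hours": 10},
    {"Ssn": 222, "Pnumber": 1, "Ename": "Ram", "Hours": 15},
]

ssn_to_ename = fd_holds(works, ["Ssn"], ["Ename"])
ssn_to_hours = fd_holds(works, ["Ssn"], ["Hours"])
key_to_hours = fd_holds(works, ["Ssn", "Pnumber"], ["Hours"])
```

In this state SSN → Ename and {SSN, Pnumber} → Hours are not violated, while SSN → Hours is, which is why Hours is fully dependent on the combined key.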

Normalization Based On Primary Key


The study of normalization covers the following:
• The elimination of problems associated with redundant data.
• The identification of various types of update anomalies such as insertion, deletion, and modification anomalies.
• How to recognize the appropriateness or quality of the design of relations.
• The concept of functional dependency, the main tool for measuring the appropriateness of attribute groupings
in relations.
• How functional dependencies can be used to group attributes into relations that are in a known normal form.


Normalization of Relations
Normalization is a process of analysing the given relation schemas based on their FDs and primary keys to achieve
the desirable properties, i.e.,
1. Minimizing redundancy
2. Minimizing the insertion, deletion and update anomalies

The normalization procedure provides the database designer with the following:
• A formal framework for analyzing relation schemas based on their keys and on the functional dependencies
among their attributes.
• A series of normal form tests that can be carried out on individual relation schemas so that the relational
database can be normalized to any desired degree.

The process of normalization through decomposition must also confirm the existence of additional properties that
the relational schemas, taken together, should possess. These include two properties:

1. The lossless join or non-additive join property, which guarantees that the spurious tuple generation problem
does not occur with respect to the relation schemas created after decomposition.
2. The dependency preservation property, which ensures that each FD is represented in some individual relation
resulting after decomposition.
The database designer need not normalize to the highest possible normal form; relations may be left in a lower
normalization state. The process of storing the join of higher normal form relations as a base relation is known
as denormalization.

Definition:
A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S ⊆ R with the property that no two
distinct tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]. A key K is a superkey with the
additional property that removal of any attribute from K will cause K not to be a superkey any more.
The difference between a key and a superkey is that a key has to be minimal; that is, if we have a key K = {A1,
A2, ..., Ak} of R, then K - {Ai} is not a key of R for any Ai, 1 ≤ i ≤ k.

For example {SSN} is a key for EMPLOYEE, whereas {SSN}, {SSN, ENAME}, {SSN, ENAME, BDATE},
and any set of attributes that includes SSN are all superkeys.

If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily
designated to be the primary key, and the others are called secondary keys or alternate keys.

Definition:
An attribute of relation schema R is called a prime attribute of R if it is a member of some candidate key of R.
An attribute is called nonprime if it is not a prime attribute-that is, if it is not a member of any candidate key.
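Superkeys and candidate keys can be computed from a set of FDs using attribute closure. The sketch below is one brute-force way to do it, not a method from these notes; the function names and the small EMPLOYEE schema with the single FD SSN → {Ename, Bdate} are assumptions made for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            # A superkey is a key only if it contains no smaller key.
            if closure(s, fds) == schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys

schema = {"Ssn", "Ename", "Bdate"}
fds = [({"Ssn"}, {"Ename", "Bdate"})]  # assumed FD for this illustration

keys = candidate_keys(schema, fds)
is_superkey = closure({"Ssn", "Ename"}, fds) == schema
prime_attributes = set().union(*keys)
```

Here {SSN, Ename} is a superkey but not a key, because it still determines all attributes after Ename is removed. Ssn is the only prime attribute; Ename and Bdate are nonprime.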

First Normal Form (1 NF)


1NF disallows multivalued attributes, composite attributes, and their combinations. It states that the domain of an
attribute must include only atomic values and that the value of any attribute in a tuple must be a single value
from the domain of that attribute.
1NF disallows "relations within relations" or "relations as attribute values within tuples."
The only attribute values permitted by 1NF are single atomic values.
Ex: Consider the DEPARTMENT relation schema, whose primary key is DNUMBER.


Fig 5.4 Normalization into 1NF. (a)A relation schema that is not in 1NF. (b) Example state of relation
DEPARTMENT. (c) 1NF version of same relation with redundancy.

Each department can have any number of locations, so the relation does not satisfy 1NF because DLOCATIONS
is not an atomic attribute.
There are three main techniques to achieve first normal form for such a relation:
1. Remove the attribute DLOCATIONS that violates 1NF and place it in a separate relation DEPT_LOCATION
along with the primary key DNUMBER of DEPARTMENT. The primary key of DEPT_LOCATION is the
combination {DNUMBER, DLOCATION}.
2. Expand the key so that there will be a separate tuple in the original DEPARTMENT relation for each location of a
DEPARTMENT. The primary key becomes the combination {DNUMBER, DLOCATION}, as shown in Figure
5.4(c). This solution has the disadvantage of introducing redundancy in the relation.
3. If a maximum number of values is known for the attribute (for example, if it is known that at most three
locations can exist for a department), replace the DLOCATIONS attribute by three atomic attributes:
DLOCATION1, DLOCATION2, and DLOCATION3. This solution has the disadvantage of introducing null
values if most departments have fewer than three locations.
Among the three solutions above, the first is generally considered best because it does not suffer from redundancy.
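The first technique can be sketched in a few lines of Python. Figure 5.4 is not reproduced here, so the location values below are invented; the department names and numbers follow Figure A.

```python
# DEPARTMENT state that is not in 1NF: Dlocations holds a set of values.
# Location values are hypothetical.
department = [
    {"Dnumber": 1, "Dname": "Research", "Dlocations": ["Bellaire", "Houston"]},
    {"Dnumber": 2, "Dname": "Admin", "Dlocations": ["Stafford"]},
]

# DEPARTMENT without the offending attribute; the primary key is still Dnumber.
department_1nf = [
    {"Dnumber": d["Dnumber"], "Dname": d["Dname"]} for d in department
]

# DEPT_LOCATION: one tuple per (Dnumber, Dlocation) pair; the pair is the key.
dept_location = [
    (d["Dnumber"], loc) for d in department for loc in d["Dlocations"]
]
```

Each department's name is stored once in `department_1nf`, and every attribute value in both relations is atomic.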

Second Normal Form (2 NF)


Second normal form (2NF) is based on the concept of full functional dependency.
Definition: A relation schema R is said to be in 2NF if it satisfies 1 NF and every nonprime attribute A in R is
fully functionally dependent on the primary key of R.

In Figure 5.6, {SSN, PNUMBER} → HOURS is a full dependency because HOURS depends on the whole primary key
{SSN, PNUMBER}; neither SSN → HOURS nor PNUMBER → HOURS holds. However, the dependencies
{SSN, PNUMBER} → ENAME and {SSN, PNUMBER} → PNAME are partial because ENAME and PNAME each depend
on only part of the primary key (SSN → ENAME and PNUMBER → PNAME hold).

Figure 5.6 Normalizing EMP_PROJ into 2NF relations.

➢ The EMP_PROJ relation in Figure 5.6 is in 1NF but is not in 2NF.
➢ The nonprime attribute ENAME violates 2NF because of FD2, as do the nonprime attributes PNAME and
PLOCATION because of FD3.
➢ The functional dependencies FD2 and FD3 make ENAME, PNAME, and PLOCATION partially
dependent on the primary key {SSN, PNUMBER} of EMP_PROJ, thus violating 2NF.
➢ The functional dependencies FD1, FD2, and FD3 in Figure 5.6 hence lead to the decomposition of
EMP_PROJ into the three relation schemas EP1, EP2, and EP3 shown in Figure 5.6, each of which is in
2NF.
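The 2NF test for EMP_PROJ can be sketched as a search for FDs whose left-hand side is a proper subset of the primary key. This is a simplified illustration, not a complete algorithm: `partial_dependencies` is a hypothetical helper, it assumes a single candidate key, and it inspects only the FDs listed explicitly rather than everything they imply.

```python
def partial_dependencies(key, fds):
    """Return (lhs, nonprime attributes) for FDs that break 2NF.

    An FD breaks 2NF when its left side is a proper subset of the key
    and its right side contains attributes outside the key.
    """
    key = set(key)
    found = []
    for lhs, rhs in fds:
        nonprime = set(rhs) - key
        if set(lhs) < key and nonprime:
            found.append((set(lhs), nonprime))
    return found

key = {"Ssn", "Pnumber"}
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),        # FD1: full dependency
    ({"Ssn"}, {"Ename"}),                   # FD2: partial
    ({"Pnumber"}, {"Pname", "Plocation"}),  # FD3: partial
]

violations = partial_dependencies(key, fds)
```

Each violating FD becomes its own relation (EP2 and EP3), and the remaining attributes stay with the full key (EP1).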

Third Normal Form (3 NF):


Third normal form (3NF) is based on the concept of transitive dependency.
Definition: A relation schema R is said to be in 3NF if it satisfies 2NF and no nonprime attribute of R is transitively
dependent on the primary key.

Figure 5.7 Normalizing EMP_DEPT into 3NF relations.

➢ Example: The dependency SSN → DMGRSSN is transitive through DNUMBER in relation EMP_DEPT
of Figure 5.7 because both the dependencies SSN → DNUMBER and DNUMBER → DMGRSSN hold
and DNUMBER is neither a key itself nor a subset of the key of EMP_DEPT.
➢ EMP_DEPT is in 2NF, since no partial dependencies on the key exist. However, it is not in 3NF because
of the transitive dependency of DMGRSSN and also DNAME on SSN via DNUMBER.
➢ We can normalize EMP_DEPT by decomposing it into the two 3NF relation schemas ED1 and ED2 shown
in Figure 5.7. A NATURAL JOIN operation on ED1 and ED2 will recover the original relation EMP_DEPT
without generating spurious tuples.
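The lossless property of this decomposition can be checked on the EMP_DEPT rows of Figure B. The sketch below assumes ED1 holds (SSN, NAME, DNUMBER) and ED2 holds (DNUMBER, DNAME, MGR_SSN); Figure 5.7 itself is not reproduced in these notes.

```python
# EMP_DEPT rows from Figure B: (ssn, name, dnumber, dname, mgr_ssn).
emp_dept = {
    (111, "John", 1, "Research", 101),
    (222, "Ram", 1, "Research", 101),
    (333, "Sita", 2, "Admin", 104),
    (444, "Kishan", 2, "Admin", 104),
    (555, "Mary", 3, "Headqtrs", 105),
}

# Project onto the two 3NF schemas.
ed1 = {(ssn, name, dno) for (ssn, name, dno, _, _) in emp_dept}
ed2 = {(dno, dname, mgr) for (_, _, dno, dname, mgr) in emp_dept}

# NATURAL JOIN on Dnumber, which is the key of ED2 and a foreign key in ED1.
rejoined = {
    (ssn, name, dno, dname, mgr)
    for (ssn, name, dno) in ed1
    for (d, dname, mgr) in ed2
    if d == dno
}
```

The join gives back exactly the original five tuples, and each department's name and manager are now stored once in `ed2` instead of once per employee.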

General Definition of Second Normal Form


➢ Definition: A relation schema R is in second normal form (2NF) if every nonprime attribute A in R is not
partially dependent on any key of R.
➢ The test for 2NF involves testing for functional dependencies whose left-hand side attributes are part of
the primary key.
➢ Example: Consider the relation schema LOTS shown in Figure 5.8 (a), which describes land for sale in
various counties of a state. Suppose that there are two candidate keys: PROPERTY_ID# and
{COUNTY_NAME, LOT#}; that is, lot numbers are unique only within each county, but PROPERTY_ID
numbers are unique across counties for the entire state.

Based on the two candidate keys PROPERTY_ID# and {COUNTY_NAME, LOT#}, the functional dependencies
FD1 and FD2 of Figure 5.8(a) hold, and neither violates 2NF.


FIGURE 5.8 Normalization into 2NF. (a) The LOTS relation with its functional dependencies FD1 through FD4. (b)
Decomposing into the 2NF relations LOTS1 and LOTS2.

The LOTS relation schema violates 2NF because of FD3: TAX_RATE is partially dependent on the candidate key
{COUNTY_NAME, LOT#}. To normalize LOTS into 2NF, we decompose it into the two relations LOTS1 and
LOTS2, shown in Figure 5.8(b).
Construct LOTS1 by removing the attribute TAX_RATE that violates 2NF from LOTS and placing it with
COUNTY_NAME into another relation LOTS2. Both LOTS1 and LOTS2 are in 2NF. Notice that FD4 does not
violate 2NF and is carried over to LOTS1.

General Definition of Third Normal Form


Definition: A relation schema R is said to be in third normal form (3NF) if it satisfies 2NF and whenever a
nontrivial functional dependency X→ A holds in R, either (a) X is a super key of R, or (b) A is a prime attribute
of R.

FIGURE 5.9 Normalization into 3NF. (c) Decomposing LOTS1 into the 3NF relations LOTS1A and LOTS1B.
(d) Summary of the progressive normalization of LOTS.


As shown in Figure 5.9(b), LOTS2 is in 3NF. However, FD4 in LOTS1 violates 3NF because AREA is not a
superkey and PRICE is not a prime attribute in LOTS1. To normalize LOTS1 into 3NF, we decompose it into the
relation schemas LOTS1A and LOTS1B shown in Figure 5.9(c).
We construct LOTS1A by removing the attribute PRICE that violates 3NF from LOTS1 and placing it with AREA
(the left-hand side of FD4 that causes the transitive dependency) into another relation LOTS1B. Both LOTS1A and
LOTS1B are in 3NF.

BOYCE-CODD Normal Form (BCNF)


Boyce-Codd Normal Form (BCNF) was proposed as a simpler form of 3NF, but it was found to be stricter than
3NF. That is, every relation in BCNF is also in 3NF; however, a relation in 3NF is not necessarily in BCNF.
Definition: A relation schema R is in BCNF if whenever a nontrivial functional dependency X → A holds in R,
then X is a super key of R.
The only difference between the definitions of BCNF and 3NF is that 3NF allows a nontrivial dependency X → A
if either a) X is a super key of R, or b) A is a prime attribute of R,
whereas BCNF drops condition (b): X must always be a super key.

FIGURE 5.10 Boyce-Codd normal form. (a) BCNF normalization of LOTS1A with the functional dependency FD2 being
lost in the decomposition. (b) A schematic relation with FDs; it is in 3NF, but not in BCNF.

As shown in Figure 5.10(a), FD5 (AREA → COUNTY_NAME) violates BCNF in LOTS1A because AREA is not a super
key of LOTS1A. FD5 does satisfy 3NF in LOTS1A because COUNTY_NAME is a prime attribute (condition (b)), but
this condition does not exist in the definition of BCNF.
We can decompose LOTS1A into two BCNF relations LOTS1AX and LOTS1AY as shown in Figure 5.10(a).
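The difference between the 3NF and BCNF tests can be sketched with attribute closure. The attribute list of LOTS1A and the exact form of FD1, FD2 and FD5 below are assumptions, since Figure 5.10 is not reproduced in these notes; `closure` and `is_superkey` are illustrative helpers.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Assumed schema and FDs of LOTS1A.
schema = {"Property_id", "County_name", "Lot", "Area"}
fds = [
    ({"Property_id"}, {"County_name", "Lot", "Area"}),  # FD1
    ({"County_name", "Lot"}, {"Property_id", "Area"}),  # FD2
    ({"Area"}, {"County_name"}),                        # FD5
]

# Prime attributes taken from the two candidate keys named in the text.
named_keys = [{"Property_id"}, {"County_name", "Lot"}]
prime = set().union(*named_keys)

def is_superkey(x):
    return closure(x, fds) == schema

# BCNF: every nontrivial FD must have a superkey on the left.
bcnf_violations = [(x, y) for x, y in fds if not y <= x and not is_superkey(x)]
# 3NF additionally tolerates FDs whose right side is made of prime attributes.
nf3_violations = [(x, y) for x, y in bcnf_violations if not (y - x) <= prime]
```

FD5 is reported as a BCNF violation but not as a 3NF violation, which matches the discussion above.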

Multivalued Dependency and Fourth Normal Form


For example, consider the relation EMP shown in Figure below:

➢ A tuple in this EMP relation represents the fact that an employee whose name is Ename works on the
project whose name is Pname and has a dependent whose name is Dname.
➢ An employee may work on several projects and may have several dependents and the employee’s projects
and dependents are independent of one another.
➢ To keep the relation state consistent and to avoid any spurious relationship between the two independent
attributes, we must have a separate tuple to represent every combination of an employee’s dependent and
an employee’s project.


➢ In the relation state shown in EMP the employee with Ename Smith works on two projects ‘X’ and ‘Y’
and has two dependents ‘John’ and ‘Anna’, and therefore there are four tuples to represent these facts
together.
➢ The relation EMP is an all-key relation (its key is made up of all attributes); it therefore has no nontrivial
FDs and qualifies as a BCNF relation.
➢ There is redundancy in the relation EMP: the dependent information is repeated for every project and
the project information is repeated for every dependent.
➢ To address this situation, the concept of multivalued dependency(MVD) was proposed and based on this
dependency, the fourth normal form was defined.
➢ Multivalued dependencies are a consequence of 1NF which disallows an attribute in a tuple to have a set
of values, and the accompanying process of converting an unnormalized relation into 1NF.
➢ Informally, whenever two independent 1:N relationships are mixed in the same relation, R(A, B, C), an
MVD may arise.

Formal Definition of Multivalued Dependency


A multivalued dependency X →→ Y specified on relation schema R, where X and Y are both subsets of R,
specifies the following constraint on any relation state r of R:
If two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with
the following properties, where we use Z to denote (R − (X ∪ Y)):
■ t3[X] = t4[X] = t1[X] = t2[X]
■ t3[Y] = t1[Y] and t4[Y] = t2[Y]
■ t3[Z] = t2[Z] and t4[Z] = t1[Z]
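The three conditions can be checked mechanically on a relation state. The sketch below uses the EMP state described earlier (Smith, projects X and Y, dependents John and Anna); `mvd_holds` is a hypothetical helper written for this illustration.

```python
def mvd_holds(rows, x, y, schema):
    """Check X ->> Y on a relation state given as a set of tuples.

    schema lists the attribute names in tuple order; Z is everything
    that is in neither X nor Y.
    """
    z = [a for a in schema if a not in x and a not in y]
    idx = {a: i for i, a in enumerate(schema)}

    def proj(t, attrs):
        return tuple(t[idx[a]] for a in attrs)

    state = {(proj(t, x), proj(t, y), proj(t, z)) for t in rows}
    for (x1, y1, _) in state:
        for (x2, _, z2) in state:
            # t3 must combine the Y value of t1 with the Z value of t2.
            # t4 is covered when the roles of t1 and t2 are swapped.
            if x1 == x2 and (x1, y1, z2) not in state:
                return False
    return True

schema = ("Ename", "Pname", "Dname")
emp = {
    ("Smith", "X", "John"),
    ("Smith", "Y", "Anna"),
    ("Smith", "X", "Anna"),
    ("Smith", "Y", "John"),
}

holds_full = mvd_holds(emp, ["Ename"], ["Pname"], schema)
holds_missing = mvd_holds(emp - {("Smith", "Y", "John")}, ["Ename"], ["Pname"], schema)
```

With all four tuples present, Ename →→ Pname holds; removing any one of them breaks the dependency, which is why every project/dependent combination has to be stored.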


➢ We now present the definition of fourth normal form (4NF), which is violated when a relation has
undesirable multivalued dependencies, and hence can be used to identify and decompose such relations

Definition. A relation schema R is in 4NF with respect to a set of dependencies F (that includes functional
dependencies and multivalued dependencies) if, for every nontrivial multivalued dependency X →→ Y in F+, X
is a superkey for R.

The process of normalizing a relation involving the nontrivial MVDs that is not in 4NF consists of decomposing it
so that each MVD is represented by a separate relation where it becomes a trivial MVD.

➢ We decompose EMP into EMP_PROJECTS and EMP_DEPENDENTS. Both EMP_PROJECTS and
EMP_DEPENDENTS are in 4NF, because the MVDs Ename →→ Pname in EMP_PROJECTS and Ename →→ Dname in
EMP_DEPENDENTS are trivial MVDs.

➢ No other nontrivial MVDs hold in either EMP_PROJECTS or EMP_DEPENDENTS. No FDs hold in these
relation schemas either.


Join Dependencies and Fifth Normal Form


CHAPTER 6: SQL


CREATE command
An SQL schema is identified by a schema name, and includes an authorization identifier to indicate the user
or account who owns the schema, as well as descriptors for each element in the schema.
Schema elements include tables, constraints, views, domains, and other constructs that describe the schema.
A schema is created via the CREATE SCHEMA statement, which can include all the schema elements'
definitions.

For example, the following statement creates a schema called COMPANY, owned by the user with
authorization identifier 'MKUMAR'.
CREATE SCHEMA COMPANY AUTHORIZATION 'MKUMAR';
CREATE TABLE Command:
• The CREATE TABLE command is used to specify a new relation by giving it a name and specifying its
attributes and initial constraints. The attributes are specified first, and each attribute is given a name, a data
type to specify its domain of values, and any attribute constraints, such as NOT NULL.
• Alternatively, we can explicitly attach the schema name to the relation name, separated by a period. For
example,
CREATE TABLE COMPANY.EMPLOYEE ...
rather than
CREATE TABLE EMPLOYEE
( Fname VARCHAR(15) NOT NULL,
  Minit CHAR,
  Lname VARCHAR(15) NOT NULL,
  Ssn CHAR(9) NOT NULL,
  Bdate DATE,
  Address VARCHAR(30),
  Sex CHAR,
  Salary DECIMAL(10,2),
  Super_ssn CHAR(9),
  Dno INT NOT NULL,
  PRIMARY KEY (Ssn),
  FOREIGN KEY (Super_ssn) REFERENCES EMPLOYEE(Ssn),
  FOREIGN KEY (Dno) REFERENCES DEPARTMENT(Dnumber) );
The relations declared through CREATE TABLE statements are called base tables (or base relations); this
means that the relation and its tuples are actually created and stored as a file by the DBMS.
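To try such declarations without a database server, they can be run against an in-memory SQLite database from Python. This is only a sketch: SQLite is a stand-in chosen for convenience, it does not support CREATE SCHEMA, and the DEPARTMENT table is cut down to the two columns needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE DEPARTMENT (
    Dname   VARCHAR(15) NOT NULL,
    Dnumber INT         NOT NULL,
    PRIMARY KEY (Dnumber)
);
CREATE TABLE EMPLOYEE (
    Fname     VARCHAR(15) NOT NULL,
    Lname     VARCHAR(15) NOT NULL,
    Ssn       CHAR(9)     NOT NULL,
    Salary    DECIMAL(10,2),
    Super_ssn CHAR(9),
    Dno       INT         NOT NULL,
    PRIMARY KEY (Ssn),
    FOREIGN KEY (Super_ssn) REFERENCES EMPLOYEE(Ssn),
    FOREIGN KEY (Dno) REFERENCES DEPARTMENT(Dnumber)
);
""")

# The base tables now exist in the catalog.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Both relations are stored as base tables; `tables` lists them as they appear in SQLite's catalog.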

Explain different constraints available in SQL with examples?


1. Specifying Attribute constraints and Attribute defaults
➢ NOT NULL: The constraint NOT NULL may be specified if NULL is not permitted for a particular attribute. It is
always implicitly specified for the attributes that form the primary key.
Example: CREATE TABLE EMPLOYEE (
...
name VARCHAR(12) NOT NULL,
... );
➢ DEFAULT: It is also possible to define a default value for an attribute by appending the clause
DEFAULT <value> to an attribute definition.
• If no DEFAULT clause is specified, the default value is NULL for attributes that do not have the NOT NULL
constraint.
• Example: Specifying a default manager for a new department
CREATE TABLE DEPARTMENT
( Dnumber INT NOT NULL,
mgreno CHAR(9) NOT NULL DEFAULT '101',
... );
➢ CHECK: We can restrict attribute or domain values using the CHECK clause following an attribute or domain
definition.
For example, suppose that department numbers are restricted to integer numbers between 1 and 20; then, we can
change the attribute declaration of Dnumber in the DEPARTMENT table to
Dnumber INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21);
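The three attribute constraints can be exercised together in a short Python sketch using an in-memory SQLite database. SQLite is used only as a convenient stand-in, the inserted values are made up, and `rejected` is a helper written for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE DEPARTMENT (
    Dname   VARCHAR(15) NOT NULL,
    Dnumber INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21),
    mgreno  CHAR(9) NOT NULL DEFAULT '101'
)""")

# mgreno is omitted, so the DEFAULT value is stored.
conn.execute("INSERT INTO DEPARTMENT (Dname, Dnumber) VALUES ('Research', 1)")
default_mgr = conn.execute(
    "SELECT mgreno FROM DEPARTMENT WHERE Dnumber = 1").fetchone()[0]

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# Dnumber 25 is outside 1..20, so the CHECK constraint fails.
check_rejected = rejected(
    "INSERT INTO DEPARTMENT (Dname, Dnumber) VALUES ('Admin', 25)")
# A NULL Dname violates NOT NULL.
null_rejected = rejected(
    "INSERT INTO DEPARTMENT (Dname, Dnumber) VALUES (NULL, 2)")
```

Only the first insert succeeds; the other two are refused by the DBMS, so the application code does not have to validate these rules itself.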
2. Specifying Key and Referential Integrity constraints
➢ The PRIMARY KEY clause specifies one or more attributes that make up the primary
key of a relation. This constraint specifies that the attribute values must not be NULL and must be unique across
the relation.
• For example, the primary key of DEPARTMENT can be specified as follows
Dnumber INT PRIMARY KEY;
➢ The UNIQUE clause specifies alternate (secondary) keys. For example, in DEPARTMENT:
UNIQUE (Dname)
➢ Referential integrity is specified via the FOREIGN KEY clause.
• By default, an update that violates a referential integrity constraint is rejected. The designer can specify an
alternative action to be taken by attaching a referential triggered action clause to any foreign key constraint.
• The options include SET NULL, CASCADE, and SET DEFAULT. An option must be qualified with either ON
DELETE or ON UPDATE.
Example: CREATE TABLE EMPLOYEE
( ...
Dno INT,
FOREIGN KEY (Dno) REFERENCES DEPARTMENT(Dnumber) ON DELETE SET NULL ON UPDATE CASCADE );
• Dno is declared without NOT NULL here, because ON DELETE SET NULL can only be applied to an attribute that
permits NULL.
• This means that if a department tuple is deleted, the value of Dno in the EMPLOYEE table is automatically set to
NULL for all employees who work in that particular department.
• On the other hand, if Dnumber in DEPARTMENT is updated, the new value is cascaded to Dno for all
EMPLOYEE tuples referencing the updated Dnumber.
• The action for ON DELETE CASCADE is to delete all the referencing tuples, whereas the action for ON UPDATE
CASCADE is to change the value of the foreign key to the updated (new) primary key value for all referencing
tuples.
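The effect of ON DELETE SET NULL and ON UPDATE CASCADE can be observed with a small Python sketch on an in-memory SQLite database. SQLite is only a stand-in here and enforces foreign keys only after `PRAGMA foreign_keys = ON`; the tables are cut down to the relevant columns and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (
    Dnumber INT PRIMARY KEY,
    Dname   VARCHAR(15) UNIQUE
);
CREATE TABLE EMPLOYEE (
    Ssn CHAR(9) PRIMARY KEY,
    Dno INT,
    FOREIGN KEY (Dno) REFERENCES DEPARTMENT(Dnumber)
        ON DELETE SET NULL ON UPDATE CASCADE
);
INSERT INTO DEPARTMENT VALUES (1, 'Research'), (2, 'Admin');
INSERT INTO EMPLOYEE VALUES ('111', 1), ('333', 2);

-- Renumbering department 1 cascades to the referencing employee.
UPDATE DEPARTMENT SET Dnumber = 10 WHERE Dnumber = 1;
-- Deleting department 2 sets Dno to NULL for its employee.
DELETE FROM DEPARTMENT WHERE Dnumber = 2;
""")

rows = conn.execute("SELECT Ssn, Dno FROM EMPLOYEE ORDER BY Ssn").fetchall()
```

Employee 111 now references department 10, and employee 333 has a NULL Dno, without any explicit update of the EMPLOYEE table.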
3. Giving Names to Constraints
➢ A constraint may be given a constraint name, following the keyword CONSTRAINT. The names of all constraints
within a particular schema must be unique.
➢ Example: CREATE TABLE EMPLOYEE
( Dno INT NOT NULL DEFAULT 1,
CONSTRAINT EMPPK PRIMARY KEY (Ssn),
... );
4. Specifying Constraints on Tuples Using CHECK
➢ Table constraints can be specified through additional CHECK clauses at the end of a CREATE TABLE statement.
These can be called tuple-based constraints because they apply to each tuple individually and are checked whenever
a tuple is inserted or modified.
➢ For example, suppose that the DEPARTMENT table had an additional attribute Dept_create_date, which stores the
date when the department was created. Then we could add the following CHECK clause at the end of the CREATE
TABLE statement for the DEPARTMENT table to make sure that a manager’s start date is later than the department
creation date.
CHECK (Dept_create_date <= Mgr_start_date);
Explain different schema change statements in SQL with examples?
The schema change statements in SQL are
1. The DROP command
2. The ALTER command
1. DROP command- can be used to drop named schema elements such as tables, domains, or constraints. One can also
drop a schema.
➢ There are two drop behaviour options- CASCADE and RESTRICT.
CASCADE- For example, to remove the COMPANY database schema and all its tables,domains and other
elements, the CASCADE option is used as follows
DROP SCHEMA COMPANY CASCADE ;
RESTRICT- If the RESTRICT option is chosen in place of CASCADE, the schema is dropped only if it has no
elements in it.
➢ If a base relation within a schema is not needed any longer,the relation and its definition can be deleted by using the
DROP TABLE command.
For example,if we no longer wish to keep track of dependents of employees in the COMPANY database then we
can get rid of the DEPENDENT relation by using the command.
DROP TABLE DEPENDENT CASCADE ;
➢ If the RESTRICT option is chosen instead of CASCADE, a table is dropped only if it is not referenced in any other
relation(as foreign key).
➢ With the CASCADE option, all constraints and views that reference the table are dropped automatically from the
schema, along with the table itself.
2. ALTER command- The definition of a base table or of other named schema elements can be changed by using the
ALTER command.
➢ ALTER table actions include
• Adding or dropping a column
• Changing a column definition
• Adding or dropping table constraints
➢ Adding or dropping column- To add an attribute Job to EMPLOYEE relation in the COMPANY schema, we can
use the command
ALTER TABLE [Link] ADD COLUMN JOB VARCHAR(12) ;
• To drop a column(attribute), we must choose either CASCADE or RESTRICT for drop behavior.
• If CASCADE is chosen all constraints and views that reference the column are dropped automatically from the
schema along with the column.
• If RESTRICT is chosen, the command is successful only if no views or constraints reference the column.
• For example, the following command removes the attribute Address from the EMPLOYEE base table.
ALTER TABLE COMPANY.EMPLOYEE DROP COLUMN Address CASCADE;
➢ Change column definition:
• It is also possible to alter a column definition by dropping an existing default clause or by defining a new default
clause. The following examples illustrate this:
ALTER TABLE COMPANY.DEPARTMENT ALTER COLUMN Mgr_ssn DROP DEFAULT;
ALTER TABLE COMPANY.DEPARTMENT ALTER COLUMN Mgr_ssn SET DEFAULT '101';


➢ Adding or dropping table constraints:
• One can also change the constraints specified on a table by adding or dropping a named constraint. To be
dropped, a constraint must have been given a name when it was specified.
• For example, to drop the constraint named EMPSUPERFK from the EMPLOYEE relation:
ALTER TABLE COMPANY.EMPLOYEE DROP CONSTRAINT EMPSUPERFK CASCADE;
• Once this is done, we can redefine a replacement constraint by adding a new constraint to the relation, if needed.
This is specified by using the ADD keyword in the ALTER TABLE statement followed by the new constraint.
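The ADD COLUMN action can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions: SQLite has no COMPANY schema qualifier here, the EMPLOYEE table is reduced to three columns, and the sample row is made up. SQLite's ALTER TABLE does not accept the ALTER COLUMN ... SET DEFAULT or DROP CONSTRAINT forms shown above, so only the column addition is demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Fname TEXT, Lname TEXT)")
conn.execute("INSERT INTO EMPLOYEE VALUES ('123', 'John', 'Smith')")

# Add the new attribute; existing rows receive NULL because no DEFAULT is given.
conn.execute("ALTER TABLE EMPLOYEE ADD COLUMN Job VARCHAR(12)")

# PRAGMA table_info lists one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(EMPLOYEE)")]
job_of_existing_row = conn.execute("SELECT Job FROM EMPLOYEE").fetchone()[0]
```

After the statement runs, the table has a fourth column Job, and the row that existed before the change holds NULL in it.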

Write a note on basic queries in SQL?


SQL has one basic statement for retrieving information from a database: the SELECT statement.
1. The SELECT-FROM-WHERE Structure
SELECT <attribute list>
FROM <table list>
WHERE <condition>;
where
• <attribute list> is a list of attribute names whose values are to be retrieved by the query.
• <table list> is a list of the relation names required to process the query.
• <condition> is a conditional (Boolean) expression that identifies the tuples to be retrieved by the query.
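A minimal sketch of the three clauses working together, run through Python's built-in sqlite3 module. The cut-down EMPLOYEE table and its three rows are illustrative assumptions, not data from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Salary INTEGER, Dno INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [("John", "Smith", 30000, 5), ("Alicia", "Zelaya", 25000, 4), ("Joyce", "English", 25000, 5)],
)

# <attribute list> = Fname, Lname; <table list> = EMPLOYEE; <condition> = Dno = 5
# sorted() is applied in Python only to make the result order predictable.
rows = sorted(conn.execute("SELECT Fname, Lname FROM EMPLOYEE WHERE Dno = 5").fetchall())
```

Only the two tuples that satisfy the condition are returned, and only the two listed attributes appear in each result tuple.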
2. Ambiguous Attribute Names, Aliasing, Renaming, and Tuple Variables
➢ In SQL, the same name can be used for two (or more) attributes as long as the attributes are in different relations. If
this is the case, and a multitable query refers to two or more attributes with the same name, we must qualify the
attribute name with the relation name to prevent ambiguity. This is done by prefixing the relation name to the
attribute name and separating the two by a period.
Example: Retrieve each employee's name and the name of the department in which the employee works, assuming the
department number attribute is named Dnumber in both relations.
SELECT Fname, Dname
FROM EMPLOYEE, DEPARTMENT
WHERE EMPLOYEE.Dnumber = DEPARTMENT.Dnumber;
➢ The same ambiguity arises in queries that refer to the same relation twice.
For example, retrieve each employee’s first and last name and the first and last name of his or her immediate
supervisor.
SELECT E.Fname, E.Lname, S.Fname, S.Lname
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.Super_ssn = S.Ssn;
➢ In this case, we are required to declare alternative relation names E and S, called aliases or tuple variables, for the
EMPLOYEE relation.
➢ An alias can follow the keyword AS.
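The supervisor query can be checked on a tiny data set. The sketch below uses Python's built-in sqlite3 module; the three EMPLOYEE rows and their short Ssn values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Fname TEXT, Lname TEXT, Super_ssn TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [("1", "James", "Borg", None), ("2", "Franklin", "Wong", "1"), ("3", "John", "Smith", "2")],
)

# E plays the role of the employee, S the role of the supervisor.
pairs = sorted(conn.execute("""
    SELECT E.Fname, E.Lname, S.Fname, S.Lname
    FROM EMPLOYEE AS E, EMPLOYEE AS S
    WHERE E.Super_ssn = S.Ssn
""").fetchall())
```

The employee whose Super_ssn is NULL has no matching S tuple, so that employee does not appear in the result.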
3. Unspecified WHERE Clause and Use of the Asterisk
➢ A missing WHERE clause indicates no condition on tuple selection; hence, all tuples of the relation specified in the
FROM clause qualify and are selected for the query result.
➢ If more than one relation is specified in the FROM clause and there is no WHERE clause, then the CROSS
PRODUCT (all possible tuple combinations) of these relations is selected.
➢ Example : Select all EMPLOYEE SSNs
Query: SELECT Ssn
FROM EMPLOYEE;
Example : Select all combinations of EMPLOYEE Ssn and DEPARTMENT Dname
Query: SELECT Ssn, Dname
FROM EMPLOYEE, DEPARTMENT;
➢ To retrieve all the attribute values of the selected tuples, we do not have to list the attribute names explicitly in SQL;
specify an asterisk (*), which stands for all the attributes.
➢ Example : Retrieve all employee details
Query: SELECT * FROM EMPLOYEE;
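The size of a cross product is easy to verify on small tables. This sketch uses Python's built-in sqlite3 module with made-up rows (3 employees, 2 departments) and a simplified schema, both of which are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Ssn TEXT, Dno INTEGER)")
conn.execute("CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)", [("1", 5), ("2", 5), ("3", 4)])
conn.executemany(
    "INSERT INTO DEPARTMENT VALUES (?, ?)", [("Research", 5), ("Administration", 4)]
)

# No WHERE clause over two tables: every EMPLOYEE row pairs with every DEPARTMENT row.
cross = conn.execute("SELECT Ssn, Dname FROM EMPLOYEE, DEPARTMENT").fetchall()

# Adding the join condition keeps only the matching pairs.
joined = conn.execute(
    "SELECT Ssn, Dname FROM EMPLOYEE, DEPARTMENT WHERE Dno = Dnumber"
).fetchall()

# The asterisk returns every column of the table.
all_columns = conn.execute("SELECT * FROM EMPLOYEE WHERE Ssn = '1'").fetchone()
```

With 3 and 2 rows the cross product has 3 × 2 = 6 tuples, while the joined query returns one tuple per employee. Forgetting the join condition is therefore a common source of oversized, incorrect results.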
4. Tables as Sets in SQL


➢ DISTINCT: To eliminate duplicate tuples from the result of an SQL query, we use the keyword DISTINCT in the
SELECT clause, meaning that only distinct tuples should remain in the result.
➢ Example : Retrieve all distinct salary values
Query: SELECT DISTINCT Salary
FROM EMPLOYEE;
➢ SQL has directly incorporated some of the set operations of relational algebra: set union (UNION), set
difference (EXCEPT), and set intersection (INTERSECT).
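The following sketch contrasts a plain SELECT with DISTINCT and then applies the three set operations. It uses Python's built-in sqlite3 module; the four sample rows and the helper function `column` are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Lname TEXT, Salary INTEGER, Dno INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("Smith", 30000, 5), ("English", 25000, 5), ("Zelaya", 25000, 4), ("Wallace", 43000, 4)],
)

def column(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

all_salaries = column("SELECT Salary FROM EMPLOYEE")            # duplicates kept
distinct_salaries = column("SELECT DISTINCT Salary FROM EMPLOYEE")

dept5 = "SELECT Salary FROM EMPLOYEE WHERE Dno = 5"
dept4 = "SELECT Salary FROM EMPLOYEE WHERE Dno = 4"
union = column(f"{dept5} UNION {dept4}")          # salaries paid in either department
intersect = column(f"{dept5} INTERSECT {dept4}")  # salaries paid in both departments
except_ = column(f"{dept5} EXCEPT {dept4}")       # salaries paid in department 5 only
```

The plain SELECT returns 25000 twice, which shows that an SQL table behaves as a multiset unless DISTINCT or a set operation removes the duplicates.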
5. Substring Pattern Matching and Arithmetic Operators
➢ LIKE operator: This is used for string pattern matching. Partial strings are specified using two reserved
characters:
• % replaces an arbitrary number of zero or more characters,
• underscore (_) replaces a single character.
➢ Example : Retrieve all employees whose address is in Mangalore
Query: SELECT Fname, Lname
FROM EMPLOYEE
WHERE Address LIKE '%Mangalore%';
➢ In an SQL query, the standard arithmetic operators for addition (+), subtraction (–), multiplication (*), and division (/)
can be applied to numeric values or attributes with numeric domains.
➢ Example: List employees with their salaries if a 10% raise is given to all.
Query: SELECT Fname, Lname, 1.1 * Salary AS "NEW SALARY"
FROM EMPLOYEE;
➢ BETWEEN operator: selects values within a given range, including both end values. The values can be numbers, text, or dates.
➢ Syntax: SELECT <column_name(s)>
FROM <table_name>
WHERE column_name BETWEEN value1 AND value2;
➢ Example: Retrieve all employees in department 5 whose salary is between 30000 and 40000
Query: SELECT Fname, Lname
FROM EMPLOYEE
WHERE (Salary BETWEEN 30000 AND 40000) AND Dno = 5;
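The three operators in this section can be exercised together in one short sketch. It uses Python's built-in sqlite3 module; the employee names, addresses, and salaries are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Fname TEXT, Address TEXT, Salary INTEGER, Dno INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        ("Asha", "12 Hill Road, Mangalore", 30000, 5),
        ("Ravi", "4 Lake View, Mysore", 40000, 5),
        ("Kiran", "Mangalore Port Road", 45000, 4),
    ],
)

# % matches any run of characters, so the city may appear anywhere in Address.
in_mangalore = sorted(r[0] for r in conn.execute(
    "SELECT Fname FROM EMPLOYEE WHERE Address LIKE '%Mangalore%'"))

# Arithmetic in the SELECT list; the stored Salary values are not changed.
raised = dict(conn.execute('SELECT Fname, Salary * 1.1 AS "NEW SALARY" FROM EMPLOYEE'))

# BETWEEN includes both end points: 30000 and 40000 both qualify.
mid_range = sorted(r[0] for r in conn.execute(
    "SELECT Fname FROM EMPLOYEE WHERE (Salary BETWEEN 30000 AND 40000) AND Dno = 5"))
```

Note that the raised salary is only computed in the query result. Changing the stored values would require an UPDATE statement.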
6. Ordering of Query Results:
➢ SQL allows the user to order the tuples in the result of a query by the values of one or more of the attributes that
appear in the query result, by using the ORDER BY clause.
➢ Example: Display the names of employees in ascending order of Fname.
Query: SELECT Fname, Lname
FROM EMPLOYEE
ORDER BY Fname;
➢ The default order is in ascending order. We can specify keyword DESC if we want to see the result in a descending
order of values. The keyword ASC can be used to specify ascending order explicitly.
➢ For example, if we want descending alphabetical order on Dname and ascending order on Lname, Fname, then the
query for retrieving a list of employees and the projects they are working on can be written as
SELECT Dname, Lname, Fname, Pname
FROM DEPARTMENT, EMPLOYEE, WORKS_ON, PROJECT
WHERE Dnumber = Dno AND Ssn = Essn AND Pno = Pnumber
ORDER BY Dname DESC, Lname ASC, Fname ASC;
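A mixed ascending and descending sort can be seen in a small sketch using Python's built-in sqlite3 module. To keep it short, the four-table join is replaced by a single hypothetical table EMP_DEPT holding already-joined rows; the table name and its data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP_DEPT (Dname TEXT, Lname TEXT, Fname TEXT)")
conn.executemany(
    "INSERT INTO EMP_DEPT VALUES (?, ?, ?)",
    [
        ("Administration", "Zelaya", "Alicia"),
        ("Research", "Wong", "Franklin"),
        ("Research", "Smith", "John"),
        ("Research", "Smith", "Adam"),
    ],
)

# Dname is the primary sort key (descending); Lname and Fname break ties (ascending).
ordered = conn.execute("""
    SELECT Dname, Lname, Fname
    FROM EMP_DEPT
    ORDER BY Dname DESC, Lname ASC, Fname ASC
""").fetchall()
```

Research sorts before Administration because of DESC, and within Research the two Smith rows are ordered by Fname before the Wong row appears.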

INSERT, DELETE, and UPDATE Statements in SQL


1. The INSERT Command
❖ INSERT is used to add a single tuple (row) to a relation (table).
❖ Consider the STUDENT relation specified with the following CREATE TABLE statement:
CREATE TABLE STUDENT(
ROLLNO INT,
NAME VARCHAR(20),
MARKS INT);


❖ Specify the relation name and a list of values for the tuple. The values should be listed in the
same order in which the corresponding attributes were specified in the CREATE TABLE
command.
For example, to add a new tuple to the STUDENT relation:
INSERT INTO STUDENT VALUES (101, 'RAM', 600);
❖ A second form of the INSERT statement allows the user to specify explicit attribute names that
correspond to the values provided in the INSERT command. This is useful if a relation has many
attributes but only a few of those attributes are assigned values in the new tuple. However, the
values must include all attributes with NOT NULL specification and no default value. Attributes
with NULL allowed or DEFAULT values are the ones that can be left out.
For example, to enter a tuple for a new STUDENT for whom we know only the ROLLNO and NAME:
INSERT INTO STUDENT (ROLLNO, NAME) VALUES (103, 'Marini');
It is also possible to insert into a relation multiple tuples separated by commas in a single
INSERT command. The attribute values forming each tuple are enclosed in parentheses.
❖ A DBMS that fully implements SQL should support and enforce all the integrity constraints that
can be specified in the DDL. For example, if an INSERT supplies a foreign key value that matches no
tuple in the referenced relation (say, department number 2 when no DEPARTMENT tuple has
Dnumber = 2), the DBMS should reject the operation.
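The INSERT forms and the constraint check described above can be sketched with Python's built-in sqlite3 module. Several details are assumptions added for illustration: the DEPARTMENT table, the extra DNO foreign key column on STUDENT, the PRIMARY KEY and NOT NULL declarations, and all sample rows. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute("CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY, Dname TEXT)")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLLNO INTEGER PRIMARY KEY,
        NAME   VARCHAR(20) NOT NULL,
        MARKS  INTEGER,
        DNO    INTEGER REFERENCES DEPARTMENT(Dnumber)  -- hypothetical extra column
    )
""")
conn.execute("INSERT INTO DEPARTMENT VALUES (1, 'CSE')")

# Form 1: values listed in CREATE TABLE column order.
conn.execute("INSERT INTO STUDENT VALUES (101, 'RAM', 600, 1)")
# Form 2: explicit column list; the omitted columns become NULL.
conn.execute("INSERT INTO STUDENT (ROLLNO, NAME) VALUES (103, 'Marini')")
# Several tuples in one statement, each in its own parentheses.
conn.execute(
    "INSERT INTO STUDENT (ROLLNO, NAME, MARKS) VALUES (104, 'Sita', 550), (105, 'Arun', 480)"
)

# DNO = 2 has no matching DEPARTMENT row, so referential integrity is violated.
try:
    conn.execute("INSERT INTO STUDENT (ROLLNO, NAME, DNO) VALUES (106, 'Hari', 2)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

rows = conn.execute(
    "SELECT ROLLNO, NAME, MARKS, DNO FROM STUDENT ORDER BY ROLLNO"
).fetchall()
```

Four tuples are stored, and the fifth insert is rejected because its foreign key value has no matching DEPARTMENT tuple.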

2. The DELETE Command


❖ The DELETE command removes tuples from a relation.
❖ It includes a WHERE clause, similar to that used in an SQL query, to select the tuples to be deleted.
❖ Tuples are explicitly deleted from only one table at a time. However, the deletion may propagate to
tuples in other relations if referential triggered actions are specified in the referential integrity
constraints of the DDL .
❖ Depending on the number of tuples selected by the condition in the WHERE clause, zero, one, or
several tuples can be deleted by a single DELETE command.
❖ A missing WHERE clause specifies that all tuples in the relation are to be deleted; however, the table
remains in the database as an empty table.
Query: Delete the employees whose last name is Brown
DELETE FROM EMPLOYEE WHERE Lname = 'Brown';
Query: Delete the employee whose SSN is 123
DELETE FROM EMPLOYEE WHERE Ssn = '123';
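The zero, one, or several behavior of DELETE can be observed in a short sketch using Python's built-in sqlite3 module, whose cursor attribute `rowcount` reports how many rows a statement changed. The two-column EMPLOYEE table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Lname TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("123", "Smith"), ("456", "Brown"), ("789", "Brown")],
)

# rowcount reports how many tuples the WHERE condition selected for deletion.
deleted_browns = conn.execute("DELETE FROM EMPLOYEE WHERE Lname = 'Brown'").rowcount
deleted_none = conn.execute("DELETE FROM EMPLOYEE WHERE Lname = 'Wong'").rowcount

# No WHERE clause: every remaining tuple is removed, but the table still exists.
conn.execute("DELETE FROM EMPLOYEE")
remaining = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
```

The final SELECT still succeeds and returns a count of 0, which confirms that the table definition survives an unconditional DELETE. Removing the definition itself requires DROP TABLE.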

3. The UPDATE Command:


❖ The UPDATE command is used to modify attribute values of one or more selected tuples.
❖ A WHERE clause in the UPDATE command selects the tuples to be modified from a single relation.
❖ Updating a primary key value may propagate to the foreign key values of tuples in other relations if
such a referential triggered action is specified in the referential integrity constraints of the DDL.
❖ An additional SET clause in the UPDATE command specifies the attributes to be modified and their
new values.
Query: To change the location and controlling department number of project number 10 to
'Bellaire' and 5, respectively
UPDATE PROJECT
SET Plocation = 'Bellaire', Dnum = 5
WHERE Pnumber = 10;
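The effect of this UPDATE can be confirmed with a small sketch using Python's built-in sqlite3 module. The PROJECT rows below are sample data assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PROJECT (Pnumber INTEGER PRIMARY KEY, Pname TEXT, Plocation TEXT, Dnum INTEGER)"
)
conn.executemany(
    "INSERT INTO PROJECT VALUES (?, ?, ?, ?)",
    [(10, "Computerization", "Stafford", 4), (20, "Reorganization", "Houston", 1)],
)

# SET lists the attributes to change; WHERE restricts the change to project 10.
updated = conn.execute(
    "UPDATE PROJECT SET Plocation = 'Bellaire', Dnum = 5 WHERE Pnumber = 10"
).rowcount

rows = conn.execute(
    "SELECT Pnumber, Plocation, Dnum FROM PROJECT ORDER BY Pnumber"
).fetchall()
```

Exactly one tuple is modified and project 20 is left untouched. Omitting the WHERE clause would apply the SET clause to every tuple in PROJECT.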


QUESTIONS
1. What is normalization? Explain 1NF, 2NF, and 3NF with examples.
2. Explain informal design guidelines for relational schema design.
3. Explain the types of update anomalies in SQL with an example.
4. Write the syntax of the INSERT, DELETE, and UPDATE statements in SQL and explain with suitable
examples.
5. Illustrate the following with suitable examples
a. Datatypes in SQL
b. Substring Pattern matching in SQL.(REFER THE NOTES FOR EXAMPLE)
6. Explain the basic datatypes available for attributes in SQL.
7. Demonstrate the following constraints in SQL with suitable examples.
a. NOT NULL b. Primary Key c. Foreign key d. DEFAULT e. CHECK
8. Explain different schema change statements in SQL with examples?


