Types and Importance of Databases

The document provides an overview of databases, their importance, types, and applications. It explains various database types such as centralized, distributed, relational, NoSQL, and cloud databases, along with their advantages and disadvantages. Additionally, it discusses the features of Database Management Systems (DBMS), key components, and user roles within a DBMS.

Uploaded by

mehertaj1717


Chapter-1

A database is a collection of data, usually stored in electronic form. A database is typically designed
so that it is easy to store and access information.
A good database is crucial to any company or organisation. This is because the database stores all
the pertinent details about the company such as employee records, transactional records, salary
details etc.

The various reasons a database is important are −

Manages large amounts of data

A database stores and manages a large amount of data on a daily basis. Tools such as spreadsheets
are not designed for data at this scale and become slow and error-prone as the volume grows.

Accurate

A database is accurate because it has all sorts of built-in constraints, checks etc. This means that
the information available in a database is correct in most cases.

Easy to update data

In a database, it is easy to update data using a Data Manipulation Language (DML). The most widely
used language that provides DML statements is SQL.

Security of data

Databases have various methods to ensure security of data. Users must log in before accessing a
database, and access privileges control what each user may do. These allow only authorised users
to access the database.

Data integrity

This is ensured in databases by using various constraints for data. Data integrity in databases makes
sure that the data is accurate and consistent in a database.

Easy to research data

It is very easy to access and research data in a database. This is done using Data Query Languages
(DQL) which allow searching of any data in the database and performing computations on it.
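
The points above about constraints, updates and queries can be tried out in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module; the employee table, its columns and the sample values are assumptions made for illustration, not part of the original notes.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Built-in constraints: a key, mandatory columns and a CHECK rule.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL NOT NULL CHECK (salary >= 0)
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000)")
conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 48000)")

# DML: updating data is a single statement.
conn.execute("UPDATE employee SET salary = 55000 WHERE emp_id = 1")

# Data integrity: an update that breaks the CHECK rule is rejected.
try:
    conn.execute("UPDATE employee SET salary = -1 WHERE emp_id = 2")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Query: search the data and compute on it.
total = conn.execute("SELECT SUM(salary) FROM employee").fetchone()[0]
print(rejected, total)
```

The rejected update leaves the stored salary unchanged, which is what "accurate and consistent" means in practice.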
Types of Databases

There are various types of databases used for storing different varieties of data:

1) Centralized Database

It is the type of database that stores data in one central database system. Users can access
the stored data from different locations through several applications, and these applications
include an authentication process so that users access data securely.
An example of a centralized database is a central library that holds a central
database for each library in a college/university.

Advantages of Centralized Database

o It reduces the risk in data management, i.e., manipulation of data will not
affect the core data.
o Data consistency is maintained as it manages data in a central repository.
o It provides better data quality, which enables organizations to establish data
standards.
o It is less costly because fewer vendors are required to handle the data sets.

Disadvantages of Centralized Database

o The size of the centralized database is large, which increases the response time for
fetching the data.
o It is not easy to update such an extensive database system.
o If the central server fails, the entire data set can become unavailable or be lost, which
could be a huge loss.

2) Distributed Database

Unlike a centralized database system, in distributed systems, data is distributed among
different database systems of an organization. These database systems are connected
via communication links. Such links help the end-users to access the data
easily. Examples of distributed databases are Apache Cassandra, HBase, Ignite, etc.

We can further divide a distributed database system into:

o Homogeneous DDB: Database systems that run on the same operating system, use the
same application process and use the same hardware devices.
o Heterogeneous DDB: Database systems that run on different operating systems, under
different application procedures, and use different hardware devices.

Advantages of Distributed Database

o Modular development is possible in a distributed database, i.e., the system can be
expanded by including new computers and connecting them to the distributed
system.
o One server failure will not affect the entire data set.

3) Relational Database

This database is based on the relational data model, which stores data in the form of
rows (tuples) and columns (attributes) that together form a table (relation). A relational
database uses SQL for storing, manipulating, as well as maintaining the data. E. F. Codd
proposed the relational model in 1970. Each table in the database carries a key that makes each
row unique from the others. Examples of relational databases are MySQL, Microsoft SQL
Server, Oracle, etc.
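
As an illustration of rows, columns and keys, here is a small sketch using Python's built-in sqlite3 module. The student table and its contents are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation (table) whose key is roll_no.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, branch TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Meera", "CSE"), (2, "Arjun", "ECE")],
)

cursor = conn.execute("SELECT * FROM student ORDER BY roll_no")
attributes = [col[0] for col in cursor.description]  # column names
tuples = cursor.fetchall()                           # one tuple per row

# The key keeps rows unique: a second row with roll_no 1 is refused.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Duplicate', 'ME')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

print(attributes, tuples, duplicate_accepted)
```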

Properties of Relational Database

Transactions in a relational database have four commonly known properties, called the
ACID properties, where:

A means Atomicity: A transaction either completes in full or has no effect at all. It
follows the 'all or nothing' strategy. For example, a transaction will either be committed
or will abort.

C means Consistency: Every transaction takes the database from one valid state to another,
so all rules and constraints hold both before and after the operation. For example, in a
transfer between two accounts, the total balance before and after the transaction should
be the same, i.e., it should remain conserved.

I means Isolation: Many users can access data from the database at the same time, so
concurrent transactions must remain isolated from one another. For example, when multiple
transactions occur at the same time, the effects of one uncommitted transaction should not
be visible to the other transactions in the database.

D means Durability: Once a transaction completes and commits its data, the changes remain
permanent, even if the system fails afterwards.
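
Atomicity and consistency can be seen in a small money-transfer sketch. It uses Python's built-in sqlite3 module; the account table, the `transfer` and `balances` helpers and the amounts are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates are kept or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


ok = transfer(conn, 1, 2, 30)       # succeeds and is committed
failed = transfer(conn, 1, 2, 500)  # debit would go negative, so it aborts
final = balances(conn)
print(ok, failed, final)
```

In the failed transfer the credit to account 2 is executed first, yet it does not survive: the rollback undoes it, and the total of the two balances stays at 150.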

4) NoSQL Database

Non-SQL/Not Only SQL is a type of database that is used for storing a wide range of data
sets. It is not a relational database, as it stores data not only in tabular form but in
several different ways. It came into existence when the demand for building modern
applications increased, and NoSQL presented a wide variety of database technologies
in response. We can further divide NoSQL databases into the following
four types:

a. Key-value storage: It is the simplest type of database storage, where every single
item is stored as a key (or attribute name) together with its value.
b. Document-oriented Database: A type of database used to store data as JSON-like
documents. It helps developers store data using the same document-model format
as used in the application code.
c. Graph Databases: It is used for storing vast amounts of data in a graph-like
structure. Most commonly, social networking websites use the graph database.
d. Wide-column stores: It is similar to the data represented in relational databases.
Here, data is stored grouped by column, instead of row by row.
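
The four data models listed above can be pictured with plain Python structures. This is only a rough sketch of the shapes of the data, not of how any particular NoSQL product stores it; all names and values are invented for illustration, and the wide-column layout (row key, then column family, then columns) is an assumption modelled loosely on stores such as Cassandra and HBase.

```python
import json

# a. Key-value: one opaque value looked up by one key.
kv_store = {"session:42": "user=meera;theme=dark"}

# b. Document: a JSON-like record that can nest lists and objects.
document = {
    "_id": 1,
    "name": "Meera",
    "orders": [{"item": "book", "qty": 2}, {"item": "pen", "qty": 5}],
}

# c. Graph: nodes plus edges, here "who follows whom".
follows = {"meera": {"arjun", "ravi"}, "arjun": {"ravi"}, "ravi": set()}

# d. Wide-column: row key -> column family -> columns.
wide_rows = {
    "user#1": {"profile": {"name": "Meera", "city": "Pune"}},
    "user#2": {"profile": {"name": "Arjun"}},
}

total_qty = sum(order["qty"] for order in document["orders"])
followers_of_ravi = sorted(u for u, f in follows.items() if "ravi" in f)
roundtrip = json.loads(json.dumps(document))  # documents map directly to JSON

print(kv_store["session:42"], total_qty, followers_of_ravi)
```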

Advantages of NoSQL Database

o It enables good productivity in application development, as it is not required to
store data in a structured format.
o It is a better option for managing and handling large data sets.
o It provides high scalability.
o Users can quickly access data from the database through key-value.

5) Cloud Database

A type of database where data is stored in a virtual environment and runs on a
cloud computing platform. It provides users with various cloud computing services
(SaaS, PaaS, IaaS, etc.) for accessing the database. There are numerous cloud platforms;
commonly cited options include:

o Amazon Web Services (AWS)
o Microsoft Azure
o Kamatera
o phoenixNAP
o ScienceSoft
o Google Cloud SQL, etc.

6) Object-oriented Databases

The type of database that uses the object-based data model approach for storing data in
the database system. The data is represented and stored as objects which are similar to
the objects used in the object-oriented programming language.

7) Hierarchical Databases

It is the type of database that stores data in the form of parent-children relationship
nodes. Here, it organizes data in a tree-like structure.

Data is stored in the form of records that are connected via links. Each child record in
the tree has only one parent. On the other hand, each parent record can have multiple
child records.

8) Network Databases

It is the database that typically follows the network data model. Here, the representation
of data is in the form of nodes connected via links between them. Unlike the hierarchical
database, it allows each record to have multiple children and parent nodes to form a
generalized graph structure.
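
The difference between the hierarchical and network models comes down to how many parents a record may have. The sketch below shows this with plain Python dictionaries; the record names and the `path_to_root` and `children_of` helpers are hypothetical.

```python
# Hierarchical model: every child record has exactly one parent (a tree).
parent_of = {
    "Sales": "Company",
    "HR": "Company",
    "Asha": "Sales",
    "Ravi": "HR",
}


def path_to_root(node):
    """Follow the single parent link upwards until the root is reached."""
    path = [node]
    while node in parent_of:
        node = parent_of[node]
        path.append(node)
    return path


# Network model: a record may have several parents (a general graph).
parents_of = {
    "Asha": {"Sales", "ProjectX"},
    "Ravi": {"HR", "ProjectX"},
}


def children_of(parent):
    """All records that list the given record among their parents."""
    return sorted(c for c, ps in parents_of.items() if parent in ps)


print(path_to_root("Asha"))
print(children_of("ProjectX"))
```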

9) Personal Database

Collecting and storing data on the user's system defines a Personal Database. This
database is basically designed for a single user.

Advantage of Personal Database

o It is simple and easy to handle.

o It occupies less storage space as it is small in size.

10) Operational Database

The type of database which creates and updates data in real time. It is basically
designed for executing and handling the daily data operations in several businesses.
For example, an organization uses operational databases for managing its daily
transactions.

11) Enterprise Database

Large organizations or enterprises use this database for managing a massive amount of
data. It helps organizations to increase and improve their efficiency. Such a database
allows simultaneous access to users.

Advantages of Enterprise Database:

o Multiple processes are supported on an Enterprise database.
o It allows executing parallel queries on the system.

DBMS applications
Applications where we use Database Management Systems are:

o Telecom: A database keeps track of the information regarding calls made, network usage,
customer details etc. Without database systems it would be hard to maintain such a huge
amount of data, which is updated continuously.
o Industry: Whether it is a manufacturing unit, a warehouse or a distribution centre, each
one needs a database to keep records of what comes in and goes out. For example, a
distribution centre should keep track of the product units supplied to the centre as well
as the products delivered from the distribution centre each day; this is where a DBMS
comes into the picture.
o Banking System: For storing customer info, tracking day-to-day credit and debit
transactions, generating bank statements etc. All this work is done with the help of
database management systems. Banking data is also sensitive and needs to be secured,
which the DBMS handles efficiently.
o Sales: To store customer information, production information and invoice details.
Using a DBMS, you can track, manage and generate historical data to analyse the
sales data.
o Airlines: To travel by air, we make reservations in advance, and this reservation
information is stored in a database along with the flight schedule. Real-time updates
are necessary here, because a seat reserved for one passenger should not be allocated
to another passenger. A DBMS handles this easily, as its data updates are fast and
applied in real time.
o Education sector: Database systems are frequently used in schools and colleges
to store and retrieve data regarding student details, staff details, course details,
exam details, payroll data, attendance details, fees details etc. There is a large
amount of inter-related data that needs to be stored and retrieved in an efficient
manner.
o Online shopping: Online shopping websites such as Amazon and Flipkart store product
information, your addresses and preferences, and credit details, and provide you with a
relevant list of products based on your query. All this involves a database management
system. Along with managing the vast catalogue of items, there is a need to secure users'
private information such as bank and card details. All this is taken care of by
database management systems.
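
The airline requirement in the list above, that a seat reserved for one passenger must not be given to another, is typically enforced by the database itself. The following is a minimal sketch with Python's built-in sqlite3 module; the reservation table, the `reserve` helper, the flight number and the passenger names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The UNIQUE rule lets each (flight, seat) pair appear at most once.
conn.execute(
    """
    CREATE TABLE reservation (
        flight_no TEXT NOT NULL,
        seat      TEXT NOT NULL,
        passenger TEXT NOT NULL,
        UNIQUE (flight_no, seat)
    )
    """
)


def reserve(flight_no, seat, passenger):
    """Return True if the seat was free and is now reserved."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO reservation VALUES (?, ?, ?)",
                (flight_no, seat, passenger),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = reserve("AI101", "12A", "Meera")
second = reserve("AI101", "12A", "Arjun")  # same seat, so it is refused
print(first, second)
```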

Features of Database Management System (DBMS)

Minimum Duplication and Redundancy

Because many users work with the same database, the chances of data duplication are
very high. In a database management system, data files are shared, which in turn
minimizes data duplication and redundancy. Ideally, each piece of information in a
database management system is stored only once, so the chances of duplication are low.
Saves Storage Space and Cost

Database management systems have a lot of data to save, but proper integration
of data saves a great deal of space in a DBMS. Companies pay large amounts of
money to store data. If their data is well organized, they save on the cost of
storing data and on data entry.
Anyone Can Work on It

Users who do not have advanced technical skills can work with a database management
system. The query language provided by a DBMS is easy to understand. Updating,
inserting, deleting and searching records can all be done with simple queries
provided by the DBMS, so a user who is not a programmer can do this without the help
of a skilled programmer.
Large Database Maintenance

The large databases of big companies can be maintained only by a database management
system. These databases require a lot of security and other features such as backup and
recovery. All these features are provided by a DBMS. It can maintain a database with lots
of data and information.
Provides High Level of Security

Security is a very big concern for all organizations that handle a large amount
of data. A DBMS does not give full access to the database to anyone except the DBA or
the head of the department. They are able to alter the database, and all user accounts
are created by them, so the security level of a DBMS is high. No other person or user
can access the full database; all of them have restrictions according to their work.

Permanent Storage of Data

A DBMS stores all the data files permanently, so data loss is unlikely. If data is
somehow lost, backup and recovery methods can restore the organization’s data files.

Multi-user Access

In a DBMS, multiple users can access all kinds of data and information stored in one data
store. Users can access or view particular data only according to the rights given to
them. This increases the security and privacy of data for users, because each user has
their own interface to access data.

DBMS Providers
Oracle remains the most popular provider of database management systems,
followed by Microsoft SQL Server and MySQL.

IBM Db2, SolarWinds Database Performance Analyzer, Amazon RDS, Hadoop,
and MariaDB are also prominent names in the DBMS market.

A number of open-source providers with some free features also remain
competitive, such as Apache Cassandra and PostgreSQL.

Components of DBMS
The database management system can be divided into five major components, they are:

1. Hardware

2. Software

3. Data

4. Procedures

5. Database Access Language

[Diagram: how the five components fit together to form a database management system]

DBMS Components: Hardware

When we say Hardware, we mean the computer, hard disks, I/O channels for data, and any other physical
component involved before any data is successfully stored.

When we run Oracle or MySQL on our personal computer, our computer's hard disk, the keyboard
we use to type in commands, and our computer's RAM and ROM all become part of the DBMS
hardware.

DBMS Components: Software

This is the main component, as this is the program which controls everything. The DBMS software is more
like a wrapper around the physical database, which provides us with an easy-to-use interface to store,
access and update data.

The DBMS software is capable of understanding the Database Access Language and interpreting it into actual
database commands to execute on the DB.

DBMS Components: Data

Data is the resource for which the DBMS was designed. The motive behind the creation of the DBMS was to store
and utilise data.

A typical database holds both the data saved by the user and metadata.

Metadata is data about the data. This is information stored by the DBMS to better understand the data
stored in it.

For example: when I store my name in a database, the DBMS may record when the name was stored,
what its size is, and whether it is stored as related to some other data or is
independent. All this information is metadata.
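
Most DBMSs let you query their metadata directly. The sketch below uses SQLite through Python's built-in sqlite3 module, where the `sqlite_master` catalog table and the `PRAGMA table_info` statement expose the table definition and column details; the person table is a made-up example. SQLite keeps structural metadata of this kind and does not record when each value was stored, so that part of the example above is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO person (name) VALUES ('Meera')")

# The catalog keeps the definition of every table.
catalog_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'person'"
).fetchone()[0]

# table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(person)")]

print(catalog_sql)
print(columns)
```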

DBMS Components: Procedures

Procedures refer to general instructions for using a database management system. This includes procedures
to set up and install a DBMS, to log in to and log out of the DBMS software, to manage databases, to take
backups, to generate reports etc.

DBMS Components: Database Access Language

Database Access Language is a simple language designed to write commands to access, insert, update
and delete data stored in any database.

A user can write commands in the Database Access Language and submit them to the DBMS,
which then translates and executes them.

Users can create new databases and tables, insert data, fetch stored data, update data and delete data
using the access language.
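
SQL is the most common database access language, so the operations named above can be sketched with it. The example submits SQL commands to SQLite through Python's built-in sqlite3 module; the book table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create a table.
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, copies INTEGER)")

# Insert data.
conn.execute("INSERT INTO book (title, copies) VALUES (?, ?)", ("Database Basics", 3))
conn.execute("INSERT INTO book (title, copies) VALUES (?, ?)", ("SQL Primer", 1))

# Fetch stored data.
after_insert = conn.execute("SELECT title, copies FROM book ORDER BY id").fetchall()

# Update data.
conn.execute("UPDATE book SET copies = copies + 2 WHERE title = ?", ("SQL Primer",))

# Delete data.
conn.execute("DELETE FROM book WHERE title = ?", ("Database Basics",))

remaining = conn.execute("SELECT title, copies FROM book").fetchall()
print(after_insert, remaining)
```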

Users

o Database Administrators: The Database Administrator or DBA is the one who manages the complete
database management system. The DBA takes care of the security of the DBMS, its availability,
managing the license keys, managing user accounts and access etc.

o Application Programmer or Software Developer: This user group is involved in developing and
designing the parts of the DBMS.

o End User: These days all modern applications, web or mobile, store user data. Applications
are programmed in such a way that they collect user data and store it on DBMS systems
running on their servers. End users are the ones who store, retrieve, update and delete data.

DBMS Architecture

A Database Management system is not always directly available for users and applications to
access and store data in it. A Database Management system can be centralised (all the data stored at one
location), decentralised (multiple copies of the database at different locations) or hierarchical, depending
upon its architecture.

A 1-tier DBMS architecture also exists: here the database is directly available to the user for
storing data. Generally such a setup is used for local application development, where programmers
communicate directly with the database for quick response.

Database Architecture is logically of two types:

1. 2-tier DBMS architecture

2. 3-tier DBMS architecture

2-tier DBMS Architecture

2-tier DBMS architecture includes an Application layer between the user and the DBMS, which is
responsible for communicating the user's request to the database management system and then sending the
response from the DBMS back to the user.

An application interface known as ODBC (Open Database Connectivity) provides an API that allows
client-side programs to call the DBMS. Most DBMS vendors provide ODBC drivers for their DBMS.

Such an architecture gives the DBMS extra security, as it is not exposed to the End User directly.
Security can also be improved by adding security and authentication checks in the Application layer.

3-tier DBMS Architecture

3-tier DBMS architecture is the most commonly used architecture for web applications.

It is an extension of the 2-tier architecture. In the 2-tier architecture, we have an application layer which
can be accessed programmatically to perform various operations on the DBMS. The application generally
understands the Database Access Language and passes end users' requests on to the DBMS.

In 3-tier architecture, an additional Presentation or GUI Layer is added, which provides a graphical user
interface for the End user to interact with the DBMS.

For the end user, the GUI layer is the Database System, and the end user has no idea about the
application layer and the DBMS system.
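
The separation of tiers can be sketched in a few lines of Python. Here SQLite (via the built-in sqlite3 module) stands in for the DBMS, one function plays the application layer and another plays the presentation layer; the product table and the `get_price` and `render_price` names are hypothetical. In a real deployment each tier would usually run as a separate program or on a separate machine.

```python
import sqlite3

# Data tier: the DBMS holding the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?)", [("pen", 10.0), ("book", 250.0)])


# Application tier: understands the access language and talks to the DBMS.
def get_price(name):
    row = conn.execute(
        "SELECT price FROM product WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


# Presentation tier: all the end user sees; it never issues SQL itself.
def render_price(name):
    price = get_price(name)
    if price is None:
        return f"Sorry, '{name}' was not found."
    return f"{name}: {price:.2f}"


print(render_price("pen"))
print(render_price("ink"))
```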

View of data in DBMS

Abstraction is one of the main features of database systems. Hiding irrelevant details from the user
and providing an abstract view of data helps in easy and efficient user-database
interaction. In the three-level DBMS architecture, the top level is the “view level”. The view
level provides the “view of data” to the users and hides irrelevant details such as data
relationships, database schema, constraints, security etc. from the user.

To fully understand the view of data, you must have a basic knowledge of data abstraction and
of instance and schema.

1. Data abstraction: Database systems are made up of complex data structures. To ease
user interaction with the database, the developers hide internal irrelevant details from users.
This process of hiding irrelevant details from the user is called data abstraction.
2. Instance and schema: The design of a database is called the schema. Schema is of three types:
physical schema, logical schema and view schema. The data stored in the database at a
particular moment of time is called an instance of the database. The database schema defines the
variable declarations in tables that belong to a particular database; the value of
these variables at a moment of time is called the instance of that database.

Data Abstraction in DBMS

Database systems are made up of complex data structures. To ease user interaction with the database,
the developers hide internal irrelevant details from users. This process of hiding irrelevant details from
the user is called data abstraction. The term “irrelevant” is used here with respect to the user; it doesn’t mean
that the hidden data is not relevant with regard to the whole database. It just means that the user is not
concerned about that data.

For example: when you are booking a train ticket, you are not concerned with how data is processed at the
back end when you click “book ticket”, or with what happens when you make an online
payment. You are just concerned about the message that pops up when your ticket is successfully
booked. This doesn’t mean that the process happening at the back end is not relevant; it just means that
you as a user are not concerned with what is happening in the database.

Three levels of abstraction

1. Physical level: This is the lowest level of data abstraction. It describes how data is actually stored in
the database. The complex data structure details are found at this level.
2. Logical level: This is the middle level of the 3-level data abstraction architecture. It describes what data is
stored in the database.
3. View level: This is the highest level of data abstraction. It describes the user's interaction with the
database system.

Example: Let’s say we are storing customer information in a customer table. At the physical level these
records can be described as blocks of storage (bytes, gigabytes, terabytes etc.) in memory. These details
are often hidden from the programmers.

At the logical level these records can be described as fields and attributes along with their data types,
and the relationships among them can be defined logically. The programmers generally work at
this level because they are aware of such things about database systems.

At the view level, users just interact with the system through a GUI and enter details on the screen. They
are not aware of how the data is stored or what data is stored; such details are hidden from them.
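
One common way a DBMS provides the view level is an SQL view, which exposes only part of a table defined at the logical level. The sketch below uses Python's built-in sqlite3 module; the customer table, the `customer_public` view and the placeholder card number are assumptions for illustration. The physical level (how SQLite lays out pages on disk) stays hidden from both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table definition, as programmers see it.
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name TEXT, city TEXT, card_number TEXT)"
)
conn.execute(
    "INSERT INTO customer VALUES (1, 'Meera', 'Pune', '0000-0000-0000-0000')"
)

# View level: a restricted window onto the same data, without the card number.
conn.execute("CREATE VIEW customer_public AS SELECT id, name, city FROM customer")

cursor = conn.execute("SELECT * FROM customer_public")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()

print(view_columns, view_rows)
```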

Database users

Database users are the ones who actually use the database and benefit from it. There are
different types of users depending on their needs and their way of accessing the database.
Application Programmers - They are the developers who interact with the database by
means of DML queries. These DML queries are written in application programs in languages
like C, C++, Java, Pascal, etc. These queries are converted into object code to communicate
with the database.

For example, writing a C program to generate a report of the employees who work in a
particular department will involve a query to fetch the data from the database. It will include an
embedded SQL query in the C program.
Sophisticated Users - They are database developers, who write SQL queries to
select/insert/delete/update data. They do not use any application or program to query the
database. They interact with the database directly by means of a query language like SQL. These
users are scientists, engineers and analysts who thoroughly study SQL and DBMS to apply the
concepts to their requirements. In short, we can say this category includes designers and
developers of DBMS and SQL.
Specialized Users - These are also sophisticated users, but they write special database
application programs. They are the developers who develop complex programs to meet a
requirement.
Stand-alone Users - These users have a stand-alone database for their personal use.
These kinds of databases come as ready-made database packages with menus and
graphical interfaces.
Naive Users - These are the users who use an existing application to interact with the
database. Examples are online library systems, ticket booking systems, ATMs etc., where an
existing application is used to interact with the database and fulfil the user's requests.
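
The application-programmer example above describes a C program with an embedded SQL query that reports the employees of one department. The same idea is sketched here in Python with the built-in sqlite3 module rather than C; the employee table, the sample rows and the `department_report` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "Asha", "Sales"), (102, "Ravi", "HR"), (103, "Meera", "Sales")],
)


def department_report(department):
    """Return one report line per employee in the given department."""
    rows = conn.execute(
        # The ? placeholder passes the value safely instead of pasting it
        # into the SQL text.
        "SELECT emp_id, name FROM employee WHERE department = ? ORDER BY emp_id",
        (department,),
    )
    return [f"{emp_id}: {name}" for emp_id, name in rows]


report = department_report("Sales")
print("\n".join(report))
```

A naive user would only ever see the printed report, while a sophisticated user might type the SELECT statement directly.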

Database Administrator

A database administrator, or DBA, is someone who is in charge of making sure a
database runs smoothly. As a challenging role that requires focus, logic, and an
enthusiastic personality that can cope under pressure, the job necessitates a variety of
skills. DBAs must work within an organization to monitor, repair, and develop databases.

1. Software installation and Maintenance

A DBA is frequently involved in the initial installation and configuration of a new Oracle,
SQL Server, or other database. The system administrator configures the database
server’s hardware and installs the operating system, after which the DBA installs
and configures the database software. The DBA is in charge of ongoing maintenance,
such as updates and patches.

In addition, if a new server is brought in, the DBA is in charge of transferring data
from the existing system to the new platform.

2. Managing Data Integrity

DBAs primarily handle the overall integrity of a company’s database. They make sure
that data integrity is carefully managed, because it keeps data accurate and protects it
from unauthorized changes. DBAs manage data relationships to ensure data consistency.

3. Data Extraction, Transformation, and Loading

DBAs are responsible for data extraction, transformation, and loading (ETL), which
refers to the efficient import of large amounts of data extracted from
multiple systems into a data warehouse environment. The external data is cleaned and
transformed to fit the required format before being imported into a central repository.
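
The three ETL steps can be sketched on a very small scale as follows. The two source lists stand in for data extracted from separate systems, and an in-memory SQLite database (Python's built-in sqlite3 module) stands in for the warehouse; the source names, the `customer_dim` table and the cleaning rules in `transform` are all assumptions for illustration.

```python
import sqlite3

# Extract: rows as they arrive from two hypothetical source systems.
crm_rows = [{"name": " Meera ", "city": "pune"}, {"name": "Arjun", "city": "DELHI"}]
billing_rows = [{"name": "Ravi", "city": "Mumbai "}, {"name": "", "city": "Delhi"}]


def transform(row):
    """Clean one row; return None when the row is unusable."""
    name = row["name"].strip()
    if not name:
        return None  # drop records without a name
    return (name, row["city"].strip().title())


# Load: write the cleaned rows into the central repository.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customer_dim (name TEXT NOT NULL, city TEXT NOT NULL)")
cleaned = [t for t in map(transform, crm_rows + billing_rows) if t is not None]
warehouse.executemany("INSERT INTO customer_dim VALUES (?, ?)", cleaned)

loaded = warehouse.execute(
    "SELECT name, city FROM customer_dim ORDER BY name"
).fetchall()
print(loaded)
```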

4. Monitoring Performance

Implementing a database is not the only task of the database administrator. Once the
database is implemented, DBAs are required to monitor it for performance
issues. If a system component slows down processing, the DBA may need to change the
software configuration or add more hardware capacity. There are numerous monitoring
tools available, and understanding what they need to track to improve the system is part
of the DBA’s job.

5. Data Handling

Many companies' success today revolves around massive databases. Companies
nowadays maintain massive databases containing unstructured data types such as
images, documents, or sound and video files. Managing a very large database (VLDB)
may necessitate higher-level skills, as well as additional monitoring and tuning, which a
DBA possesses.

6. Create a Database Backup Plan

DBAs create backup and recovery plans and procedures as per industry standards.
They also make certain that all the steps in those plans are actually carried out, that
backups are completed on time, and that the required precautions are taken to keep
data safe.

7. Database Recovery

The DBA’s responsibility in the event of a server failure or other type of data loss is to
restore lost data to the system using existing backups.
Different types of failures may necessitate different recovery strategies, and a DBA
chooses a strategy with the specific requirements in mind. Furthermore, as
technology advances, it becomes crucial for a DBA to back up databases to the cloud.
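
Backup and recovery can be sketched with SQLite, whose Python binding (the built-in sqlite3 module, Python 3.7 or later) offers a `Connection.backup` method. Everything here runs in memory and the orders table is made up; a real plan would write backups to separate storage and follow the DBMS vendor's own tools.

```python
import sqlite3

# The live database.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
live.execute("INSERT INTO orders (item) VALUES ('book')")
live.commit()

# Backup: copy the whole database into another connection.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate data loss on the live system.
live.execute("DELETE FROM orders")
live.commit()
rows_after_loss = live.execute("SELECT COUNT(*) FROM orders").fetchall()[0][0]

# Recovery: restore the backup onto a replacement database.
recovered = sqlite3.connect(":memory:")
backup.backup(recovered)
restored = recovered.execute("SELECT item FROM orders").fetchall()

print(rows_after_loss, restored)
```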

8. Database Security
One of the most critical responsibilities of a DBA is identifying and correcting any flaws in
the database software. No system is entirely secure; however, DBAs mitigate risks by
implementing best practices. A DBA must be able to identify potential flaws in the
database software and the overall system of the company and take appropriate steps to
mitigate risks.

9. Database Integrity

DBAs are primarily responsible for the overall integrity of a company’s database. This
includes putting the database in place, keeping it safe from loss and corruption, making
it easily accessible, ensuring it works properly, and constantly tweaking it for ease of use
and maximum productivity. In addition, the database administrator is also in charge of
training eligible employees on how to access and use the database so that they can
perform their duties.

10. Database Accessibility

Setting up employee access is a critical component of database security. DBAs decide
who has access and what kind of access they have. They create subschemas to control
database accessibility.

They also determine which users will have access to the database and which data each
user may use. Without the permission of the DBA, no user has the authority to access the
database.

11. Provides Support to Users

If a user requires assistance at any time, it is the DBA’s responsibility to assist them. The
DBA provides complete support to users who are new to the database.

12. Troubleshooting

In the event of a problem, the DBA’s job is to troubleshoot it immediately. Whether a DBA
needs to quickly restore lost data or fix a problem to limit damage, a DBA must be able to
quickly understand and respond to concerns when they occur.

ER diagram of recruitment database

Instance and schema in DBMS

BY CHAITANYA SINGH

In this guide, you will learn about instance and schema in DBMS.

DBMS Schema
Definition of schema: Design of a database is called the schema. For example:
An employee table in database exists with the following attributes:

EMP_NAME EMP_ID EMP_ADDRESS EMP_CONTACT

-------- ------ ----------- -----------
This is the schema of the employee table. Schema defines the attributes of
tables in the database. Schema is of three types: Physical schema, logical
schema and view schema.

o Schema represents the logical view of the database. It helps you
understand what data needs to go where.
o Schema can be represented by a diagram.
o Schema helps the database users to understand the relationship
between data. This helps in efficiently performing operations on the database
such as insert, update, delete, search etc.

[Diagram: schema showing the relationship between three tables: Course, Student and Section]

A schema diagram only shows the design of the database; it doesn’t show the data present
in the tables. A schema is only a structural view (design) of a database.

The design of a database at the physical level is called the physical schema. How the
data is stored in blocks of storage is described at this level.

The design of a database at the logical level is called the logical schema. Programmers and
database administrators work at this level. Here, data is described as
certain types of records stored in data structures; however, internal
details such as the implementation of those data structures are hidden at this level (they are
available at the physical level).

Design of database at view level is called view schema. This generally describes
end user interaction with database systems.

To learn more about these schemas, refer to the three-level data abstraction architecture.

DBMS Instance
Definition of instance: The data stored in database at a particular moment of
time is called instance of database. Database schema defines the attributes in
tables that belong to a particular database. The value of these attributes at a
moment of time is called the instance of that database.

For example, we have seen the schema of table “employee” above. Let’s see the
table with the data now. At this moment the table contains two rows (records).
This is the current instance of the table “employee” because this is the data
that is stored in this table at this particular moment of time.

EMP_NAME   EMP_ID  EMP_ADDRESS  EMP_CONTACT
---------  ------  -----------  -----------
Chaitanya  101     Noida        95********
Ajeet      102     Delhi        99********

Let’s take another example: Let’s say we have a single table student in the
database, today the table has 100 records, so today the instance of the database
has 100 records. We are going to add another 100 records in this table by
tomorrow so the instance of database tomorrow will have 200 records in table. In
short, at a particular moment the data stored in database is called the instance,
this changes over time as and when we add, delete or update data in the
database.
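The difference between schema and instance can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory database; the column types and the helper names `schema` and `instance` are assumptions made for illustration, not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute(
    "CREATE TABLE employee (EMP_NAME TEXT, EMP_ID INTEGER, "
    "EMP_ADDRESS TEXT, EMP_CONTACT TEXT)"
)

def schema():
    """Column names of the employee table (the schema)."""
    return [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

def instance():
    """All rows currently stored in the employee table (the instance)."""
    return conn.execute("SELECT * FROM employee ORDER BY EMP_ID").fetchall()

schema_before = schema()
conn.execute("INSERT INTO employee VALUES ('Chaitanya', 101, 'Noida', '95********')")
conn.execute("INSERT INTO employee VALUES ('Ajeet', 102, 'Delhi', '99********')")
schema_after = schema()   # unchanged by the inserts
rows_now = instance()     # changes whenever data is added, deleted or updated
```

Adding rows changes what `instance()` returns, while `schema()` stays the same until the table definition itself is altered.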

Data Base Administrator:

A database administrator's (DBA) primary job is to ensure that data is available, protected from loss and
corruption, and easily accessible as needed. Below are some of the chief responsibilities that make up the
day-to-day work of a DBA.

1. Software installation and Maintenance

A DBA often collaborates on the initial installation and configuration of a new database such as Oracle or
SQL Server. The system administrator sets up hardware and deploys the operating system for the database
server, then the DBA installs the database software and configures it for use. As updates and patches are
required, the DBA handles this ongoing maintenance.

And if a new server is needed, the DBA handles the transfer of data from the existing system to the new
platform.

2. Data Extraction, Transformation, and Loading

Known as ETL, data extraction, transformation, and loading refers to efficiently importing large volumes of
data that have been extracted from multiple systems into a data warehouse environment.

This external data is cleaned up and transformed to fit the desired format so that it can be imported into a
central repository.

3. Specialised Data Handling

Today’s databases can be massive and may contain unstructured data types such as images, documents,
or sound and video files. Managing a very large database (VLDB) may require higher-level skills and
additional monitoring and tuning to maintain efficiency.

4. Database Backup and Recovery

DBAs create backup and recovery plans and procedures based on industry best practices, then make sure
that the necessary steps are followed. Backups cost time and money, so the DBA may have to persuade
management to take necessary precautions to preserve data.

System admins or other personnel may actually create the backups, but it is the DBA’s responsibility to
make sure that everything is done on schedule.

In the case of a server failure or other form of data loss, the DBA will use existing backups to restore lost
information to the system.

5. Security
A DBA needs to know potential weaknesses of the database software and the company’s overall system
and work to minimise risks. No system is one hundred per cent immune to attacks, but implementing best
practices can minimise risks.

In the case of a security breach or irregularity, the DBA can consult audit logs to see who has done what to
the data. Audit trails are also important when working with regulated data.

6. Authentication
Setting up employee access is an important aspect of database security. DBAs control who has access and
what type of access they are allowed. For instance, a user may have permission to see only certain pieces
of information, or they may be denied the ability to make changes to the system.

7. Capacity Planning
The DBA needs to know how large the database currently is and how fast it is growing in order to make
predictions about future needs. Storage refers to how much room the database takes up in server and
backup space. Capacity refers to usage level.

If the company is growing quickly and adding many new users, the DBA will have to create the capacity to
handle the extra workload.
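The kind of growth prediction described above can be sketched as simple arithmetic. This is only an illustration: it assumes a constant compound monthly growth rate, which the text does not specify, and the function names and figures are hypothetical.

```python
def projected_size_gb(current_gb, monthly_growth, months):
    """Size after `months`, assuming constant compound growth per month."""
    return current_gb * (1 + monthly_growth) ** months

def months_until_full(current_gb, monthly_growth, capacity_gb):
    """Whole months until the projected size reaches the available capacity."""
    if monthly_growth <= 0:
        raise ValueError("growth rate must be positive for this projection")
    months, size = 0, current_gb
    while size < capacity_gb:
        size *= 1 + monthly_growth
        months += 1
    return months

# Hypothetical figures: a 100 GB database growing 10% per month on 150 GB of storage.
size_in_two_months = projected_size_gb(100, 0.10, 2)
runway = months_until_full(100, 0.10, 150)
```

Real capacity planning would use measured growth figures, and would also account for backup space and usage levels.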

8. Performance Monitoring
Monitoring databases for performance issues is part of the on-going system maintenance a DBA performs.
If some part of the system is slowing down processing, the DBA may need to make configuration changes
to the software or add hardware capacity.

Chapter-2

Data Model
A data model gives us an idea of how the final system will look after its complete implementation.
It defines the data elements and the relationships between them. Data models are used to
show how data is stored, connected, accessed and updated in the database management system. Here,
we use a set of symbols and text to represent the information so that members of the organisation can
communicate and understand it. Though many data models are in use today, the
relational model is the most widely used. Apart from the relational model, there are many other
types of data models, which we will study in detail below.

Some of the Data Models in DBMS are:

1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
6. Object-Relational Data Model
7. Flat Data Model
8. Semi-Structured Data Model
9. Associative Data Model
10. Context Data Model

Hierarchical Model
Hierarchical Model was the first DBMS model. This model organises the data in the hierarchical tree
structure. The hierarchy starts from the root which has root data and then it expands in the form of a
tree adding child node to the parent node. This model easily represents some of the real-world
relationships like food recipes, sitemap of a website etc. Example: We can represent the relationship
between the shoes present on a shopping website in the following way:

Features of a Hierarchical Model

1. One-to-many relationship: The data here is organised in a tree-like structure where the
relationships between the records are one-to-many. Also, there can be only one path from the root to
any node. Example: In the above example, if we want to go to the node sneakers we only have
one path to reach there, i.e. through the men's shoes node.
2. Parent-Child Relationship: Each child node has a parent node but a parent node can have
more than one child node. Multiple parents are not allowed.
3. Deletion Problem: If a parent node is deleted then the child node is automatically deleted.
4. Pointers: Pointers are used to link the parent node with the child node and are used to navigate
between the stored data. Example: In the above example the 'shoes' node points to the two other
nodes 'women shoes' node and 'men's shoes' node.
Advantages of Hierarchical Model

 It is very simple and fast to traverse through a tree-like structure.

 Any change in the parent node is automatically reflected in the child node so, the integrity of
data is maintained.
Disadvantages of Hierarchical Model

 Complex relationships are not supported.

 As it does not support more than one parent of the child node so if we have some complex
relationship where a child node needs to have two parent node then that can't be represented
using this model.
 If a parent node is deleted then the child node is automatically deleted.
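
The single-path and deletion properties described above can be sketched with parent pointers. This is a minimal illustration using the node names from the shoes example; the dictionary layout and function names are assumptions, not how any particular hierarchical DBMS stores its data.

```python
# Nodes from the shoes example; each child stores exactly one parent.
parent = {
    "shoes": None,               # root
    "women's shoes": "shoes",
    "men's shoes": "shoes",
    "sneakers": "men's shoes",
}

def path_from_root(node):
    """Follow parent pointers upward, then reverse to get the single root-to-node path."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return list(reversed(path))

def delete_subtree(node):
    """Delete a node and every descendant below it."""
    for child in [c for c, p in parent.items() if p == node]:
        delete_subtree(child)
    del parent[node]

sneaker_path = path_from_root("sneakers")
delete_subtree("men's shoes")   # also removes the child node 'sneakers'
remaining = sorted(parent)
```

Because every node has exactly one parent, `path_from_root` can only ever return one path, and deleting a parent necessarily takes its children with it.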

Network Model
This model is an extension of the hierarchical model. It was the most popular model before the
relational model. This model is the same as the hierarchical model, the only difference is that a record
can have more than one parent. It replaces the hierarchical tree with a graph. Example: In the
example below we can see that node student has two parents i.e. CSE Department and Library. This
was earlier not possible in the hierarchical model.

Features of a Network Model

1. Ability to Merge more Relationships: In this model, as there are more relationships so data
is more related. This model has the ability to manage one-to-one relationships as well as many-
to-many relationships.
2. Many paths: As there are more relationships so there can be more than one path to the same
record. This makes data access fast and simple.
3. Circular Linked List: The operations on the network model are done with the help of the
circular linked list. The current position is maintained with the help of a program and this
position navigates through the records according to the relationship.
Advantages of Network Model

 The data can be accessed faster as compared to the hierarchical model. This is because the data
is more related in the network model and there can be more than one path to reach a particular
node. So the data can be accessed in many ways.
 As there is a parent-child relationship so data integrity is present. Any change in parent record
is reflected in the child record.
Disadvantages of Network Model

 As more and more relationships need to be handled the system might get complex. So, a user
must have detailed knowledge of the model to work with it.
 Any change such as an update, deletion or insertion is very complex.
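
The multiple-parent idea can be sketched as a small graph. The Student, CSE Department and Library records come from the example above; the top-level "College" record and the function name are assumptions added only so that the paths have a common starting point.

```python
# A record may list more than one parent, so the structure is a graph, not a tree.
parents = {
    "College": [],                              # hypothetical top-level record
    "CSE Department": ["College"],
    "Library": ["College"],
    "Student": ["CSE Department", "Library"],   # two parents
}

def paths_to(node):
    """Return every path from a top-level record down to `node`."""
    if not parents[node]:
        return [[node]]
    return [path + [node] for p in parents[node] for path in paths_to(p)]

student_paths = paths_to("Student")
```

Here `student_paths` contains two routes to the same record, one through each parent, which is what the hierarchical model cannot express.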

Entity-Relationship Model
Entity-Relationship Model or simply ER Model is a high-level data model diagram. In this model, we
represent the real-world problem in the pictorial form to make it easy for the stakeholders to
understand. It is also very easy for the developers to understand the system by just looking at the ER
diagram. We use the ER diagram as a visual tool to represent an ER Model. ER diagram has the
following three components:

 Entities: An entity is a real-world thing. It can be a person, place, or even a
concept. Example: Teachers, Students, Course, Building, Department, etc. are some of the
entities of a School Management System.
 Attributes: An entity has real-world properties called attributes. These are the characteristics
of that entity. Example: The entity teacher has properties like teacher id, salary, age, etc.
 Relationship: A relationship tells how two entities are related. Example: Teacher works for a
department.

Example:

In the above diagram, the entities are Teacher and Department. The attributes of the Teacher entity are
Teacher_Name, Teacher_id, Age, Salary, Mobile_Number. The attributes of the Department entity
are Dept_id, Dept_name. The two entities are connected using the relationship. Here, each teacher
works for a department.
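
The same entities, attributes and relationship can be sketched in code. This is only an illustration using Python dataclasses; the field types and the sample values are assumptions, and an ER model itself is a diagram rather than code.

```python
from dataclasses import dataclass

@dataclass
class Department:              # entity
    dept_id: int               # attributes
    dept_name: str

@dataclass
class Teacher:                 # entity
    teacher_id: int
    teacher_name: str
    age: int
    salary: float
    mobile_number: str
    works_for: Department      # relationship: each teacher works for a department

# Sample values are invented for illustration only.
maths = Department(dept_id=1, dept_name="Mathematics")
teacher = Teacher(10, "A. Rao", 41, 52000.0, "0000000000", works_for=maths)
```

The `works_for` field plays the role of the relationship line in the diagram: it connects one Teacher instance to one Department instance.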

Features of ER Model
 Graphical Representation for Better Understanding: It is very easy and simple to
understand so it can be used by the developers to communicate with the stakeholders.
 ER Diagram: ER diagram is used as a visual tool for representing the model.
 Database Design: This model helps the database designers to build the database and is widely
used in database design.
Advantages of ER Model

 Simple: Conceptually ER Model is very easy to build. If we know the relationship between the
attributes and the entities we can easily build the ER Diagram for the model.
 Effective Communication Tool: This model is used widely by the database designers for
communicating their ideas.
 Easy Conversion to any Model: This model maps well to the relational model and can be easily
converted to it by turning the entities and relationships into tables. This model can also be
converted to other models such as the network model or the hierarchical model.
Disadvantages of ER Model

 No industry standard for notation: There is no industry standard for developing an ER
model. So one developer might use notations which are not understood by other developers.
 Hidden information: Some information might be lost or hidden in the ER model. As it is a high-
level view, there are chances that some details of information might be hidden.

Importance of data modelling.

Data constitute the most basic information units employed by a system. Applications are created to
manage data and to help transform data into information.
But data are viewed in different ways by different people. This is why data modeling is so
important in a DBMS.

For example, contrast the (data) view of a company manager with that of a company clerk. Although the
manager and the clerk both work for the same company, the manager is more likely to have an enterprise-
wide view of company data than the clerk.
Applications programmers have yet another view of data, being more concerned with data location,
formatting, and specific reporting requirements.
Basically, applications programmers translate company policies and procedures from a variety of sources
into appropriate interfaces, reports, and query screens.
When a good database blueprint is available, it does not matter that an applications programmer’s view of
the data is different from that of the manager and/or the end user. Conversely, when a good database
blueprint is not available, problems are likely to occur.
For instance, an inventory management program and an order entry system may use conflicting product-
numbering schemes, thereby costing the company thousands (or even millions) of dollars. The data model
is an abstraction; you cannot draw the required data out of the data model.

Overview of Database design

Database design is the organization of data according to a database model. The designer determines what data
must be stored and how the data elements interrelate. With this information, they can begin to fit the data to the database
model. ... Database design involves classifying data and identifying interrelationships.

Data Base Development life Cycle.

PHASES OF DATA BASE DESIGN

1. Conceptual design

Once every data requirement has been gathered and analyzed, the next step is to create a conceptual
database plan. Here, a high-level conceptual data model is used. This phase is called conceptual
design.

While the conceptual design phase is in progress, the basic data modeling operations can be used to define
the high-level user operations that were noted during the analysis of the functions.

2. Logical Design

The logical phase of database design is also called the data model mapping phase. The result of this
phase is a set of relation schemas. The basis for these schemas is the ER or the Class Diagram.

Creating the relation schemas is a mainly mechanical operation. There are rules for transferring the ER model or
class diagram to relation schemas.

3. Normalization

Normalization is, in fact, the last piece of the logical design puzzle. The main purpose of normalization is to
remove redundancy and the anomalies it can cause during updates.

Normalization in database design is a way to change the relation schema to reduce redundancy. Each
normalization step typically splits a table into smaller tables, so new tables are added to the database.
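
A single normalization step can be sketched on plain Python data. The rows below are hypothetical (the department names are invented), and the sketch shows only the idea of splitting one table into two to remove repeated values, not a full normal-form procedure.

```python
# Hypothetical unnormalized rows: the department name is repeated for every employee.
rows = [
    (101, "Chaitanya", 1, "Sales"),
    (102, "Ajeet", 1, "Sales"),
    (103, "Rudolf", 2, "Support"),
]

# Decompose into two relations; dept_name is now stored once per department.
employees = [(emp_id, name, dept_id) for emp_id, name, dept_id, _ in rows]
departments = {dept_id: dept_name for _, _, dept_id, dept_name in rows}

# Joining the two relations on dept_id reproduces the original rows.
rejoined = [(e, n, d, departments[d]) for e, n, d in employees]
```

After the split, renaming a department means changing one entry in `departments` instead of every employee row, which is the update anomaly that normalization is meant to avoid.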

4. Physical Design

The last phase of database design is the physical design phase. In this phase, we implement the database
design. Here, a DBMS (Database Management System) must be chosen.

For instance, different DBMSs support different data types and use different names for
them.

SQL statements are written to create the database. The indexes and the integrity constraints (rules)
are also defined in this phase. Finally, the data is added and the database can be tested.
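
The physical design steps above can be sketched with Python's built-in sqlite3 module. SQLite is chosen here only because it needs no server; the table, column types, index name and sample row are assumptions for illustration, and another DBMS would use its own data type names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        surname    TEXT NOT NULL,
        age        INTEGER CHECK (age > 0)
    );
    CREATE INDEX idx_student_surname ON student (surname);
""")

# Add data, then test that the row is stored and the index exists.
conn.execute("INSERT INTO student VALUES (1, 'Rudolf', 'Sono', 20)")
stored = conn.execute("SELECT first_name, surname FROM student").fetchall()
indexes = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index'")]
```

The data types, the NOT NULL and CHECK rules and the index are all decisions that belong to this phase rather than to the conceptual or logical design.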

Conceptual Data Base Design

The design phase is where the requirements identified in the previous phase are used as the basis to develop the
new system. Another way of putting it is that the business understanding of the data structures is converted to a
technical understanding. The what questions ("What data are required? What are the problems to be solved?") are
replaced by the how questions ("How will the data be structured? How is the data to be accessed?")

This phase consists of three parts: the conceptual design, the logical design and the physical design. Some
methodologies merge the logical design phase into the other two phases. This section is not aimed at being a
definitive discussion of database design methodologies (there are whole books written on that!); rather it aims to
introduce you to the topic.

Conceptual design
The purpose of the conceptual design phase is to build a conceptual model based upon the previously identified
requirements, but closer to the final physical model. A commonly-used conceptual model is called an entity-
relationship model.

Entities and attributes

Entities are basically people, places, or things you want to keep information about. For example, a library system
may have the book, library and borrower entities. Learning to identify what should be an entity, what should be a
number of entities, and what should be an attribute of an entity takes practice, but there are some good rules of
thumb. The following questions can help to identify whether something is an entity:

The following are examples of entities involving a university with possible attributes in parentheses.

 Course (name, code, course prerequisites)
 Student (first_name, surname, address, age)
 Book (title, ISBN, price, quantity in stock)

An instance of an entity is one particular occurrence of that entity. For example, the student Rudolf Sono is one
instance of the student entity. There will probably be many instances. If there is only one instance, consider whether
the entity is warranted. The top level usually does not warrant an entity. For example, if the system is being
developed for a particular university, university will not be an entity because the whole system is for that one
university. However, if the system was developed to track legislation at all universities in the country,
then university would be a valid entity.

Relationships

Entities are related in certain ways. For example, a borrower may belong to a library and can take out books. A book
can be found in a particular library. Understanding what you are storing data about, and how the data relate, leads
you a large part of the way to a physical implementation in the database.

There are a number of possible relationships:

Mandatory

For each instance of entity A, there must exist one or more instances of entity B. This does not necessarily mean
that for each instance of entity B, there must exist one or more instances of entity A. Relationships are optional or
mandatory in one direction only, so the A-to-B relationship can be optional, while the B-to-A relationship is
mandatory.

Optional

For each instance of entity A, there may or may not exist instances of entity B.

One-to-one (1:1)

This is where for each instance of entity A, there exists one instance of entity B, and vice-versa. If the relationship is
optional, there can exist zero or one instances, and if the relationship is mandatory, there exists one and only one
instance of the associated entity.

One-to-many (1:M)

For each instance of entity A, many instances of entity B can exist, while for each instance of entity B, only one
instance of entity A exists. Again, these can be optional or mandatory relationships.

Many-to-many (M:N)

For each instance of entity A, many instances of entity B can exist, and vice versa. These can be optional or
mandatory relationships.

There are numerous ways of showing these relationships. The image below shows student and course entities. In
this case, each student must have registered for at least one course, but a course does not necessarily have to
have students registered. The student-to-course relationship is mandatory, and the course-to-student relationship is
optional.

The image below shows invoice_line and product entities. Each invoice line must have at least one product (but no
more than one); however each product can appear on many invoice lines, or none at all. The invoice_line-to-
product relationship is mandatory, while the product-to-invoice_line relationship is optional.

The figure below shows husband and wife entities. In this system (others are of course possible), each husband
must have one and only one wife, and each wife must have one, and only one, husband. Both relationships are
mandatory.

An entity can also have a relationship with itself. Such an entity is called a recursive entity. Take a person entity. If
you're interested in storing data about which people are brothers, you will have an "is brother to" relationship. In this
case, the relationship is an M:N relationship.

Conversely, a weak entity is an entity that cannot exist without another entity. For example, in a school,
the scholar entity is related to the weak entity parent/guardian. Without the scholar, the parent or guardian cannot
exist in the system. Weak entities usually derive their primary key, in part or in totality, from the associated
entity. parent/guardian could take the primary key from the scholar table as part of its primary key (or the entire key
if the system only stored one parent/guardian per scholar).

The term connectivity refers to the relationship classification.

The term cardinality refers to the specific number of instances possible for a relationship. Cardinality limits list the
minimum and maximum possible occurrences of the associated entity. In the husband and wife example, the
cardinality limit is (1,1), and in the case of a student who can take between one and eight courses, the cardinality
limits would be represented as (1,8).
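
Cardinality limits can be sketched as a simple range check. The (1,1) and (1,8) limits come from the examples above; the function and constant names are hypothetical.

```python
def within_cardinality(count, limits):
    """Return True if `count` related instances respects the (min, max) limits."""
    minimum, maximum = limits
    return minimum <= count <= maximum

COURSE_LIMITS = (1, 8)   # a student takes between one and eight courses
SPOUSE_LIMITS = (1, 1)   # husband and wife example: exactly one

ok = within_cardinality(3, COURSE_LIMITS)
too_many = within_cardinality(9, COURSE_LIMITS)
none_registered = within_cardinality(0, COURSE_LIMITS)
```

A minimum of 1 corresponds to a mandatory relationship, while a minimum of 0 would make it optional.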

Developing an entity-relationship diagram

An entity-relationship diagram models how the entities relate to each other. It's made up of multiple relationships,
the kind shown in the examples above. In general, these entities go on to become the database tables.

The first step in developing the diagram is to identify all the entities in the system. In the initial stage, it is not
necessary to identify the attributes, but this may help to clarify matters if the designer is unsure about some of the
entities. Once the entities are listed, relationships between these entities are identified and modeled according to
their type: one-to-many, optional and so on. There are many software packages that can assist in drawing an entity-
relationship diagram, but any graphical package should suffice.

Once the initial entity-relationship diagram has been drawn, it is often shown to the stakeholders. Entity-relationship
diagrams are easy for non-technical people to understand, especially when guided through the process. This can
help identify any errors that have crept in. Part of the reason for modeling is that models are much easier to
understand than pages of text, and they are much more likely to be viewed by stakeholders, which reduces the
chances of errors slipping through to the next stage, when they may be more difficult to fix.

Once the diagram has been approved, the next stage is to replace many-to-many relationships with two one-to-
many relationships. A relational DBMS cannot directly implement many-to-many relationships, so they are decomposed into
two simpler relationships. To achieve this, you have to create an intersection, or composite, entity type. Because
intersection entities are less "real-world" than ordinary entities, they are sometimes difficult to name. In this case,
you can name them according to the two entities being intersected. For example, you can intersect the many-to-
many relationship between student and course by a student-course entity.

The same applies even if the entity is recursive. The person entity that has an M:N relationship "is brother to" also
needs an intersection entity. You can come up with a good name for the intersection entity in this case: brother. This
entity would contain two fields, one for each person of the brother relationship — in other words, the primary key of
the first brother and the primary key of the other brother.
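
The student-course intersection entity can be sketched with Python's built-in sqlite3 module. The table and column names, the course titles and the sample rows are assumptions for illustration; the point is only that the many-to-many relationship becomes two one-to-many relationships meeting in a third table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
    -- Intersection entity: one row per (student, course) pairing.
    CREATE TABLE student_course (
        student_id INTEGER REFERENCES student (student_id),
        course_id  INTEGER REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Rudolf'), (2, 'Ajeet');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO student_course VALUES (1, 10), (1, 20), (2, 10);
""")

# Courses taken by student 1, reached through the intersection table.
titles = [row[0] for row in conn.execute("""
    SELECT c.title FROM course AS c
    JOIN student_course AS sc ON sc.course_id = c.course_id
    WHERE sc.student_id = 1 ORDER BY c.title""")]
```

The composite primary key of `student_course` is built from the primary keys of the two entities being intersected, in the same way as the two brother keys described above.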

Notation of ER diagram

Database can be represented using the notations. In ER diagram, many notations are
used to express the cardinality. These notations are as follows:

Fig: Notations of ER diagram

Chapter-3
Relational Model
The relational model is the most widely used model. In this model, the data is maintained in the form of
two-dimensional tables. All the information is stored in the form of rows and columns. The basic structure
of the relational model is the table, so tables are also called relations in the relational model. Example: In
this example, we have an Employee table.

Features of Relational Model

 Tuples: Each row in the table is called tuple. A row contains all the information about any
instance of the object. In the above example, each row has all the information about any specific
individual like the first row has information about John.
 Attribute or field: Attributes are the property which defines the table or relation. The values of
the attribute should be from the same domain. In the above example, we have different
attributes of the employee like Salary, Mobile_no, etc.

Advantages of Relational Model

 Simple: This model is simpler than the network and hierarchical models.
 Scalable: This model can be easily scaled as we can add as many rows and columns as we want.
 Structural Independence: We can make changes in the database structure without changing the
way the data is accessed. When we can make changes to the database structure without affecting
the capability of the DBMS to access the data, we can say that structural independence has been
achieved.

Disadvantages of Relational Model

 Hardware Overheads: To hide the complexities and make things easier for the user, this
model requires more powerful hardware and data storage devices.
 Bad Design: The relational model is very easy to design and use, and users don't need to
know how the data is stored in order to access it. This ease of design can lead to the
development of a poorly designed database, which slows down as the database grows.
But all these disadvantages are minor as compared to the advantages of the relational model. These
problems can be avoided with the help of proper implementation and organisation.

Object-Oriented Data Model

Real-world problems are more closely represented through the object-oriented data model. In this
model, both the data and relationships are present in a single structure known as an object. We can
store audio, video, images, etc. in the database, which is less convenient in the relational model (although
you can store audio and video in a relational database, it is advised not to do
so). In this model, two or more objects are connected through links. We use these links to relate
one object to other objects. This can be understood by the example given below.

In the above example, we have two objects Employee and Department. All the data and
relationships of each object are contained as a single unit. The attributes like Name, Job_title of the
employee and the methods which will be performed by that object are stored as a single object. The
two objects are connected through a common attribute i.e the Department_id and the communication
between these two will be done with the help of this common id.

Database design is a framework that the database uses for planning, storing and managing data in companies and organizations. Data
and database design are the lifeblood of every company.

We can say that data consistency is achieved when the database is designed so that it stores only useful and
frequently required data.

In this article, we will explain the main phases of database design and their roles in the design.

Relational data model is the primary data model, which is used widely around the world for data storage and
processing. This model is simple and it has all the properties and capabilities required to process data with storage
efficiency.

Concepts
Tables − In the relational data model, relations are saved in the format of tables. This format stores the relation among
entities. A table has rows and columns, where rows represent records and columns represent the attributes.
Tuple − A single row of a table, which contains a single record for that relation is called a tuple.
Relation instance − A finite set of tuples in the relational database system represents relation instance. Relation
instances do not have duplicate tuples.
Relation schema − A relation schema describes the relation name (table name), attributes, and their names.
Relation key − Each row has one or more attributes, known as relation key, which can identify the row in the relation
(table) uniquely.
Attribute domain − Every attribute has some pre-defined value scope, known as attribute domain.
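
These concepts can be sketched with plain Python structures. This is only an analogy, assuming a relation instance is modelled as a set of named tuples; the Employee relation, its two attributes and the sample values are chosen for illustration.

```python
from typing import NamedTuple

class Employee(NamedTuple):      # relation schema: relation name and attribute names
    emp_id: int
    emp_name: str

# Relation instance: a finite set of tuples, so duplicate tuples collapse.
instance = {
    Employee(101, "Chaitanya"),
    Employee(102, "Ajeet"),
    Employee(101, "Chaitanya"),  # duplicate tuple, stored only once
}

schema = Employee._fields
by_key = {row.emp_id: row for row in instance}  # emp_id acts as the relation key
```

Because `instance` is a set, it cannot hold the same tuple twice, which mirrors the rule that relation instances do not have duplicate tuples.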

Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are called Relational
Integrity Constraints. There are three main integrity constraints −

 Key constraints
 Domain constraints
 Referential integrity constraints

Key Constraints
There must be at least one minimal subset of attributes in the relation, which can identify a tuple uniquely. This minimal
subset of attributes is called the key for that relation. If there is more than one such minimal subset, these are
called candidate keys.
Key constraints require that −
 in a relation with a key attribute, no two tuples can have identical values for key attributes.
 a key attribute cannot have NULL values.
Key constraints are also referred to as Entity Constraints.
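
Both key constraints can be observed in a minimal sketch using Python's built-in sqlite3 module. The table, the key values and the helper name `try_insert` are assumptions for illustration. NOT NULL is written explicitly because SQLite, unlike most SQL databases, does not imply it for a primary key that is not an INTEGER column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# emp_id is the key: it must be unique and must not be NULL.
conn.execute("CREATE TABLE employee (emp_id TEXT NOT NULL PRIMARY KEY, emp_name TEXT)")
conn.execute("INSERT INTO employee VALUES ('E101', 'Chaitanya')")

def try_insert(emp_id, emp_name):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, emp_name))
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = try_insert("E101", "Ajeet")   # same key value as an existing tuple
null_ok = try_insert(None, "Ajeet")          # NULL key value
fresh_ok = try_insert("E102", "Ajeet")       # new key value
```

Only the third insert succeeds; the first two are rejected by the uniqueness and NOT NULL rules respectively.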

Domain Constraints
Attributes have specific values in real-world scenarios. For example, age can only be a positive integer. The relational
model applies the same kind of constraint to the attributes of a relation. Every attribute is bound to have a specific range of
values. For example, age cannot be less than zero and telephone numbers cannot contain characters other than the digits 0-9.
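
The two domain rules mentioned above can be sketched as predicates. This is a plain-Python illustration, not how a DBMS declares domains; the attribute names, the `DOMAINS` table and the sample values are hypothetical.

```python
DIGITS = set("0123456789")

# Each attribute's domain is expressed as a predicate over candidate values.
DOMAINS = {
    "age": lambda v: isinstance(v, int) and v >= 0,
    "phone": lambda v: isinstance(v, str) and v != "" and set(v) <= DIGITS,
}

def domain_violations(row):
    """Return the sorted names of attributes whose values fall outside their domain."""
    return sorted(name for name, ok in DOMAINS.items() if not ok(row[name]))

good = domain_violations({"age": 30, "phone": "9500000000"})
bad = domain_violations({"age": -1, "phone": "95-000"})
```

In a real database the same rules would normally be declared in the schema, for example through the column's data type and a check constraint.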

Referential integrity Constraints

Referential integrity constraints work on the concept of Foreign Keys. A foreign key is an attribute of a relation that
refers to a key attribute of another (or the same) relation.
The referential integrity constraint states that if a relation refers to a key attribute of a different or the same relation,
then that key value must exist.
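
A referential integrity check can be observed in a minimal sketch using Python's built-in sqlite3 module. The tables and sample values are assumptions for illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("""CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY,
    dept_id    INTEGER NOT NULL REFERENCES department (dept_id))""")
conn.execute("INSERT INTO department VALUES (1, 'Science')")
conn.execute("INSERT INTO teacher VALUES (10, 1)")       # dept 1 exists: accepted

try:
    conn.execute("INSERT INTO teacher VALUES (11, 99)")  # dept 99 does not exist
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False
```

The second teacher row is rejected because its `dept_id` refers to a key value that is not present in `department`.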

Characteristics of Relational Database Model

A database contains several relations, and each relation must be uniquely identifiable; otherwise it
would be unclear which relation a query refers to. The following characteristics, when followed, keep each
relation distinct and well formed.

1. Each relation in a database must have a distinct name that separates it from the other relations in
the database.

2. A relation must not have two attributes with the same name. Each attribute must have a distinct name.

3. Duplicate tuples must not be present in a relation.

4. Each tuple must have exactly one data value for each attribute. For example, if two students, Jhoson and Charles,
were both recorded under Roll_No 265, the relation would be invalid. There must be only one student
for each Roll_No.

5. Tuples in a relation do not have to follow any particular order, because a relation is not order-sensitive.

6. Similarly, the attributes of a relation do not have to follow a particular order; it is up to the developer to decide the
ordering of attributes.

Relational Model Constraints

Relational model constraints are restrictions placed on the data values in a relational database. Constraints on
a database are categorized as follows:

 Inherent Model-Based Constraints: The constraints that are implicit in a data model are inherent model-based
constraints. For example, a relation in a database must not have duplicate tuples, there is no constraint in the ordering of
the tuples and attributes.
 Schema-Based Constraints: The constraints that are specified while defining the schema of a database using DDL are
schema-based constraints. They are further categorized as domain constraints, key constraints, entity integrity
constraints, referential integrity constraints and constraints on Null Value.
 Application-based Constraints: The constraints that cannot be applied while defining the database schema are
expressed in application programs. For example, the salary of an employee cannot be more than that of the employee's supervisor.

Now let us explore the Schema-based constraints in detail:

 Domain Constraints:
Each attribute in a tuple is declared to be of a particular domain (for example, integer, character, Boolean, String, etc.)
which specifies a constraint on the values that an attribute can take.
 Key Constraint and Constraint on Null Values:
In a relation, a key is either a single attribute or a set of attributes that identifies a particular tuple in the
relation. The key constraint specifies that no two tuples in a relation may have the same combination of values
for the key attributes.
The constraint on NULL values defines whether an attribute is allowed to carry a NULL value or not. For example, in a
student tuple, the name attribute must be NOT NULL.
 Entity Integrity Constraint:
The entity integrity constraint specifies that the primary key of a tuple can never be NULL, because the primary key is used to
identify individual tuples in a relation.
 Referential Integrity Constraint:
The referential integrity constraint holds if the foreign key of relation R1 that refers to relation R2 satisfies the following
two conditions:
1. The set of attributes that forms the foreign key of relation R1 has the same domain as the primary key of the
referenced relation R2.
2. In the current state, the value of the foreign key in each tuple t1 of relation R1 either matches a primary key value in
the referenced relation R2 or is NULL.

Advantages and Disadvantages

Advantages:
1. It is the simplest data model and is easy to use.
2. It hides the physical storage details from database developers and database users.
3. It is scalable, as you can keep adding records, and attributes to records, in a database.

Disadvantages:
Compared with these advantages, the disadvantages are minor. They are listed under "Disadvantages of Relational Database" below.

Key Takeaways:
o The relational data model implements the database schema of a relational database.
o The relational model is also termed a record-based model, as it stores data in fixed-format records (tuples) of
various types.
o A relation is a table whose columns represent the attributes and whose rows represent the tuples (entities or records).
o Many relations together form a relational database.
o The relational model places constraints on the database schema and on the data values in the database, as
discussed above.

That is all about the relational data model. Today it is widely used to design database systems, and the majority of
database systems are built using the relational data model.

Advantages of Relational Database
The primary benefit of the relational database approach is the ability to create meaningful information by joining
tables. Joining tables allows you to understand the relationships between the data, or how the tables connect. SQL
includes the ability to count, add, group, and combine queries.

1. Speed
A relational database is not the fastest kind of database, but its simple structure keeps its speed
acceptable. The various optimizations built into a relational database
further increase its speed, so most applications run at an appropriate speed when used with a relational
database.
2. Security
Since there are several tables in a relational database, certain tables can be made confidential.
These tables are protected with a username and password so that only authorized users are able to
access them. Users are only allowed to work on the specific tables they have access to.
3. Simplicity
Compared to other data models such as the network model, the relational database model is much simpler. It does
not require complex structuring or complex query processing. As a result, a simple
SQL query is usually sufficient for handling the data.
4. Accessibility
Unlike other types of databases, a relational database does not require any specific path for accessing the
data. Modifying data in the relevant column is also easy, so the results shown can be
tailored to the user.
5. Accuracy
As mentioned earlier, a relational database uses primary keys and foreign keys to make the tables
interrelated. This helps keep the stored data non-repetitive, which means that the data
is not duplicated. As a result, the stored data is more likely to be accurate.
6. Multi User

Multiple users are able to access a relational database at the same time. Even while the data is being updated,
users can access it conveniently. Hence, failures caused by simultaneous access are largely
prevented.

Disadvantages of Relational Database

1. Cost
The underlying cost of a relational database can be high. Setting up a relational
database may require separate software to be purchased, and a professional technician
may need to be hired to maintain the system. All of this can be costly, especially for businesses with a small
budget.
2. Performance
The performance of a relational database depends on the number of tables. The more
tables there are, the slower the response to queries. Additionally, a larger volume of data not
only slows down the machine, it eventually makes it harder to find information. Thus, a relational
database is known to be a slower database.
3. Physical Storage
A relational database also requires a large amount of physical storage, since data is held in rows and
columns. Each operation depends on separate physical storage. Only through proper optimization
can the targeted applications make the best use of the available physical storage.
4. Complexity
Although a relational database is free from complex structuring, it may occasionally become complex too.
As the amount of data in a relational database increases, the system becomes more
complicated, because all the data is arranged according to common
characteristics.
5. Information Loss
Large organizations tend to use a larger number of database systems with more tables.
Information often has to be transferred from one system to another, which poses a risk of data loss.
6. Structure Limitations

The fields in a relational database have size limits, which means that a field
cannot accommodate more information than it was defined to hold. If more information is provided, it may lead to data
loss. Therefore, it is necessary to specify the exact amount of data each field will hold.

Anomalies of Relational Data Model

Insertion anomaly: If a tuple is inserted into a referencing relation and the value of its referencing attribute is not present in
the referenced attribute, the insertion is not allowed. For example, if we try to insert a record into
STUDENT_COURSE with STUD_NO = 7 and no such STUD_NO exists in STUDENT, the insertion is rejected.
Deletion and updation anomaly: If a tuple is deleted or updated in the referenced relation while its referenced
attribute value is still used by a referencing attribute in the referencing relation, the deletion or update is not
allowed. For example, if we try to delete the record with STUD_NO = 1 from STUDENT while STUDENT_COURSE still
refers to it, the deletion is rejected. To avoid this, the following options can be used when defining the foreign key:
o ON DELETE/UPDATE SET NULL: If a tuple is deleted or updated in the referenced relation and its
referenced attribute value is used by a referencing attribute in the referencing relation, the tuple is deleted or updated in the
referenced relation and the value of the referencing attribute is set to NULL.
o ON DELETE/UPDATE CASCADE: If a tuple is deleted or updated in the referenced relation and its
referenced attribute value is used by a referencing attribute in the referencing relation, the tuple is deleted or updated in the
referenced relation and the same change is applied to the matching tuples in the referencing relation.
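
The difference between the default behaviour, SET NULL and CASCADE can be seen in a small experiment. This is only a sketch under stated assumptions: SQLite (through Python's sqlite3 module) stands in for the database, STUDENT and STUD_NO follow the example above, and the three referencing tables (course_default, course_set_null, course_cascade) are hypothetical names invented for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE STUDENT (STUD_NO INTEGER PRIMARY KEY, NAME TEXT)")
# Three hypothetical referencing tables that differ only in their ON DELETE action.
conn.execute("CREATE TABLE course_default (STUD_NO INTEGER REFERENCES STUDENT(STUD_NO), COURSE TEXT)")
conn.execute("CREATE TABLE course_set_null (STUD_NO INTEGER REFERENCES STUDENT(STUD_NO) ON DELETE SET NULL, COURSE TEXT)")
conn.execute("CREATE TABLE course_cascade (STUD_NO INTEGER REFERENCES STUDENT(STUD_NO) ON DELETE CASCADE, COURSE TEXT)")

conn.executemany("INSERT INTO STUDENT VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.execute("INSERT INTO course_default VALUES (2, 'OS')")
conn.execute("INSERT INTO course_set_null VALUES (1, 'DBMS')")
conn.execute("INSERT INTO course_cascade VALUES (1, 'DBMS')")

# No action specified: deleting a referenced student is refused.
try:
    conn.execute("DELETE FROM STUDENT WHERE STUD_NO = 2")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True

# Student 1 is referenced only through SET NULL and CASCADE, so the delete succeeds.
conn.execute("DELETE FROM STUDENT WHERE STUD_NO = 1")
set_null_rows = conn.execute("SELECT * FROM course_set_null").fetchall()
cascade_rows = conn.execute("SELECT * FROM course_cascade").fetchall()
print(blocked, set_null_rows, cascade_rows)
```

After the second delete, the SET NULL table keeps its row with an empty STUD_NO, while the CASCADE table loses the row altogether.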

Qualities of a Good Database Design

 Reflects real-world structure of the problem
 Can represent all expected data over time
 Avoids redundant storage of data items
 Provides efficient access to data
 Supports the maintenance of data integrity over time
 Clean, consistent, and easy to understand

Chapter-4
Functional Dependency
A functional dependency is a relationship that exists between two attributes (or sets of attributes). It
typically exists between the primary key and a non-key attribute within a table.

X → Y

The left side of the FD is known as the determinant, and the right side is known
as the dependent. X → Y means that any two tuples with the same value of X also have the same value of Y.

For example:

Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.

Here the Emp_Id attribute uniquely identifies the Emp_Name attribute of the employee table,
because if we know the Emp_Id, we can tell the employee name associated with it.

The functional dependency can be written as:

Emp_Id → Emp_Name

We can say that Emp_Name is functionally dependent on Emp_Id.
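
Whether a set of rows is consistent with a functional dependency can be checked mechanically. The sketch below assumes a helper named holds and three made-up employee rows; neither comes from the text. Note that a dependency is a rule about every possible state of the table, so sample data can disprove it but can never prove it.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value after that
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical sample rows for the employee table described above.
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Pune"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Agra"},
]

id_determines_name = holds(employees, ["Emp_Id"], ["Emp_Name"])
name_determines_id = holds(employees, ["Emp_Name"], ["Emp_Id"])
print(id_determines_name, name_determines_id)
```

Two employees share the name Asha but have different Emp_Id values, so Emp_Name → Emp_Id fails on this data while Emp_Id → Emp_Name is not contradicted.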

Types of Functional dependency

1. Trivial functional dependency

o A → B is a trivial functional dependency if B is a subset of A.

o Dependencies of the form A → A and B → B are also trivial.

Example:

Consider a table with two columns, Employee_Id and Employee_Name.
{Employee_Id, Employee_Name} → Employee_Id is a trivial functional dependency, as
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name → Employee_Name are trivial
dependencies.

2. Non-trivial functional dependency

o A → B is a non-trivial functional dependency if B is not a subset of A.

o When the intersection of A and B is empty, A → B is called completely non-trivial.

Example:

ID → Name
Name → DOB

Normalization
o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize redundancy in a relation or set of relations. It is
also used to eliminate undesirable characteristics such as insertion, update and deletion
anomalies.
o Normalization divides a larger table into smaller tables and links them using
relationships.
o Normal forms are used to reduce redundancy in database tables.

Types of Normal Forms

The normal forms discussed here are:

First Normal Form (1NF)

o A relation is in 1NF if every attribute contains only atomic values.
o An attribute of a table cannot hold multiple values. It must hold only a
single value.
o First normal form disallows multi-valued attributes, composite attributes, and
their combinations.

Example: Relation EMPLOYEE is not in 1NF because of the multi-valued attribute EMP_PHONE.

EMPLOYEE table:

EMP_ID  EMP_NAME  EMP_PHONE               EMP_STATE
-----------------------------------------------------
14      John      7272826385, 9064738238  UP
20      Harry     8574783832              Bihar
12      Sam       7390372389, 8589830302  Punjab

The decomposition of the EMPLOYEE table into 1NF is shown below:

EMP_ID  EMP_NAME  EMP_PHONE   EMP_STATE
-----------------------------------------
14      John      7272826385  UP
14      John      9064738238  UP
20      Harry     8574783832  Bihar
12      Sam       7390372389  Punjab
12      Sam       8589830302  Punjab
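
The conversion to 1NF amounts to giving each phone number a row of its own. A minimal sketch in Python, using the EMPLOYEE rows from the example and assuming the multiple phone numbers are held as one comma-separated string:

```python
# EMPLOYEE rows as (EMP_ID, EMP_NAME, EMP_PHONE, EMP_STATE); EMP_PHONE is multi-valued.
employee = [
    (14, "John", "7272826385, 9064738238", "UP"),
    (20, "Harry", "8574783832", "Bihar"),
    (12, "Sam", "7390372389, 8589830302", "Punjab"),
]

# One output row per phone number, so every attribute value is atomic.
employee_1nf = [
    (emp_id, name, phone.strip(), state)
    for emp_id, name, phones, state in employee
    for phone in phones.split(",")
]

for row in employee_1nf:
    print(row)
```

The three original rows become five, and EMP_ID alone no longer identifies a row; the pair (EMP_ID, EMP_PHONE) does.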

Second Normal Form (2NF)

o To be in 2NF, a relation must first be in 1NF.
o In the second normal form, all non-key attributes are fully functionally dependent on
the primary key.

Example: Assume a school stores the data of teachers and the subjects they
teach. In a school, a teacher can teach more than one subject.

TEACHER table:

TEACHER_ID  SUBJECT    TEACHER_AGE
------------------------------------
25          Chemistry  30
25          Biology    30
47          English    35
83          Math       38
83          Computer   38

In the given table, the non-prime attribute TEACHER_AGE is dependent on TEACHER_ID,
which is a proper subset of the candidate key {TEACHER_ID, SUBJECT}. That is why it violates the rule for 2NF.

To convert the given table into 2NF, we decompose it into two tables:

TEACHER_DETAIL table:

TEACHER_ID  TEACHER_AGE
-------------------------
25          30
47          35
83          38

TEACHER_SUBJECT table:

TEACHER_ID  SUBJECT
-----------------------
25          Chemistry
25          Biology
47          English
83          Math
83          Computer
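
A useful check on any decomposition is that joining the smaller tables gives back the original table. The sketch below does this for the TEACHER example in plain Python; the variable names are chosen for illustration only.

```python
# TEACHER rows as (TEACHER_ID, SUBJECT, TEACHER_AGE), copied from the example above.
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]

# TEACHER_DETAIL: each teacher's age is stored once (the set removes the repeats).
teacher_detail = sorted({(tid, age) for tid, _, age in teacher})
# TEACHER_SUBJECT: one row per subject taught.
teacher_subject = [(tid, subject) for tid, subject, _ in teacher]

# Join the two tables on TEACHER_ID to rebuild the original rows.
age_of = dict(teacher_detail)
rejoined = [(tid, subject, age_of[tid]) for tid, subject in teacher_subject]

print(teacher_detail)
print(rejoined == teacher)
```

The age of each teacher now appears once instead of once per subject, and no information is lost because the join reproduces every original row.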

Third Normal Form (3NF)

o A relation is in 3NF if it is in 2NF and does not contain any transitive
dependency of a non-prime attribute on a key.
o 3NF is used to reduce data duplication. It is also used to achieve data
integrity.
o If there is no transitive dependency for non-prime attributes, then a relation
that is in 2NF is also in third normal form.

A relation is in third normal form if it satisfies at least one of the following conditions for
every non-trivial functional dependency X → Y:

1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.

Example:

EMPLOYEE_DETAIL table:

EMP_ID  EMP_NAME   EMP_ZIP  EMP_STATE  EMP_CITY
--------------------------------------------------
222     Harry      201010   UP         Noida
333     Stephan    02228    US         Boston
444     Lan        60007    US         Chicago
555     Katharine  06389    UK         Norwich
666     John       462007   MP         Bhopal

Super keys in the table above:

{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on

Candidate key: {EMP_ID}

Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.

Here, EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP depends on
EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent
on the super key (EMP_ID), which violates the rule of third normal form.

That is why we need to move EMP_CITY and EMP_STATE to a new
EMPLOYEE_ZIP table, with EMP_ZIP as its primary key.

EMPLOYEE table:

EMP_ID  EMP_NAME   EMP_ZIP
-----------------------------
222     Harry      201010
333     Stephan    02228
444     Lan        60007
555     Katharine  06389
666     John       462007

EMPLOYEE_ZIP table:

EMP_ZIP  EMP_STATE  EMP_CITY
-------------------------------
201010   UP         Noida
02228    US         Boston
60007    US         Chicago
06389    UK         Norwich
462007   MP         Bhopal

Chapter-5

MySQL
MySQL, the most popular Open Source SQL database management system, is developed, distributed, and
supported by Oracle Corporation.

The MySQL website ([Link]) provides the latest information about MySQL software.
 MySQL is a database management system.
A database is a structured collection of data. It may be anything from a simple shopping list to a picture
gallery or the vast amounts of information in a corporate network. To add, access, and process data
stored in a computer database, you need a database management system such as MySQL Server.
Since computers are very good at handling large amounts of data, database management systems
play a central role in computing, as standalone utilities, or as parts of other applications.

 MySQL databases are relational.

A relational database stores data in separate tables rather than putting all the data in one big
storeroom. The database structures are organized into physical files optimized for speed. The logical
model, with objects such as databases, tables, views, rows, and columns, offers a flexible programming
environment. You set up rules governing the relationships between different data fields, such as one-to-
one, one-to-many, unique, required or optional, and “pointers” between different tables. The database
enforces these rules, so that with a well-designed database, your application never sees inconsistent,
duplicate, orphan, out-of-date, or missing data.
The SQL part of “MySQL” stands for “Structured Query Language”. SQL is the most common
standardized language used to access databases. Depending on your programming environment, you
might enter SQL directly (for example, to generate reports), embed SQL statements into code written in
another language, or use a language-specific API that hides the SQL syntax.
SQL is defined by the ANSI/ISO SQL Standard. The SQL standard has been evolving since 1986 and
several versions exist. In the MySQL manual, “SQL-92” refers to the standard released in
1992, “SQL:1999” refers to the standard released in 1999, and “SQL:2003” refers to the
standard released in 2003. The phrase “the SQL standard” means the current version of the
SQL Standard at any time.
 MySQL software is Open Source.
Open Source means that it is possible for anyone to use and modify the software. Anybody can
download the MySQL software from the Internet and use it without paying anything. If you wish, you
may study the source code and change it to suit your needs. The MySQL software uses the GPL (GNU
General Public License), [Link], to define what you may and may not do with the
software in different situations. If you feel uncomfortable with the GPL or need to embed MySQL code
into a commercial application, you can buy a commercially licensed version from Oracle. See the MySQL
Licensing Overview for more information ([Link]).
 The MySQL Database Server is very fast, reliable, scalable, and easy to
use.
If that is what you are looking for, you should give it a try. MySQL Server can run comfortably on a
desktop or laptop, alongside your other applications, web servers, and so on, requiring little or no
attention. If you dedicate an entire machine to MySQL, you can adjust the settings to take advantage of
all the memory, CPU power, and I/O capacity available. MySQL can also scale up to clusters of
machines, networked together.

MySQL Server was originally developed to handle large databases much faster than existing solutions
and has been successfully used in highly demanding production environments for several years.
Although under constant development, MySQL Server today offers a rich and useful set of functions. Its
connectivity, speed, and security make MySQL Server highly suited for accessing databases on the
Internet.

 MySQL Server works in client/server or embedded systems.

The MySQL Database Software is a client/server system that consists of a multithreaded SQL server
that supports different back ends, several different client programs and libraries, administrative tools,
and a wide range of application programming interfaces (APIs).

Older MySQL versions (before MySQL 8.0) also provided MySQL Server as an embedded multithreaded library that
you can link into your application to get a smaller, faster, easier-to-manage standalone product.

 A large amount of contributed MySQL software is available.

MySQL Server has a practical set of features developed in close cooperation with its users. It is very
likely that your favorite application or language supports the MySQL Database Server.

The official way to pronounce “MySQL” is “My Ess Que Ell” (not “my sequel”), but pronouncing it
as “my sequel” or in some other localized way is also accepted.

The Main Features of MySQL

This section describes some of the important characteristics of the MySQL Database Software. In most respects,
the list applies to all versions of MySQL. For information about features as they are introduced into MySQL on
a series-specific basis, see the “In a Nutshell” section of the appropriate MySQL manual.
Internals and Portability
 Written in C and C++.

 Tested with a broad range of different compilers.

 Works on many different platforms.

See [Link]
 For portability, configured using CMake.
 Tested with Purify (a commercial memory leakage detector) as well as with Valgrind, a GPL tool
([Link]).
 Uses multi-layered server design with independent modules.

 Designed to be fully multithreaded using kernel threads, to easily use multiple CPUs if they are
available.

 Provides transactional and nontransactional storage engines.

 Uses very fast B-tree disk tables (MyISAM) with index compression.
 Designed to make it relatively easy to add other storage engines. This is useful if you want to provide
an SQL interface for an in-house database.

 Uses a very fast thread-based memory allocation system.

 Executes very fast joins using an optimized nested-loop join.

 Implements in-memory hash tables, which are used as temporary tables.

 Implements SQL functions using a highly optimized class library that should be as fast as possible.
Usually there is no memory allocation at all after query initialization.

 Provides the server as a separate program for use in a client/server networked environment.

Data Types
 Many data types: signed/unsigned integers 1, 2, 3, 4, and 8 bytes
long, FLOAT, DOUBLE, CHAR, VARCHAR, BINARY, VARBINARY, TEXT, BLOB, DATE, TIME, DATETIME,
TIMESTAMP, YEAR, SET, ENUM, and OpenGIS spatial types.
 Fixed-length and variable-length string types.

MySQL Standards Compliance

MySQL Server relates to the ANSI/ISO SQL standards in two ways. It has
many extensions to the SQL standard, and it also lacks some functionality that the standard
defines. The MySQL manual describes these extensions and how to use them, as well as how to
work around some of the differences.

One of the main goals of the product is to continue to work toward compliance with the SQL
standard, but without sacrificing speed or reliability. The MySQL developers are not afraid to add extensions to SQL
or support for non-SQL features if this greatly increases the usability of MySQL Server for a large
segment of the user base. The HANDLER interface is an example of this strategy.
MySQL continues to support transactional and non-transactional databases to satisfy both
mission-critical 24/7 usage and heavy Web or logging usage.

MySQL Server was originally designed to work with medium-sized databases (10-100 million
rows, or about 100MB per table) on small computer systems. Today MySQL Server handles
terabyte-sized databases, but the code can also be compiled in a reduced version suitable for
hand-held and embedded devices. The compact design of the MySQL server makes development
in both directions possible without any conflicts in the source tree.

Currently, MySQL does not target real-time support, although MySQL replication capabilities offer
significant functionality.

Chapter-6&7

MySQL Queries
Commonly used MySQL queries to create a database, use a database, create a table,
insert, update, delete and select records, truncate a table and drop a table
are given below.

1) MySQL Create Database

The MySQL create database query is used to create a database. For example:

create database db1;

2) MySQL Select/Use Database

The MySQL use query is used to select a database. For example:

use db1;

3) MySQL Create Query

The MySQL create query is used to create a table, view, procedure or function. For
example:

CREATE TABLE customers
(id int(10),
name varchar(50),
city varchar(50),
PRIMARY KEY (id)
);

4) MySQL Alter Query

The MySQL alter query is used to add, modify or drop columns of a table. Here is a
query to add a column to the customers table:

ALTER TABLE customers
ADD age varchar(50);

5) MySQL Insert Query

The MySQL insert query is used to insert records into a table. The column names are listed here because the
table now also has an age column, which this record leaves empty. For example:

insert into customers (id, name, city) values (101, 'rahul', 'delhi');

6) MySQL Update Query

The MySQL update query is used to update records of a table. For example:

update customers set name='bob', city='london' where id=101;

7) MySQL Delete Query

The MySQL delete query is used to delete records of a table from the database. For example:

delete from customers where id=101;

8) MySQL Select Query

The MySQL select query is used to fetch records from the database. For example:

SELECT * from customers;

9) MySQL Truncate Table Query

The MySQL truncate table query is used to remove all records of a table. It doesn't remove
the structure. For example:

truncate table customers;

10) MySQL Drop Query

The MySQL drop query is used to drop a table, view or database. Dropping a table removes both its structure and
its data. For example:

drop table customers;
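
The queries above can be run in sequence to watch the customers table change. The following is a rough sketch rather than a MySQL session: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in, so `create database`, `use` and `truncate table` (which SQLite does not have) are left out, and the id column is declared as INTEGER.

```python
import sqlite3

# An in-memory SQLite database stands in for the MySQL database db1.
conn = sqlite3.connect(":memory:")

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(50), city VARCHAR(50))")
conn.execute("ALTER TABLE customers ADD age VARCHAR(50)")

conn.execute("INSERT INTO customers (id, name, city) VALUES (101, 'rahul', 'delhi')")
after_insert = conn.execute("SELECT * FROM customers").fetchall()

conn.execute("UPDATE customers SET name = 'bob', city = 'london' WHERE id = 101")
after_update = conn.execute("SELECT * FROM customers").fetchall()

conn.execute("DELETE FROM customers WHERE id = 101")
after_delete = conn.execute("SELECT * FROM customers").fetchall()

conn.execute("DROP TABLE customers")
tables_left = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(after_insert)
print(after_update)
print(after_delete, tables_left)
```

The age column added by ALTER TABLE shows up as an empty (NULL) fourth value in each selected row, which is why the insert statement names the three columns it fills.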

11) MySQL ORDER BY clause

The ORDER BY keyword is used to sort the result-set in ascending or descending order.
The ORDER BY keyword sorts the records in ascending order by default. To sort the records in descending order,
use the DESC keyword.

ORDER BY Syntax
SELECT column1, column2, ...
FROM table_name
ORDER BY column1, column2, ... ASC|DESC;

ORDER BY DESC Example

The following SQL statement selects all customers from the "Customers" table, sorted
DESCENDING by the "Country" column:

Example
SELECT * FROM Customers
ORDER BY Country DESC;

ORDER BY Several Columns Example

The following SQL statement selects all customers from the "Customers" table, sorted by the
"Country" and the "CustomerName" column. This means that it orders by Country, but if
some rows have the same Country, it orders them by CustomerName:

Example
SELECT * FROM Customers
ORDER BY Country, CustomerName;
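
Both orderings can be compared on a handful of rows. This sketch assumes a cut-down Customers table with only two columns and four illustrative rows, and uses SQLite through Python's sqlite3 module in place of MySQL.

```python
import sqlite3

# Cut-down, illustrative Customers table; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Country TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [
    ("Blauer See Delikatessen", "Germany"),
    ("Around the Horn", "UK"),
    ("Alfreds Futterkiste", "Germany"),
    ("Blondel père et fils", "France"),
])

# Descending by Country: only the Country values are kept for display.
countries_desc = [row[1] for row in conn.execute(
    "SELECT * FROM Customers ORDER BY Country DESC")]

# Country first, then CustomerName to order rows within the same country.
by_country_then_name = conn.execute(
    "SELECT CustomerName, Country FROM Customers ORDER BY Country, CustomerName").fetchall()

print(countries_desc)
print(by_country_then_name)
```

In the second result the two German customers stay together and are ordered by name, which is the tie-breaking behaviour described above.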

Chapter-8

MySQL AND, OR and NOT Operators

The WHERE clause can be combined with AND, OR, and NOT operators.

The AND and OR operators are used to filter records based on more than one condition:

 The AND operator displays a record if all the conditions separated by AND are TRUE.
 The OR operator displays a record if any of the conditions separated by OR is TRUE.

The NOT operator displays a record if the condition(s) is NOT TRUE.

AND Syntax
SELECT column1, column2, ...
FROM table_name
WHERE condition1 AND condition2 AND condition3 ...;

OR Syntax
SELECT column1, column2, ...
FROM table_name
WHERE condition1 OR condition2 OR condition3 ...;

NOT Syntax
SELECT column1, column2, ...
FROM table_name
WHERE NOT condition;

AND Example
The following SQL statement selects all fields from "Customers" where country is "Germany"
AND city is "Berlin":

Example

SELECT * FROM Customers
WHERE Country = 'Germany' AND City = 'Berlin';

OR Example
The following SQL statement selects all fields from "Customers" where city is "Berlin" OR
"Stuttgart":

Example
SELECT * FROM Customers
WHERE City = 'Berlin' OR City = 'Stuttgart';

NOT Example
The following SQL statement selects all fields from "Customers" where country is NOT
"Germany":

Example
SELECT * FROM Customers
WHERE NOT Country = 'Germany';
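
The three example conditions can be run against a few rows to see which customers each one keeps. The sketch below assumes a small Customers table with illustrative rows and a hypothetical helper called names; SQLite (via Python's sqlite3 module) stands in for MySQL.

```python
import sqlite3

# Illustrative rows only; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, City TEXT, Country TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    ("Alfreds Futterkiste", "Berlin", "Germany"),
    ("Blauer See Delikatessen", "Mannheim", "Germany"),
    ("Around the Horn", "London", "UK"),
    ("Blondel père et fils", "Strasbourg", "France"),
])

def names(where):
    """Return the customer names that satisfy a WHERE condition, sorted by name."""
    sql = "SELECT CustomerName FROM Customers WHERE " + where + " ORDER BY CustomerName"
    return [row[0] for row in conn.execute(sql)]

both = names("Country = 'Germany' AND City = 'Berlin'")
either = names("City = 'Berlin' OR City = 'Stuttgart'")
negated = names("NOT Country = 'Germany'")
print(both, either, negated)
```

With these rows, AND keeps only the Berlin customer, OR keeps the same customer because no row has the city Stuttgart, and NOT keeps the two customers outside Germany.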

The MySQL IN Operator

The IN operator allows you to specify multiple values in a WHERE clause.

The IN operator is a shorthand for multiple OR conditions.

IN Syntax
SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1, value2, ...);

or:

SELECT column_name(s)
FROM table_name
WHERE column_name IN (SELECT STATEMENT);

Demo Database
The table below shows a selection from the "Customers" table in the Northwind sample
database. The Country column used in the examples is not shown:

CustomerID  CustomerName                        ContactName         Address                        City
----------------------------------------------------------------------------------------------------------------
1           Alfreds Futterkiste                 Maria Anders        Obere Str. 57                  Berlin
2           Ana Trujillo Emparedados y helados  Ana Trujillo        Avda. de la Constitución 2222  México D.F.
3           Antonio Moreno Taquería             Antonio Moreno      Mataderos 2312                 México D.F.
4           Around the Horn                     Thomas Hardy        120 Hanover Sq.                London
5           Berglunds snabbköp                  Christina Berglund  Berguvsvägen 8                 Luleå
6           Blauer See Delikatessen             Hanna Moos          Forsterstr. 57                 Mannheim
7           Blondel père et fils                Frédérique Citeaux  24, place Kléber               Strasbourg

The following SQL statement selects all customers that are located in "Germany", "France" or "UK":

Example
SELECT * FROM Customers
WHERE Country IN ('Germany', 'France', 'UK');

The following SQL statement selects all customers that are NOT located in "Germany",
"France" or "UK":

Example
SELECT * FROM Customers
WHERE Country NOT IN ('Germany', 'France', 'UK');
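
That IN is shorthand for a chain of OR conditions can be confirmed by running both forms side by side. This is a sketch under assumptions: the Country values are illustrative sample data, the helper names is hypothetical, and SQLite (via Python's sqlite3 module) stands in for MySQL.

```python
import sqlite3

# Illustrative rows only; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Country TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [
    ("Alfreds Futterkiste", "Germany"),
    ("Ana Trujillo Emparedados y helados", "Mexico"),
    ("Around the Horn", "UK"),
    ("Berglunds snabbköp", "Sweden"),
    ("Blondel père et fils", "France"),
])

def names(where):
    """Return the customer names that satisfy a WHERE condition, sorted by name."""
    sql = "SELECT CustomerName FROM Customers WHERE " + where + " ORDER BY CustomerName"
    return [row[0] for row in conn.execute(sql)]

in_list = names("Country IN ('Germany', 'France', 'UK')")
as_or = names("Country = 'Germany' OR Country = 'France' OR Country = 'UK'")
not_in_list = names("Country NOT IN ('Germany', 'France', 'UK')")
print(in_list == as_or, not_in_list)
```

The IN form and the OR form return the same customers, and NOT IN returns exactly the remaining ones.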

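MySQL BETWEEN Operator

The BETWEEN operator selects values within a given range. The range is inclusive: both end values are included.

BETWEEN Syntax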
SELECT column_name(s)
FROM table_name
WHERE column_name BETWEEN value1 AND value2;

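Demo Database
Below is a selection from the "Products" table in the Northwind sample database. Only the first four
columns are shown; the examples also use the Price column:

ProductID  ProductName                   SupplierID  CategoryID
-----------------------------------------------------------------
1          Chais                         1           1
2          Chang                         1           1
3          Aniseed Syrup                 1           2
4          Chef Anton's Cajun Seasoning  1           2
5          Chef Anton's Gumbo Mix        1           2

BETWEEN Example
The following SQL statement selects all products with a price between 10 and 20:

SELECT * FROM Products
WHERE Price BETWEEN 10 AND 20;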

NOT BETWEEN Example

To display the products outside the range of the previous example, use NOT BETWEEN:

Example
SELECT * FROM Products
WHERE Price NOT BETWEEN 10 AND 20;
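
The two range queries can be checked on a small Products table. In the sketch below the prices are assumed sample values chosen so that one product sits exactly on the lower bound; SQLite (via Python's sqlite3 module) stands in for MySQL.

```python
import sqlite3

# Prices are assumed sample values; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Price REAL)")
conn.executemany("INSERT INTO Products VALUES (?, ?)", [
    ("Chais", 18),
    ("Chang", 19),
    ("Aniseed Syrup", 10),
    ("Chef Anton's Cajun Seasoning", 22),
    ("Chef Anton's Gumbo Mix", 21.35),
])

in_range = [row[0] for row in conn.execute(
    "SELECT ProductName FROM Products WHERE Price BETWEEN 10 AND 20 ORDER BY Price")]
out_of_range = [row[0] for row in conn.execute(
    "SELECT ProductName FROM Products WHERE Price NOT BETWEEN 10 AND 20 ORDER BY Price")]

print(in_range)
print(out_of_range)
```

Aniseed Syrup, priced at exactly 10, appears in the first result, which shows that the bounds of BETWEEN are included in the range.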

MySQL LIKE Operator

The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.

There are two wildcards often used in conjunction with the LIKE operator:

o The percent sign (%) represents zero, one, or multiple characters

o The underscore sign (_) represents one, single character

LIKE Syntax
SELECT column1, column2, ...
FROM table_name
WHERE columnN LIKE pattern;
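
A few patterns show how the two wildcards behave. The patterns, the one-column Customers table and the helper matching are assumptions made for this sketch, and SQLite (via Python's sqlite3 module) stands in for MySQL.

```python
import sqlite3

# Illustrative rows only; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?)", [
    ("Alfreds Futterkiste",),
    ("Ana Trujillo Emparedados y helados",),
    ("Around the Horn",),
    ("Berglunds snabbköp",),
])

def matching(pattern):
    """Return the customer names that match a LIKE pattern, sorted by name."""
    sql = "SELECT CustomerName FROM Customers WHERE CustomerName LIKE ? ORDER BY CustomerName"
    return [row[0] for row in conn.execute(sql, (pattern,))]

starts_with_a = matching("A%")     # 'A' followed by anything
second_letter_n = matching("_n%")  # any one character, then 'n', then anything
contains_the = matching("%the%")   # 'the' anywhere in the name
print(starts_with_a, second_letter_n, contains_the)
```

The underscore matches exactly one character, so "_n%" finds only the name whose second letter is n, while the percent sign lets "%the%" match text in the middle of a name.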

SQL Joins

A SQL JOIN statement is used to combine rows from two or more tables based on a
common field between them. The types of join covered here are:
o INNER JOIN
o LEFT JOIN
The examples use two tables, Student and StudentCourse, which share the ROLL_NO column.

The simplest join is INNER JOIN.

1. INNER JOIN: The INNER JOIN keyword selects the rows from both tables for which
the join condition is satisfied. The result-set is created by combining each pair of rows from
the two tables in which the value of the common field is the same.
Syntax:
Syntax:
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.

table2: Second table
matching_column: Column common to both the tables.

Note: We can also write JOIN instead of INNER JOIN. JOIN is same as INNER JOIN.

Example Queries (INNER JOIN)

This query shows the names and ages of students enrolled in different courses.

SELECT StudentCourse.COURSE_ID, [Link], [Link]
FROM Student
INNER JOIN StudentCourse
ON Student.ROLL_NO = StudentCourse.ROLL_NO;

2. LEFT JOIN: This join returns all the rows of the table on the left side of the join and the
matching rows of the table on the right side of the join. For rows that have no
matching row on the right side, the result-set contains NULL in the right-side columns. LEFT JOIN is also known as
LEFT OUTER JOIN.
Syntax:

SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.

table2: Second table
matching_column: Column common to both the tables.
Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN, both are same.

Example Queries(LEFT JOIN):

SELECT [Link],StudentCourse.COURSE_ID
FROM Student
LEFT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
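
The difference between the two joins is easiest to see when one student has no course. The sketch below is built on assumptions: the NAME and AGE columns and all the rows are invented for illustration, only ROLL_NO and COURSE_ID come from the queries above, and SQLite (via Python's sqlite3 module) stands in for MySQL.

```python
import sqlite3

# NAME, AGE and all rows are hypothetical; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ROLL_NO INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER)")
conn.execute("CREATE TABLE StudentCourse (COURSE_ID INTEGER, ROLL_NO INTEGER)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                 [(1, "Asha", 18), (2, "Ravi", 19), (3, "Meena", 20)])
conn.executemany("INSERT INTO StudentCourse VALUES (?, ?)", [(1, 1), (2, 2)])  # student 3 has no course

inner_rows = conn.execute("""
    SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE
    FROM Student
    INNER JOIN StudentCourse ON Student.ROLL_NO = StudentCourse.ROLL_NO
    ORDER BY Student.ROLL_NO""").fetchall()

left_rows = conn.execute("""
    SELECT Student.NAME, StudentCourse.COURSE_ID
    FROM Student
    LEFT JOIN StudentCourse ON StudentCourse.ROLL_NO = Student.ROLL_NO
    ORDER BY Student.ROLL_NO""").fetchall()

print(inner_rows)
print(left_rows)
```

The inner join drops the student without a course, while the left join keeps that student and fills COURSE_ID with NULL (shown as None in Python).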

Chapter-9

Aggregate functions in SQL




In database management an aggregate function is a function where the values of
multiple rows are grouped together as input on certain criteria to form a single
value of more significant meaning.
Various Aggregate Functions
1) Count()
2) Sum()
3) Avg()
4) Min()
5) Max()
Aggregate function with a example:
Id Name Salary
-----------------------
1 A 80
2 B 40
3 C 60
4 D 70
5 E 60
6 F Null

Count():

Count(*): Returns total number of records .i.e 6.

Count(salary): Return number of Non Null values over the column salary. i.e 5.
Count(Distinct Salary): Return number of distinct Non Null values over the
column salary .i.e 4

Sum():

sum(salary): Sum all Non Null values of Column salary i.e., 310
sum(Distinct salary): Sum of all distinct Non-Null values i.e., 250.
61

Avg():

Avg(salary) = Sum(salary) / Count(salary) = 310/5 = 62
Avg(Distinct salary) = Sum(Distinct salary) / Count(Distinct salary) = 250/4 = 62.5

Min() and Max():

Min(salary): Minimum value in the salary column, ignoring NULL, i.e., 40.
Max(salary): Maximum value in the salary column, ignoring NULL, i.e., 80.
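
The figures above can be checked with a short script. This is a minimal sketch using Python's built-in sqlite3 module; the table name Employee is an assumption, since the example table is not named in the text.

```python
import sqlite3

# The six rows of the example table; "Employee" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Id INTEGER, Name TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "A", 80), (2, "B", 40), (3, "C", 60), (4, "D", 70), (5, "E", 60), (6, "F", None)],
)

# Every aggregate except COUNT(*) skips the NULL salary of row 6.
row = conn.execute("""
    SELECT COUNT(*), COUNT(Salary), COUNT(DISTINCT Salary),
           SUM(Salary), SUM(DISTINCT Salary),
           AVG(Salary), AVG(DISTINCT Salary),
           MIN(Salary), MAX(Salary)
    FROM Employee
""").fetchone()

labels = ["count(*)", "count(salary)", "count(distinct salary)",
          "sum(salary)", "sum(distinct salary)",
          "avg(salary)", "avg(distinct salary)",
          "min(salary)", "max(salary)"]
for label, value in zip(labels, row):
    print(label, "=", value)
```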

SQL GROUP BY Statement

The SQL GROUP BY Statement
The GROUP BY statement groups rows that have the same values into summary rows, like
"find the number of customers in each country".

The GROUP BY statement is often used with aggregate functions
(COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result-set by one or more columns.

GROUP BY Syntax
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
ORDER BY column_name(s);

Demo Database
Below is a selection from the "Customers" table in the Northwind sample database:

CustomerID  CustomerName                        ContactName         Address
--------------------------------------------------------------------------------------------------
1           Alfreds Futterkiste                 Maria Anders        Obere Str. 57
2           Ana Trujillo Emparedados y helados  Ana Trujillo        Avda. de la Constitución 2222
3           Antonio Moreno Taquería             Antonio Moreno      Mataderos 2312
4           Around the Horn                     Thomas Hardy        120 Hanover Sq.
5           Berglunds snabbköp                  Christina Berglund  Berguvsvägen 8

SQL GROUP BY Examples

The following SQL statement lists the number of customers in each country:

Example

SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;

The following SQL statement lists the number of customers in each country, sorted high to low:

Example
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
ORDER BY COUNT(CustomerID) DESC;
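
As a rough illustration of the two statements above, the following sketch runs a GROUP BY query with Python's built-in sqlite3 module. The Country values are assumptions added for the example, because the table excerpt shown earlier does not include that column.

```python
import sqlite3

# Hypothetical Country values; only CustomerID and CustomerName come from the excerpt.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, Country TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        (1, "Alfreds Futterkiste", "Germany"),
        (2, "Ana Trujillo Emparedados y helados", "Mexico"),
        (3, "Antonio Moreno Taquería", "Mexico"),
        (4, "Around the Horn", "UK"),
        (5, "Berglunds snabbköp", "Sweden"),
    ],
)

# One summary row per country, largest group first; Country breaks ties.
per_country = conn.execute("""
    SELECT COUNT(CustomerID), Country
    FROM Customers
    GROUP BY Country
    ORDER BY COUNT(CustomerID) DESC, Country
""").fetchall()

for count, country in per_country:
    print(country, count)
```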

Demo Database
Below is a selection from the "Orders" table in the Northwind sample database:

OrderID  CustomerID  EmployeeID  OrderDate
-------------------------------------------
10248    90          5           1996-07-04
10249    81          6           1996-07-05
10250    34          4           1996-07-08

And a selection from the "Shippers" table:

ShipperID  ShipperName
----------------------------
1          Speedy Express
2          United Package
3          Federal Shipping


The SQL HAVING Clause

The HAVING clause was added to SQL because the WHERE keyword cannot be used with
aggregate functions.

HAVING Syntax
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);

Demo Database
Below is a selection from the "Customers" table in the Northwind sample database:

CustomerID  CustomerName                        ContactName         Address
--------------------------------------------------------------------------------------------------
1           Alfreds Futterkiste                 Maria Anders        Obere Str. 57
2           Ana Trujillo Emparedados y helados  Ana Trujillo        Avda. de la Constitución 2222
3           Antonio Moreno Taquería             Antonio Moreno      Mataderos 2312
4           Around the Horn                     Thomas Hardy        120 Hanover Sq.
5           Berglunds snabbköp                  Christina Berglund  Berguvsvägen 8

SQL HAVING Examples

The following SQL statement lists the number of customers in each country, including only
countries with more than 5 customers:

Example

SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
HAVING COUNT(CustomerID) > 5;

The following SQL statement lists the number of customers in each country, sorted high to
low (Only include countries with more than 5 customers):

Example
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
HAVING COUNT(CustomerID) > 5
ORDER BY COUNT(CustomerID) DESC;
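
The sketch below shows, under stated assumptions, why HAVING is needed: the aggregate condition works in HAVING, while the same condition placed in WHERE is rejected. It uses Python's built-in sqlite3 module with hypothetical sample rows, and the threshold is lowered to 1 because the sample has only five customers.

```python
import sqlite3

# Hypothetical sample rows: two customers in Mexico, one in each other country.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [(1, "Germany"), (2, "Mexico"), (3, "Mexico"), (4, "UK"), (5, "Sweden")],
)

# HAVING filters the groups after the rows have been aggregated.
having_rows = conn.execute("""
    SELECT COUNT(CustomerID), Country
    FROM Customers
    GROUP BY Country
    HAVING COUNT(CustomerID) > 1
""").fetchall()
print(having_rows)

# The same aggregate condition inside WHERE is rejected by the database.
try:
    conn.execute(
        "SELECT Country FROM Customers WHERE COUNT(CustomerID) > 1 GROUP BY Country"
    )
    where_rejected = False
except sqlite3.Error as exc:
    where_rejected = True
    print("WHERE with an aggregate failed:", exc)
```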

SQL UNION Operator

The SQL UNION Operator

The UNION operator is used to combine the result-set of two or more SELECT statements.

• Every SELECT statement within UNION must have the same number of columns
• The columns must also have similar data types
• The columns in every SELECT statement must also be in the same order

UNION Syntax
SELECT column_name(s) FROM table1
UNION
SELECT column_name(s) FROM table2;

UNION ALL Syntax

The UNION operator selects only distinct values by default. To allow duplicate values,
use UNION ALL:

SELECT column_name(s) FROM table1
UNION ALL
SELECT column_name(s) FROM table2;

Note: The column names in the result-set are usually equal to the column names in the
first SELECT statement.

Demo Database
In this tutorial we will use the well-known Northwind sample database.

Below is a selection from the "Customers" table:

CustomerID  CustomerName                        ContactName     Address
----------------------------------------------------------------------------------------------
1           Alfreds Futterkiste                 Maria Anders    Obere Str. 57
2           Ana Trujillo Emparedados y helados  Ana Trujillo    Avda. de la Constitución 2222
3           Antonio Moreno Taquería             Antonio Moreno  Mataderos 2312

And a selection from the "Suppliers" table:

SupplierID  SupplierName                ContactName       Address         City
--------------------------------------------------------------------------------
1           Exotic Liquid               Charlotte Cooper  49 Gilbert St.  London
2           New Orleans Cajun Delights  Shelley Burke     P.O. Box 78934  New O
3           Grandma Kelly's Homestead   Regina Murphy     707 Oxford Rd.  Ann Ar

SQL UNION Example

The following SQL statement returns the cities (only distinct values) from both the
"Customers" and the "Suppliers" table:

Example

SELECT City FROM Customers
UNION
SELECT City FROM Suppliers
ORDER BY City;
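
To see how UNION and UNION ALL differ, here is a small sketch using Python's built-in sqlite3 module. The City values are hypothetical sample data chosen so that one city appears in both tables.

```python
import sqlite3

# Hypothetical sample data: "London" occurs in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, City TEXT);
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, City TEXT);
    INSERT INTO Customers VALUES (1, 'Berlin'), (2, 'London');
    INSERT INTO Suppliers VALUES (1, 'London'), (2, 'Tokyo');
""")

# UNION removes duplicate rows from the combined result.
union_cities = [row[0] for row in conn.execute("""
    SELECT City FROM Customers
    UNION
    SELECT City FROM Suppliers
    ORDER BY City
""")]

# UNION ALL keeps every row, including the duplicate.
union_all_cities = [row[0] for row in conn.execute("""
    SELECT City FROM Customers
    UNION ALL
    SELECT City FROM Suppliers
    ORDER BY City
""")]

print(union_cities)
print(union_all_cities)
```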

SQL: MINUS Operator

Description
The SQL MINUS operator is used to return all rows in the first SELECT statement that are not
returned by the second SELECT statement. Each SELECT statement will define a dataset. The
MINUS operator will retrieve all records from the first dataset and then remove from the results all
records from the second dataset.

Minus Query

Explanation: The MINUS query will return the records in the blue shaded area. These are the
records that exist in Dataset1 and not in Dataset2.
Each SELECT statement within the MINUS query must have the same number of fields in the result
sets with similar data types.
TIP: The MINUS operator is not supported in all SQL databases. It can be used in databases such as
Oracle.
For databases such as SQL Server, PostgreSQL, and SQLite, use the EXCEPT operator to perform
this type of query.

Syntax
The syntax for the MINUS operator in SQL is:

SELECT expression1, expression2, ... expression_n
FROM tables
[WHERE conditions]
MINUS
SELECT expression1, expression2, ... expression_n
FROM tables
[WHERE conditions];

Parameters or Arguments
expression1, expression2, expression_n
The columns or calculations that you wish to retrieve.
tables
The tables that you wish to retrieve records from. There must be at least one table listed in the
FROM clause.
WHERE conditions
Optional. These are conditions that must be met for the records to be selected.

Note
• There must be the same number of expressions in both SELECT statements.
• The corresponding expressions must have the same data type in the SELECT statements. For
  example, expression1 must be the same data type in both the first and second SELECT
  statement.

Example - With Single Expression

The following is a SQL MINUS operator example that has one field with the same data type:

SELECT supplier_id
FROM suppliers
MINUS
SELECT supplier_id
FROM orders;

This SQL MINUS example returns all supplier_id values that are in the suppliers table and not in the
orders table. What this means is that if a supplier_id value existed in the suppliers table and also
existed in the orders table, the supplier_id value would not appear in this result set.
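
The same query can be tried on a database that does not support MINUS by using EXCEPT instead, as the tip above suggests. The sketch below uses Python's built-in sqlite3 module; the supplier_id values are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data: suppliers 2 and 4 have orders, suppliers 1 and 3 do not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (1), (2), (3), (4);
    INSERT INTO orders VALUES (100, 2), (101, 4), (102, 4);
""")

# SQLite has no MINUS keyword; EXCEPT performs the same set difference.
without_orders = [row[0] for row in conn.execute("""
    SELECT supplier_id FROM suppliers
    EXCEPT
    SELECT supplier_id FROM orders
    ORDER BY supplier_id
""")]

print(without_orders)
```

On Oracle the keyword EXCEPT in this query would be written as MINUS.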

SQL: INTERSECT Operator

Description
The SQL INTERSECT operator is used to return the results of 2 or more SELECT statements.
However, it only returns the rows selected by all queries or data sets. If a record exists in one query
and not in the other, it will be omitted from the INTERSECT results.

Intersect Query

Explanation: The INTERSECT query will return the records in the blue shaded area. These are the
records that exist in both Dataset1 and Dataset2.
Each SQL statement within the SQL INTERSECT must have the same number of fields in the result
sets with similar data types.

Syntax
The syntax for the INTERSECT operator in SQL is:

SELECT expression1, expression2, ... expression_n
FROM tables
[WHERE conditions]
INTERSECT
SELECT expression1, expression2, ... expression_n
FROM tables
[WHERE conditions];

WHERE conditions
Optional. These are conditions that must be met for the records to be selected.

Example - With Single Expression

The following is a SQL INTERSECT operator example that has one field with the same data type:

SELECT supplier_id
FROM suppliers
INTERSECT
SELECT supplier_id
FROM orders;

In this SQL INTERSECT example, if a supplier_id appeared in both the suppliers and orders table, it
would appear in your result set.
Now, let's complicate our example further by adding WHERE conditions to the INTERSECT query.

SELECT supplier_id
FROM suppliers
WHERE supplier_id > 78
INTERSECT
SELECT supplier_id
FROM orders
WHERE quantity <> 0;

In this example, the WHERE clauses have been added to each of the datasets. The first dataset has
been filtered so that only records from the suppliers table where the supplier_id is greater than 78 are
returned. The second dataset has been filtered so that only records from the orders table are
returned where the quantity is not equal to 0.

Example - With Multiple Expressions

Next, let's look at an example of how to use the INTERSECT operator in SQL to return more than one
column.
For example:

SELECT contact_id, last_name, first_name
FROM contacts
WHERE last_name <> 'Anderson'
INTERSECT
SELECT customer_id, last_name, first_name
FROM customers
WHERE customer_id < 50;

In this INTERSECT example, the query will return the records from the contacts table where
the contact_id, last_name, and first_name values match the customer_id, last_name,
and first_name value from the customers table.
There are WHERE conditions on each data set to further filter the results so that only records from
the contacts are returned where the last_name is not Anderson. The records from
the customers table are returned where the customer_id is less than 50.
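
The single-expression INTERSECT queries above can be tried out as follows. This is a sketch with Python's built-in sqlite3 module; the supplier_id and quantity values are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data for the suppliers and orders tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER, quantity INTEGER);
    INSERT INTO suppliers VALUES (77), (78), (79), (80);
    INSERT INTO orders VALUES (1, 77, 3), (2, 79, 5), (3, 80, 0);
""")

# Plain INTERSECT: supplier_id values present in both tables.
in_both = [row[0] for row in conn.execute("""
    SELECT supplier_id FROM suppliers
    INTERSECT
    SELECT supplier_id FROM orders
    ORDER BY supplier_id
""")]

# Each side is filtered by its own WHERE clause before the intersection is taken.
filtered = [row[0] for row in conn.execute("""
    SELECT supplier_id FROM suppliers WHERE supplier_id > 78
    INTERSECT
    SELECT supplier_id FROM orders WHERE quantity <> 0
    ORDER BY supplier_id
""")]

print(in_both)
print(filtered)
```

In the filtered query, supplier 77 is dropped by the first WHERE clause and supplier 80 by the second, so only 79 remains.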

Chapter-10

The SQL EXISTS Operator

The EXISTS operator is used to test for the existence of any record in a subquery.

The EXISTS operator returns TRUE if the subquery returns one or more records.

EXISTS Syntax
SELECT column_name(s)
FROM table_name
WHERE EXISTS
(SELECT column_name FROM table_name WHERE condition);

Demo Database
Below is a selection from the "Products" table in the Northwind sample database:

ProductID  ProductName                   SupplierID  Cate
----------------------------------------------------------
1          Chais                         1           1
2          Chang                         1           1
3          Aniseed Syrup                 1           2
4          Chef Anton's Cajun Seasoning  2           2
5          Chef Anton's Gumbo Mix        2           2

And a selection from the "Suppliers" table:

SupplierID  SupplierName                ContactName       Address                     City
-------------------------------------------------------------------------------------------
1           Exotic Liquid               Charlotte Cooper  49 Gilbert St.              Lond
2           New Orleans Cajun Delights  Shelley Burke     P.O. Box 78934              New
3           Grandma Kelly's Homestead   Regina Murphy     707 Oxford Rd.              Ann
4           Tokyo Traders               Yoshi Nagase      9-8 Sekimai Musashino-shi   Toky

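SQL EXISTS Examples

The following SQL statement returns TRUE and lists the suppliers with a product price less
than 20:

Example

SELECT SupplierName
FROM Suppliers
WHERE EXISTS (SELECT ProductName FROM Products WHERE Products.SupplierID =
Suppliers.SupplierID AND Price < 20);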

The following SQL statement returns TRUE and lists the suppliers with a product price equal
to 22:

Example
SELECT SupplierName
FROM Suppliers
WHERE EXISTS (SELECT ProductName FROM Products WHERE Products.SupplierID =
Suppliers.SupplierID AND Price = 22);
Output:

SupplierName
--------------------------
New Orleans Cajun Delights
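
A correlated EXISTS subquery is evaluated once per outer row, which the following sketch tries to make visible. It uses Python's built-in sqlite3 module; the Price values are assumptions chosen for the example, since the table excerpt above does not show a Price column.

```python
import sqlite3

# Hypothetical prices; supplier 3 has no products in this sample.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, SupplierName TEXT)")
conn.execute("CREATE TABLE Products (ProductName TEXT, SupplierID INTEGER, Price REAL)")
conn.executemany(
    "INSERT INTO Suppliers VALUES (?, ?)",
    [(1, "Exotic Liquid"), (2, "New Orleans Cajun Delights"), (3, "Grandma Kelly's Homestead")],
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [("Chais", 1, 18), ("Chang", 1, 19), ("Aniseed Syrup", 1, 10),
     ("Chef Anton's Cajun Seasoning", 2, 22), ("Chef Anton's Gumbo Mix", 2, 21.35)],
)

# A supplier is listed when the subquery finds at least one matching product.
QUERY = """
    SELECT SupplierName
    FROM Suppliers
    WHERE EXISTS (SELECT ProductName FROM Products
                  WHERE Products.SupplierID = Suppliers.SupplierID AND {condition})
    ORDER BY SupplierName
"""

cheap = [row[0] for row in conn.execute(QUERY.format(condition="Price < 20"))]
exactly_22 = [row[0] for row in conn.execute(QUERY.format(condition="Price = 22"))]

print(cheap)
print(exactly_22)
```

The supplier without any products never appears, because its subquery returns no rows.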

SQL NOT EXISTS Operator

The SQL NOT EXISTS operator is the opposite of the EXISTS operator. It is used to restrict the number
of rows returned by the SELECT statement.

NOT EXISTS in SQL Server checks the subquery for the existence of rows: if the subquery returns no rows,
NOT EXISTS returns TRUE; otherwise it returns FALSE. In other words, NOT EXISTS returns the opposite of
what EXISTS would return for the same subquery.
Before going into the example, I suggest you refer to the SQL Subquery article to understand subquery
design and query parsing.

SQL NOT EXISTS Syntax

The basic syntax of the NOT EXISTS in SQL Server can be written as:

SELECT [Column Names]
FROM [Source]
WHERE NOT EXISTS (Write Subquery to Check)

• Columns: It allows us to choose the columns from the tables. It may be one or more.
• Source: One or more tables present in the database. SQL JOINS are used to join multiple tables.
• Subquery: Here we have to provide the subquery. If the subquery returns no rows, NOT EXISTS
  is TRUE and the record is returned; otherwise, the record is not returned.

In this article, we will show you how to use the SQL Server NOT EXISTS operator
with examples. For this, we are going to use the data shown below.

SQL NOT EXISTS Example 1

The following query finds all the employees in the Employee table
whose [Sales] is not greater than 1000.

-- SQL Server NOT EXISTS Example
USE [SQL Tutorial]
SELECT Employ1.[EmpID]
      ,Employ1.[FirstName] + ' ' + Employ1.[LastName] AS [Full Name]
      ,Employ1.[Education]
      ,Employ1.[Occupation]
      ,Employ1.[YearlyIncome]
      ,Employ1.[Sales]
      ,Employ1.[HireDate]
FROM [Employee] AS Employ1
WHERE NOT EXISTS( SELECT * FROM [Employee] AS Employ2
                  WHERE Employ1.[EmpID] = Employ2.[EmpID]
                  AND Employ2.[Sales] > 1000 );

OUTPUT

Let me change the NOT EXISTS condition to Sales < 10000. Assuming every employee's sales are
below 10000, the subquery finds a matching row for each employee, so NOT EXISTS is FALSE
for all of them and the query returns zero records.
-- SQL Server NOT EXISTS Example
USE [SQL Tutorial]
SELECT Employ1.[EmpID]
      ,Employ1.[FirstName] + ' ' + Employ1.[LastName] AS [Full Name]
      ,Employ1.[Education]
      ,Employ1.[Occupation]
      ,Employ1.[YearlyIncome]
      ,Employ1.[Sales]
      ,Employ1.[HireDate]
FROM [Employee] AS Employ1
WHERE NOT EXISTS( SELECT * FROM [Employee] AS Employ2
                  WHERE Employ1.[EmpID] = Employ2.[EmpID]
                  AND Employ2.[Sales] < 10000 );

OUTPUT
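
The behaviour of the two NOT EXISTS queries can be reproduced on a tiny data set. The sketch below uses Python's built-in sqlite3 module instead of SQL Server, so the bracketed identifiers and the USE statement are left out; the three employee rows are hypothetical and only the EmpID, FirstName and Sales columns are modelled.

```python
import sqlite3

# Hypothetical employees with sales of 800, 2400 and 1000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, FirstName TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "John", 800), (2, "Rob", 2400), (3, "Ruben", 1000)],
)

QUERY = """
    SELECT Employ1.EmpID
    FROM Employee AS Employ1
    WHERE NOT EXISTS (SELECT * FROM Employee AS Employ2
                      WHERE Employ1.EmpID = Employ2.EmpID
                      AND {condition})
    ORDER BY Employ1.EmpID
"""

# An employee is returned when the subquery finds no row for that EmpID.
not_above_1000 = [row[0] for row in conn.execute(QUERY.format(condition="Employ2.Sales > 1000"))]

# Every sample row satisfies Sales < 10000, so nothing is returned.
none_left = [row[0] for row in conn.execute(QUERY.format(condition="Employ2.Sales < 10000"))]

print(not_above_1000)
print(none_left)
```

Note that the employee with sales of exactly 1000 is returned by the first query, because the subquery condition is strictly greater than 1000.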

Use ANY operator to view the Table's data

The following query uses the ANY operator with the equal comparison operator:

SELECT * FROM Teacher_Info WHERE Teacher_Id = ANY (SELECT Head_Id FROM
Department_Info);

This query shows the details of those teachers from the Teacher_Info table who are
also the head of a department in the Department_Info table.

The output of the above SELECT query with the equal operator is shown in the
table below:

Teacher_Id  Teacher_First_Name  Teacher_Last_Name  Teacher_Dept_Id  Teacher_Address  Teacher_City  Teacher_Salary
------------------------------------------------------------------------------------------------------------------
1005        Shivani             Singhania          4001             501 Street       Kolkata       42000
1007        Shyam               Besas              4003             202 Street       Lucknow       35000

The following query uses the ANY operator with the less than operator and a GROUP BY
clause:

SELECT * FROM Teacher_Info WHERE Teacher_Salary < ANY (SELECT AVG(Teacher_Salary)
FROM Teacher_Info GROUP BY Teacher_Dept_Id);

This query shows the details of all teachers whose salary is less than the average
salary of at least one department.

The output of the above SELECT query with the less than operator is shown in the
table below:

Teacher_Id  Teacher_First_Name  Teacher_Last_Name  Teacher_Dept_Id  Teacher_Address  Teacher_City  Teacher_Salary
------------------------------------------------------------------------------------------------------------------
1001        Arush               Sharma             4001             22 Street        New Delhi     20000
1006        Avinash             Sharma             4002             12 Street        Delhi         28000
1007        Shyam               Besas              4003             202 Street       Lucknow       35000
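
Not every database accepts the ANY keyword; SQLite, for instance, does not. The sketch below therefore swaps in equivalent forms to illustrate what the two ANY queries mean: "= ANY (subquery)" is written as "IN (subquery)", and "< ANY (subquery)" as an EXISTS test that looks for at least one larger value. It uses Python's built-in sqlite3 module. The four teacher rows are taken from the output tables above, but the contents of Department_Info are assumptions, and because the sample is only a subset of the full table its results differ from the outputs shown.

```python
import sqlite3

# Subset of Teacher_Info from the outputs above; Department_Info rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teacher_Info (Teacher_Id INTEGER PRIMARY KEY,
                               Teacher_Dept_Id INTEGER, Teacher_Salary INTEGER);
    CREATE TABLE Department_Info (Dept_Id INTEGER PRIMARY KEY, Head_Id INTEGER);
    INSERT INTO Teacher_Info VALUES
        (1001, 4001, 20000), (1005, 4001, 42000), (1006, 4002, 28000), (1007, 4003, 35000);
    INSERT INTO Department_Info VALUES (4001, 1005), (4003, 1007);
""")

# "= ANY (subquery)" has the same meaning as "IN (subquery)".
heads = [row[0] for row in conn.execute("""
    SELECT Teacher_Id FROM Teacher_Info
    WHERE Teacher_Id IN (SELECT Head_Id FROM Department_Info)
    ORDER BY Teacher_Id
""")]

# "< ANY (subquery)" is true when at least one subquery value is larger.
below_some_average = [row[0] for row in conn.execute("""
    SELECT t.Teacher_Id FROM Teacher_Info AS t
    WHERE EXISTS (
        SELECT 1
        FROM (SELECT AVG(Teacher_Salary) AS avg_salary
              FROM Teacher_Info GROUP BY Teacher_Dept_Id) AS d
        WHERE t.Teacher_Salary < d.avg_salary)
    ORDER BY t.Teacher_Id
""")]

print(heads)
print(below_some_average)
```

With this sample the department averages are 31000, 28000 and 35000, so a salary qualifies as soon as it is below the largest of them.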

ALL in SQL
ALL is an operator in SQL. This operator compares a single record to every record of
the inner query.

The syntax for using ALL operator in Structured Query Language:

1. SELECT Column_Name_1, Column_Name_2, Column_Name_3, ……, Column_Name_N FRO

RE [condition]);

In the ALL syntax, the ALL operator is followed by the SQL comparison operator, which helps com

We can use the following comparison operators with the ALL operator in the statements of SQL:

1. Equal operator (=)

This comparison operator with ALL operator evaluates to TRUE when the value of specified colum

Syntax: