Database and File System Concepts
File concept: A file is a named collection of related information. A file is defined as "a collection of related sequence of records". A file is a data structure, and the term file structure relates to the way that a specific file might be modelled.
Database concept: A database is a collection of information that is organized so that it can easily be accessed, managed, and updated.
In simple words, a database is a collection of data. A database management system is a collection of relevant data and a group of programs to access those data.
DBMS architecture: The DBMS design depends upon its architecture. The basic client/server architecture is used to deal with a large number of PCs, web servers, database servers and other components that are connected with networks. The client/server architecture consists of many PCs and a workstation which are connected via the network. DBMS architecture depends upon how users are connected to the database to get their requests done.
1-Tier Architecture: In this architecture, the database is directly available to the user. It means the user can work directly on the DBMS and use it.
2-Tier Architecture: The 2-Tier architecture is the same as basic client-server. In the two-tier architecture, applications on the client end can directly communicate with the database at the server side. For this interaction, APIs like ODBC and JDBC are used.
3-Tier Architecture: The 3-Tier architecture contains another layer between the client and server. In this architecture, the client can't directly communicate with the server.
Data Model
Data models define how the logical structure of a database is modelled. Data models are fundamental entities to introduce abstraction in a DBMS. Data models define how data is connected to each other and how they are processed and stored inside the system.
The very first data models could be flat data models, where all the data used are kept in the same plane. Earlier data models were not so scientific.
Entity-Relationship Model: The Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships among them. It is used when formulating a real-world scenario into the database model.
Relational Model: The most popular data model in DBMS is the Relational Model. It is a more scientific model than the others. This model is based on first-order predicate logic and defines a table as an n-ary relation.
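To make the idea of a table as an n-ary relation concrete, here is a minimal Python sketch. The attribute names and rows are hypothetical and chosen only for illustration; the point is that a relation is a set of tuples, each with exactly n components.

```python
# Hypothetical schema: the attribute names are assumptions for illustration.
attributes = ("roll_no", "name", "dept")

# A relation is a set of tuples, so a duplicate row collapses into one.
student = {
    (1, "Asha", "CS"),
    (2, "Ravi", "EE"),
    (1, "Asha", "CS"),  # duplicate of the first tuple
}

degree = len(attributes)    # the n in "n-ary"
cardinality = len(student)  # number of distinct tuples

# Every tuple must have exactly n components.
well_formed = all(len(row) == degree for row in student)
print(degree, cardinality, well_formed)
```

The number of attributes is usually called the degree of the relation and the number of tuples its cardinality.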
Database Administrator
The people responsible for managing databases are called Database Administrators (DBA). The DBA is the person who has complete control over the database of an enterprise or organization.
The DBA may consist of a team of people rather than just one person.
Some of the main responsibilities/functions of the DBA are:
[Link] the performance [Link] user view [Link] constraints [Link]
authorities [Link] [Link] 7. Security [Link] of dump and maintain free space
Database User
The users of a database system can be classified into various categories depending upon their interaction and degree of expertise with the DBMS. Some of them are described as follows:
1. Application Programmers: These are computer professionals. They are responsible for developing application programs.
2. Sophisticated Users (On-line Users): These people are also computer professionals, but they do not write programs. To interact with the system, they use query languages like SQL.
3. Database Designers: These users are the software professionals who are responsible for creating database-oriented applications.
4. Specialized Users: They are sophisticated users. They write specialized database applications which do not fit into the traditional data-processing framework.
5. Naive Users (Un-experienced Users): These are the users who are not aware of the presence of the database system or any other system. These users are called unsophisticated users.
6. End Users: These users are the people whose jobs require access to the database for querying, updating, generating reports, etc.
Data independence
Data independence is the ability to modify the schema without requiring the programs and applications to be rewritten. Data is separated from the programs, so that changes made to the data will not affect the program execution and the application.
The main purpose of the three levels of data abstraction is to achieve data independence. If the database changes and expands over time, it is very important that changes in one level should not affect the data at other levels of the database.
There are two levels of data independence based on the three levels of abstraction. These are as follows:
1. Physical Data Independence
2. Logical Data Independence
Aggregation
Aggregation is a process that represents a relationship between a whole object and its component parts. It abstracts a relationship between objects and views the relationship as an object. It is a process in which two entities are treated as a single entity. Aggregation is defined as "the process of extracting a relationship between the objects and seeing the abstracted relationship as an object". In other words, aggregation allows us to treat a relationship set as an entity set for the purpose of participating in other relationships. It is used when we have to model a relationship involving entity sets and a relationship set.
Schema
A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated. It formulates all the constraints that are to be applied on the data.
A database schema defines its entities and the relationships among them. It contains a descriptive detail of the database.
A database schema can be divided broadly into two categories −
Physical Database Schema − This schema pertains to the actual storage of data and its form of storage like files, indices, etc. It defines how the data will be stored in secondary storage.
Logical Database Schema − This schema defines all the logical constraints that need to be applied on the data stored. It defines tables, views, and integrity constraints.
ER Diagram
ER Diagram stands for Entity Relationship Diagram. Also known as ERD, it is a diagram that displays the relationships of entity sets stored in a database. In other words, ER diagrams help to explain the logical structure of databases. ER diagrams are created based on three basic concepts: entities, attributes and relationships. ER diagrams use rectangles to represent entities, ovals to define attributes and diamond shapes to represent relationships. At first look, an ER diagram looks very similar to a flowchart. However, an ER diagram includes many specialized symbols, and their meanings make this model unique. The purpose of an ER diagram is to represent the entity framework.
ER Model stands for Entity Relationship Model, a high-level conceptual data model. The ER Model represents real-world entities and the relationships between them. ER modeling helps you to analyze data requirements systematically to produce a well-designed database, so it is considered a best practice to complete ER modeling before implementing your database.
Basic concept of KEYS: - Keys play an important role in the relational database. A key is used to uniquely identify any record or row of data in a table. It is also used to establish and identify relationships between tables.
Types of keys: -
1. Primary Key: -The primary key refers to a column or a set of columns of a table that helps us identify all the records present in that table uniquely. A table can have just one primary key. Also, the primary key cannot have the same value repeating in any two of its rows.
2. Super Key: -A super key is any set of one or more columns that uniquely identifies all the rows present in a table. Every set of columns that can identify the rows of that table uniquely acts as a super key.
3. Candidate Key: -The candidate keys are the minimal sets of attributes that identify rows uniquely in a table. In a table, we select the primary key from the candidate keys. Thus, a candidate key has the same properties as the primary key explained above. In a table, there can be multiple candidate keys.
4. Alternate Key: -As stated above, a table can have multiple choices for the primary key, but it can only choose one. All those candidate keys that did not become the primary key are known as alternate keys.
5. Foreign Key: -We use a foreign key to establish a relationship between two tables. The foreign key requires every value present in a column/set of columns to match a value of the referenced table's primary key.
6. Composite Key: -The composite key refers to a set of multiple attributes that together help us uniquely identify every tuple present in a table. The attributes present in the set may not be unique when we consider them separately.
7. Unique Key: -A unique key refers to a column/a set of columns that identifies every record uniquely in a table. All the values in this key have to be unique. Remember that a unique key is different from a primary key: a unique key can accept a null value (how many nulls are allowed depends on the DBMS), whereas a primary key cannot.
Strong and Weak Entity Sets: -There are two types of entity sets, as explained below.
1. An entity set that does not have any key attribute of its own is called a weak entity set.
2. An entity set that has a key attribute is called a strong entity set.
The weak entity is also called a dependent entity, as it depends on another entity for its identification.
The strong entity is called an independent entity, as it does not rely on another entity for its identification.
Difference between Strong Entity Set and Weak Entity Set:
Strong Entity Set
1. An entity set that has a key attribute is called a strong entity set.
2. A member of a strong entity set is called a dominant entity.
3. It is represented by a rectangle.
4. It contains a primary key represented by an underline.
Weak Entity Set
1. An entity set which does not have any key attribute of its own is called a weak entity set.
2. A member of a weak entity set is called a subordinate entity.
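The key types described in this section can be tried out with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient test bed, which is an assumption and not something the notes prescribe; the table and column names are hypothetical. Note that SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables, used only to illustrate the key types.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,         -- primary key
    name    TEXT NOT NULL UNIQUE         -- unique key (an alternate key here)
);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL,
    course_id  INTEGER NOT NULL,
    dept_id    INTEGER REFERENCES department(dept_id),  -- foreign key
    PRIMARY KEY (student_id, course_id)  -- composite key
);
INSERT INTO department VALUES (1, 'Physics');
INSERT INTO enrollment VALUES (10, 100, 1);
""")

def rejected(sql):
    """Return True if the statement violates a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_primary = rejected("INSERT INTO department VALUES (1, 'Chemistry')")
duplicate_unique = rejected("INSERT INTO department VALUES (2, 'Physics')")
dangling_foreign = rejected("INSERT INTO enrollment VALUES (11, 100, 99)")
duplicate_composite = rejected("INSERT INTO enrollment VALUES (10, 100, 1)")
other_course = rejected("INSERT INTO enrollment VALUES (10, 101, 1)")
print(duplicate_primary, duplicate_unique, dangling_foreign,
      duplicate_composite, other_course)
```

The last insert is accepted because neither `student_id` nor `course_id` has to be unique on its own; only the combination does.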
Codd's 12 rules
Dr Edgar F. Codd, after his extensive research on the Relational Model of database systems, came up with twelve rules of his own which, according to him, a database must obey in order to be regarded as a true relational database.
These rules can be applied to any database system that manages stored data using only its relational capabilities. This is a foundation rule, which acts as a base for all the other rules.
Rule 1: Information Rule: -The data stored in a database, be it user data or metadata, must be a value of some table cell. Everything in a database must be stored in a table format.
Rule 2: Guaranteed Access Rule: -Every single data element (value) is guaranteed to be accessible logically with a combination of table-name, primary-key (row value), and attribute-name (column value). No other means, such as pointers, can be used to access data.
Rule 3: Systematic Treatment of NULL Values: -The NULL values in a database must be given a systematic and uniform treatment. This is a very important rule because a NULL can be interpreted as one of the following − data is missing, data is not known, or data is not applicable.
Rule 4: Active Online Catalog: -The structure description of the entire database must be stored in an online catalog, known as the data dictionary, which can be accessed by authorized users. Users can use the same query language to access the catalog which they use to access the database itself.
Rule 5: Comprehensive Data Sub-Language Rule: -A database can only be accessed using a language having linear syntax that supports data definition, data manipulation, and transaction management operations. This language can be used directly or by means of some application. If the database allows access to data without any help of this language, then it is considered a violation.
Rule 6: View Updating Rule: -All the views of a database which can theoretically be updated must also be updatable by the system.
Rule 7: High-Level Insert, Update, and Delete Rule: -A database must support high-level insertion, updating, and deletion. This must not be limited to a single row, that is, it must also support union, intersection and minus operations to yield sets of data records.
Rule 8: Physical Data Independence: -The data stored in a database must be independent of the applications that access the database. Any change in the physical structure of a database must not have any impact on how the data is being accessed by external applications.
Rule 9: Logical Data Independence: -The logical data in a database must be independent of its user's view (application). Any change in logical data must not affect the applications using it. For example, if two tables are merged or one is split into two different tables, there should be no impact or change on the user application. This is one of the most difficult rules to apply.
Rule 10: Integrity Independence: -A database must be independent of the application that uses it. All its integrity constraints can be independently modified without the need of any change in the application. This rule makes a database independent of the front-end application and its interface.
Rule 11: Distribution Independence: -The end-user must not be able to see that the data is distributed over various locations. Users should always get the impression that the data is located at one site only. This rule has been regarded as the foundation of distributed database systems.
Rule 12: Non-Subversion Rule: -If a system has an interface that provides access to low-level records, then the interface must not be able to subvert the system and bypass security and integrity constraints.
Integrity Constraints
Integrity constraints are a set of rules used to maintain the quality of information. Integrity constraints ensure that data insertion, updating, and other processes are performed in such a way that data integrity is not affected. Thus, integrity constraints are used to guard against accidental damage to the database.
Types of Integrity Constraint
1. Domain constraints:- Domain constraints can be defined as the definition of a valid set of values for an attribute. The data type of a domain includes string, character, integer, time, date, currency, etc. The value of the attribute must be available in the corresponding domain.
2. Entity integrity constraints:-The entity integrity constraint states that a primary key value can't be null. This is because the primary key value is used to identify individual rows in a relation, and if the primary key has a null value, then we can't identify those rows. A table can contain null values in fields other than the primary key field.
3. Referential Integrity Constraints:-A referential integrity constraint is specified between two tables. If a foreign key in Table 1 refers to the Primary Key of Table 2, then every value of the Foreign Key in Table 1 must be null or be available in Table 2.
4. Key constraints:-Keys are the attribute sets that are used to identify an entity within its entity set uniquely. An entity set can have multiple keys, out of which one key will be the primary key. A primary key must contain unique values and cannot contain a null value in the relational table.
Domain Constraints
Domain constraints are user-defined rules on columns that help the user to enter values according to the data type. If the DBMS encounters a wrong input, it gives a message to the user that the column is not filled properly. In other words, a domain specifies all the possible values that the attribute can hold, like integer, character, date, time, string, etc. It defines the domain or the set of values for an attribute and ensures that the value taken by the attribute is an atomic value (one that can't be divided) from its domain.
Types of relationship
One to one - A one to one relationship in a database management system represents a unique connection between two tables where each record appears only once in both tables. This type of relationship can be seen in real-world scenarios such as an employee and their assigned workstation.
One to many - When each entry in one table may be linked to one or more records in the other table, this is known as a one to many relationship.
Many to one - When one or more entries in one table may be linked to each record in the other table, this is known as a many to one relationship. Along with the one to many relationship, this is one of the most common types of relationship.
Many to many - A many to many relationship exists when one or more items in one table can have a relationship to one or more items in another table.
Relational Calculus:- It is a formal declarative query language. In relational algebra, we have to specify what data is to be retrieved and also how to retrieve it, but in relational calculus we only need to specify what data needs to be retrieved, without specifying how to retrieve it. Relational calculus is thus an alternate way of formulating queries. Relational calculus is a non-procedural query language. In a non-procedural query language, the user is not concerned with the details of how to obtain the end results. The relational calculus tells what to do but never explains how to do it. Most commercial relational languages are based on aspects of relational calculus, including SQL, QBE and QUEL.
Types of Relational calculus: -
1. Tuple Relational Calculus (TRC): It is a non-procedural query language which is based on finding a number of tuple variables, also known as range variables, for which a predicate holds true. It describes the desired information without giving a specific procedure for obtaining that information. The tuple relational calculus is specified to select the tuples in a relation. In TRC, the filtering variable ranges over the tuples of a relation. The result of the relation can have one or more tuples.
Notation: A query in the tuple relational calculus is expressed in the following notation
{T | P (T)} or {T | Condition (T)}
where T is the resulting tuples and P(T) is the condition used to fetch T.
2. Domain Relational Calculus (DRC):-The second form of relational calculus is known as domain relational calculus. In domain relational calculus, the filtering variable ranges over the domains of attributes. Domain relational calculus uses the same operators as tuple calculus. It uses the logical connectives ∧ (and), ∨ (or) and ¬ (not). It uses the Existential (∃) and Universal (∀) quantifiers to bind the variables. QBE, or Query by Example, is a query language related to domain relational calculus.
Notation: { a1, a2, a3, ..., an | P (a1, a2, a3, ..., an)}
where a1, a2, ..., an are attributes and P stands for a formula built by inner attributes.
Introduction of Relational Algebra in DBMS
Relational algebra is a procedural query language which takes relations as input and generates a relation as output. Relational algebra mainly provides the theoretical foundation for relational databases and SQL.
Operators in Relational Algebra
Projection (π): Projection is used to project required column data from a relation.
Selection (σ): Selection is used to select required tuples of a relation. For example, for a relation R with an attribute c, σ (c>3) R will select the tuples which have c greater than 3.
Union (U): The union operation in relational algebra is the same as the union operation in set theory. The only constraint is that, for the union of two relations, both relations must have the same set of attributes.
Set Difference (-): Set difference in relational algebra is the same as the set difference operation in set theory, with the constraint that both relations should have the same set of attributes.
Rename (ρ): Rename is a unary operation used for renaming attributes of a relation. ρ (a/b) R will rename the attribute 'b' of the relation to 'a'.
Natural Join (⋈): Natural join is a binary operator. The natural join between two or more relations results in the set of all combinations of tuples that have equal values on their common attributes.
Conditional Join: Conditional join works similarly to natural join. In natural join, the default condition is equality between common attributes, while in conditional join we can specify any condition such as greater than, less than, or not equal.
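The relational algebra operators described in this section (projection, selection, union, set difference, rename and natural join) can be sketched in a few lines of Python. This is only an illustrative model, not how a DBMS implements them: a relation is assumed to be a list of dictionaries, and the function names and sample relations `R` and `S` are hypothetical.

```python
def dedupe(rows):
    """Relations are sets, so drop duplicate tuples (keeping first-seen order)."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

def select(rel, pred):            # sigma: keep tuples satisfying the predicate
    return [row for row in rel if pred(row)]

def project(rel, attrs):          # pi: keep only the listed attributes
    return dedupe([{a: row[a] for a in attrs} for row in rel])

def union(r, s):                  # both relations must share the same attributes
    return dedupe(r + s)

def difference(r, s):             # tuples of r that do not appear in s
    return [row for row in r if row not in s]

def rename(rel, old, new):        # rho: rename attribute old to new
    return [{(new if k == old else k): v for k, v in row.items()} for row in rel]

def natural_join(r, s):           # match tuples on all common attributes
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    return dedupe([{**a, **b} for a in r for b in s
                   if all(a[c] == b[c] for c in common)])

# Hypothetical sample relations.
R = [{"a": 1, "c": 2}, {"a": 2, "c": 5}, {"a": 3, "c": 5}]
S = [{"c": 5, "d": "x"}]

selected = select(R, lambda row: row["c"] > 3)   # sigma (c>3) R
projected = project(R, ["c"])                    # pi (c) R
renamed = rename(R, "c", "b")                    # rho (b/c) R
joined = natural_join(R, S)                      # R natural-join S
```

Here `projected` holds two tuples rather than three, because projection removes the duplicate value of `c`.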
Basic structure
SQL stands for Structured Query Language. It is the database language by the use of which we can perform certain operations on the existing database. We can also use this language to create a database. SQL uses certain commands like CREATE, DROP, INSERT, etc. to carry out the required tasks.
These SQL commands are mainly categorized into five categories:
1. DDL – Data Definition Language
2. DML – Data Manipulation Language
3. DCL – Data Control Language
4. TCL – Transaction Control Language
5. DQL – Data Query Language
1. DDL:-
DDL stands for Data Definition Language.
DDL consists of the SQL commands that can be used to define the database schema.
It simply deals with descriptions of the database schema and is used to create and modify the structure of database objects in the database.
Examples of DDL commands:
CREATE – is used to create the database or its objects.
DROP – is used to delete objects from the database.
ALTER – is used to alter the structure of the database.
TRUNCATE – is used to remove all records from a table.
RENAME – is used to rename an object existing in the database.
2. DML:-
DML stands for Data Manipulation Language.
DML consists of the SQL commands that deal with the manipulation of data present in the database.
Examples of DML:
INSERT – is used to insert data into a table.
UPDATE – is used to update existing data within a table.
DELETE – is used to delete records from a database table.
SELECT – is used to retrieve data from the database.
3. DCL:-
DCL stands for Data Control Language.
DCL includes commands such as GRANT and REVOKE which mainly deal with the rights, permissions and other controls of the database system.
Examples of DCL commands:
GRANT – gives users access privileges to the database.
REVOKE – withdraws users' access privileges given by using the GRANT command.
4. TCL:-
TCL stands for Transaction Control Language.
TCL commands deal with the transactions within the database.
Examples of TCL commands:
COMMIT – commits a transaction.
ROLLBACK – rolls back a transaction in case any error occurs.
SAVEPOINT – sets a savepoint within a transaction.
SET TRANSACTION – specifies characteristics for the transaction.
5. Data Query Language
DQL is used to fetch the data from the database.
It uses only one command:-
a. SELECT: This is the same as the projection operation of relational algebra. It is used to select the attributes based on the condition described by the WHERE clause.
Aggregate functions:-
An SQL aggregation function is used to perform a calculation on multiple rows of a single column of a table. It returns a single value. It is also used to summarize the data.
Types of SQL Aggregation Function
1. COUNT Function:-
A. The COUNT function is used to count the number of rows in a database table. It can work on both numeric and non-numeric data types.
B. COUNT(*) returns the count of all the rows in a specified table. COUNT(*) also counts duplicate and NULL rows.
Syntax:-
COUNT(*)
COUNT( [ALL|DISTINCT] expression )
2. SUM Function:-
The SUM function is used to calculate the sum of the values in the selected column. It works on numeric fields only.
Syntax:-
SUM()
SUM( [ALL|DISTINCT] expression )
3. AVG Function:-
The AVG function is used to calculate the average value of a numeric column. The AVG function returns the average of all non-NULL values.
Syntax:-
AVG()
AVG( [ALL|DISTINCT] expression )
4. MAX Function:-
The MAX function is used to find the maximum value of a certain column. This function determines the largest value of all selected values of a column.
Syntax:-
MAX()
MAX( [ALL|DISTINCT] expression )
5. MIN Function:-
The MIN function is used to find the minimum value of a certain column. This function determines the smallest value of all selected values of a column.
Syntax:-
MIN()
MIN( [ALL|DISTINCT] expression )
Group by and having clause
1. GROUP BY:-
A. The GROUP BY statement is used to arrange identical data into groups. The GROUP BY statement is used with the SQL SELECT statement.
B. The GROUP BY statement follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause.
C. The GROUP BY statement is used with aggregation functions.
Syntax:-
SELECT column
FROM table_name
WHERE conditions
GROUP BY column
ORDER BY column;
2. HAVING:-
A. The HAVING clause is used to specify a search condition for a group or an aggregate.
B. HAVING is used together with the GROUP BY clause. If you are not using a GROUP BY clause, HAVING behaves like a WHERE clause.
Syntax:-
SELECT column1, column2
FROM table_name
WHERE conditions
GROUP BY column1, column2
HAVING conditions
ORDER BY column1, column2;
3. ORDER BY:-
A. The ORDER BY clause sorts the result-set in ascending or descending order.
B. It sorts the records in ascending order by default. The DESC keyword is used to sort the records in descending order.
Syntax:-
SELECT column1, column2
FROM table_name
WHERE condition
ORDER BY column1, column2... ASC|DESC;
NULL Value
A field with a NULL value is a field with no value. If a field in a table is optional, it is possible to insert a new record or update a record without adding a value to this field. Then, the field will be saved with a NULL value.
NULL Syntax
SELECT column_names
FROM table_name
WHERE column_name IS NULL;
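The aggregate functions, the GROUP BY, HAVING and ORDER BY clauses, and the IS NULL test described in this section can be seen working together in a small sketch. It runs the SQL through Python's built-in sqlite3 module, which is an assumption made only so the example is runnable; the `employee` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data; one salary is deliberately left NULL.
conn.executescript("""
CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER);
INSERT INTO employee VALUES
    ('Asha',  'CS', 100),
    ('Ravi',  'CS', 300),
    ('Meena', 'EE', 200),
    ('John',  'EE', NULL),
    ('Sara',  'ME', 50);
""")

# COUNT(*) counts every row; the other aggregates skip NULL salaries.
totals = conn.execute("""
    SELECT COUNT(*), COUNT(salary), SUM(salary),
           AVG(salary), MAX(salary), MIN(salary)
    FROM employee
""").fetchone()

# WHERE filters rows, GROUP BY forms groups, HAVING filters groups,
# and ORDER BY sorts the final result.
grouped = conn.execute("""
    SELECT dept, COUNT(*) AS n, SUM(salary) AS total
    FROM employee
    WHERE salary IS NOT NULL
    GROUP BY dept
    HAVING SUM(salary) > 100
    ORDER BY total DESC
""").fetchall()

# NULL must be tested with IS NULL, not with "= NULL".
missing = conn.execute(
    "SELECT name FROM employee WHERE salary IS NULL"
).fetchall()

print(totals, grouped, missing)
```

Note the difference between `COUNT(*)` and `COUNT(salary)`: the first counts all five rows, the second only the four rows whose salary is not NULL.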
String functions
LTRIM(): This function is used to remove leading spaces from the left side of the given string.
Syntax:-SELECT LTRIM('  hello');
RTRIM(): This function is used to remove trailing spaces from the right side of the given string.
Syntax:-SELECT RTRIM('hello  ');
LOWER(): This function is used to convert an upper case string into lower case.
Syntax:-SELECT LOWER(city) FROM Employee;
LPAD(): This function is used to make the given string of the given size by adding the given symbol on the left.
Syntax:-SELECT LPAD('hello', 8, '*');
RPAD(): This function is used to make the given string as long as the given size by adding the given symbol on the right.
Syntax:-SELECT RPAD('Hello', 8, '*');
LENGTH(): This function is used to find the length of a word.
Syntax:-SELECT LENGTH('hello');
SUBSTR(): This function is used to extract a substring from a string, starting at the given position.
Syntax:-SELECT SUBSTR(Address, 1, 3) FROM Employee;
CONCAT(): This function is used to join two or more words or strings.
Syntax:-SELECT CONCAT('my', 's', 'ql');
REVERSE(): This function is used to reverse a string.
Syntax:-SELECT REVERSE('mysql');
INSERT(): This function is used to insert a substring into a string at the given position, replacing the given number of characters.
Syntax:-SELECT INSERT('Quadratic', 3, 4, 'What');
Project Operation: This operation shows the list of those attributes that we wish to appear in the result. The rest of the attributes are eliminated from the table. It is denoted by π.
Union Operation: Suppose there are two relations R and S. The union operation contains all the tuples that are either in R or S or both in R & S. It eliminates the duplicate tuples. It is denoted by ∪.
Triggers
What is PL/SQL Trigger:-A trigger is a PL/SQL block structure which is fired when a DML statement like INSERT, DELETE or UPDATE is executed on a database table. A trigger is triggered automatically when an associated DML statement is executed.
A trigger is invoked by the Oracle engine automatically whenever a specified event occurs. A trigger is stored in the database and invoked repeatedly, whenever the specific condition matches. Triggers are stored programs which are automatically executed or fired when some events occur.
Types of Triggers:
1. Row Level Trigger:-It is executed when a triggering statement is issued and the trigger restriction evaluates to TRUE. A row trigger is fired each time the table is affected by the triggering statement. An event is triggered for each row updated, inserted or deleted.
2. Statement Trigger:- A statement trigger is fired once on behalf of the triggering statement, regardless of the number of rows in the table that the triggering statement affects, even if no rows are affected.
3. BEFORE and AFTER Trigger:-When defining a trigger, you can specify the trigger timing, that is, whether the trigger action is to be run before or after the triggering statement. BEFORE and AFTER apply to both statement and row triggers.
4. Combination Trigger:-Combination triggers are combinations of two trigger types. (1) Before Statement Trigger: The trigger fires only once for each statement, before the triggering statement.
Set operations:-
The SQL set operations are used to combine two or more SQL SELECT statements.
There are four types of set operation.
1. Union:-
A. The SQL Union operation is used to combine the result of two or more SQL SELECT queries.
B. In the union operation, the number of columns and their datatypes must be the same in both the tables on which the UNION operation is being applied.
C. The union operation eliminates the duplicate rows from its result set.
Syntax:-
SELECT column_name FROM table1
UNION
SELECT column_name FROM table2;
2. Union All:-
The Union All operation is like the Union operation, but it returns the set without removing duplicates and without sorting the data.
Syntax:-
SELECT column_name FROM table1
UNION ALL
SELECT column_name FROM table2;
3. Intersect:-
A. It is used to combine two SELECT statements. The Intersect operation returns the common rows from both the SELECT statements.
B. In the Intersect operation, the number of columns and their datatypes must be the same.
Syntax:-
SELECT column_name FROM table1
INTERSECT
SELECT column_name FROM table2;
4. Minus:-
A. It combines the result of two SELECT statements. The Minus operator is used to display the rows which are present in the first query but absent in the second query.
B. It has no duplicates, and the data is arranged in ascending order by default.
Syntax:-
SELECT column_name FROM table1
MINUS
SELECT column_name FROM table2;
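The four SQL set operations described in this section can be compared side by side with a short sketch. It uses Python's built-in sqlite3 module as a stand-in database, which is an assumption for the sake of a runnable example; SQLite spells the Minus operation as EXCEPT (MINUS is the Oracle keyword), and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data: the value 3 appears in both tables.
conn.executescript("""
CREATE TABLE table1 (column_name INTEGER);
CREATE TABLE table2 (column_name INTEGER);
INSERT INTO table1 VALUES (1), (2), (3);
INSERT INTO table2 VALUES (3), (4);
""")

def run(op):
    """Combine the two SELECT statements with the given set operator."""
    sql = f"SELECT column_name FROM table1 {op} SELECT column_name FROM table2"
    return [row[0] for row in conn.execute(sql)]

# Results are sorted in Python so the comparison does not rely on
# any ordering the database happens to produce.
union_rows = sorted(run("UNION"))          # duplicates removed
union_all_rows = sorted(run("UNION ALL"))  # duplicates kept
intersect_rows = sorted(run("INTERSECT"))  # rows common to both
except_rows = sorted(run("EXCEPT"))        # SQLite's spelling of MINUS

print(union_rows, union_all_rows, intersect_rows, except_rows)
```

The only difference between the first two results is the repeated value 3, which UNION removes and UNION ALL keeps.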
Join concepts-
Join in DBMS is a binary operation which allows you to combine a product and a selection in one single statement. The goal of creating a join condition is that it helps you to combine the data from two or more DBMS tables. The tables in DBMS are associated using primary keys and foreign keys.
There are mainly two types of joins in DBMS:
Inner Joins: Theta, Natural, EQUI
Outer Join: Left, Right, Full
Inner Join:-is used to return rows from both tables which satisfy the given condition. It is the most widely used join operation and can be considered the default join type.
An inner join or equijoin is a comparator-based join which uses equality comparisons in the join predicate. However, if you use other comparison operators like ">", it can't be called an equijoin.
Inner Join is further divided into three subtypes:
EQUI Join:-is done when a Theta join uses only the equivalence condition. EQUI join is often described as one of the most difficult operations to implement efficiently in an RDBMS, and one reason why RDBMSs can have performance problems.
Outer Join
Outer Join:-doesn't require each record in the two joined tables to have a matching record. In this type of join, the table retains each record even if no other matching record exists.
PL/SQL Cursor:-
When an SQL statement is processed, Oracle creates a memory area known as the context area. A cursor is a pointer to this context area. It contains all the information needed for processing the statement. In PL/SQL, the context area is controlled by the cursor. A cursor contains information on a SELECT statement and the rows of data accessed by it. A cursor is used by a program to fetch and process the rows returned by the SQL statement, one at a time. There are two types of cursors:
1) PL/SQL Implicit Cursors:-The implicit cursors are automatically generated by Oracle while an SQL statement is executed, if you don't use an explicit cursor for the statement. These are created by default to process the statements when DML statements like INSERT, UPDATE, DELETE etc. are executed.
2) PL/SQL Explicit Cursors:-The explicit cursors are defined by the programmers to gain more control over the context area. These cursors should be defined in the declaration section of the PL/SQL block.
Syntax of explicit cursor:-Following is the syntax to create an explicit cursor:
1) Declare the cursor:-It defines the cursor with a name and the associated SELECT statement.
SYNTAX:-CURSOR name IS SELECT statement;
2) Open the cursor:-It is used to allocate memory for the cursor and make it easy to fetch the rows returned by the SQL statement into it.
Syntax for cursor open:-OPEN cursor_name;
Views in SQL:-
Views in SQL are considered as virtual tables. A view also contains rows and columns. To create the view, we can select the fields from one or more tables present in the database. A view can either have specific rows based on a certain condition or all the rows of a table.
CREATING VIEWS:-
We can create a view using the CREATE VIEW statement. A view can be created from a single table or multiple tables.
Syntax:
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE condition;
DELETING VIEWS:-
We have learned about creating a view, but what if a created view is not needed any more? Obviously we will want to delete it. SQL allows us to delete an existing view.
Syntax:
DROP VIEW view_name;
view_name: Name of the view which we want to delete.
UPDATING VIEWS:-
There are certain conditions that need to be satisfied to update a view. If any one of these conditions is not met, then we will not be allowed to update the view.
The SELECT statement which is used to create the view should not include a GROUP BY clause or an ORDER BY clause.
Three types of Outer Joins are:
Left Outer Join returns all the rows from the table on the left even if no matching rows have been found in the table on the right. When no matching record is found in the table on the right, NULL is returned.
Right Outer Join returns all the columns from the table on the right even if no 3) Fetch the cursor::-It is used to access one row at a time. You can The SELECT statement should not have the DISTINCT keyword.
matching rows have been found in the table on the left. Where no matches have been fetch rows from the above-opened cursor as follows: Syntax:-CREATE OR REPLACE VIEW view_name AS
found in the table on the left, NULL is returned. RIGHT outer JOIN is the opposite of Syntax for cursor fetch:-ETCH cursor_name INTO variable_list; SELECT column1,coulmn2,..
LEFT JOIN FROM table_name WHERE condition;
4) Close the cursor::-It is used to release the allocated memory. The
Full Outer Join , all tuples from both relations are included in the result, irrespective of following syntax is used to close the above-opened cursors.
the matching condition Syntax for cursor close:-Close cursor_name;
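The join and view syntax in this section can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle, and the dept and emp tables, their columns and the sales_staff view are hypothetical names chosen only for illustration.

```python
import sqlite3

# Hypothetical sample schema: emp.dept_id is a foreign key to dept.dept_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE emp  (emp_id INTEGER PRIMARY KEY, emp_name TEXT,
                       dept_id INTEGER REFERENCES dept(dept_id));
    INSERT INTO dept VALUES (1, 'Sales'), (2, 'HR'), (3, 'Audit');
    INSERT INTO emp  VALUES (10, 'Asha', 1), (11, 'Ravi', 1), (12, 'Meena', 2);
""")

# Inner (equi) join: only departments that have a matching employee row.
inner_rows = conn.execute("""
    SELECT d.dept_name, e.emp_name
    FROM dept d INNER JOIN emp e ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Left outer join: every department is kept; unmatched ones get NULL (None).
left_rows = conn.execute("""
    SELECT d.dept_name, e.emp_name
    FROM dept d LEFT OUTER JOIN emp e ON d.dept_id = e.dept_id
    ORDER BY d.dept_id, e.emp_id
""").fetchall()

# A view stores the query, not the rows, and is read like a table.
conn.execute("""
    CREATE VIEW sales_staff AS
    SELECT e.emp_name
    FROM emp e JOIN dept d ON d.dept_id = e.dept_id
    WHERE d.dept_name = 'Sales'
""")
view_rows = conn.execute(
    "SELECT emp_name FROM sales_staff ORDER BY emp_name").fetchall()

# The view is removed once it is no longer needed.
conn.execute("DROP VIEW sales_staff")
```

The Audit department has no employees, so it is missing from inner_rows but appears in left_rows with None in place of the employee name.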
Functions
A function is a logically grouped set of SQL and PL/SQL statements that perform a specific task. A function is made up of a declarative part, an executable part and an optional exception-handling part. The declarative part consists of declarations of variables. The executable part consists of the logic, i.e. SQL statements, and the exception-handling part handles any error during run-time. A function is a standalone PL/SQL subprogram that returns a single value. Like PL/SQL procedures, functions have a unique name. The significant difference is that a function is a PL/SQL block that returns a single value. Functions can accept one, many, or no parameters, but a function must have a return clause in the executable section of the function. The advantages of functions are listed below:
• Functions are standalone blocks that are mainly used for calculation purposes.
• The data type of the return value must be declared in the header of the function.
• A function has output that needs to be assigned to a variable, or it can be used in a SELECT statement.
Non-Equi Join
NON EQUI JOIN performs a JOIN using a comparison operator other than the equal (=) sign, such as >, <, >=, <=, with conditions.
Syntax:
SELECT *
FROM table_name1, table_name2
WHERE table_name1.column [> | < | >= | <= ] table_name2.column;
What is Partial Dependency?
Partial dependency occurs when a non-prime attribute is functionally dependent on part of a candidate key.
Trivial Functional Dependency
In a trivial functional dependency, the dependent is always a subset of the determinant.
Non-trivial Functional Dependency
In a non-trivial functional dependency, the dependent is not a subset of the determinant.
Multivalued Functional Dependency
In a multivalued functional dependency, the entities of the dependent set are not dependent on each other, i.e. if a → {b, c} and there exists no functional dependency between b and c, then it is called a multivalued functional dependency.

1NF, 2NF, 3NF, BCNF, 4NF, 5NF
First Normal Form (1NF):-
A relation is in 1NF if it contains only atomic values. An attribute of a table cannot hold multiple values; it must hold only a single value. First normal form disallows multi-valued attributes, composite attributes, and their combinations.
Second Normal Form (2NF):-
For 2NF, the relation must be in 1NF. In the second normal form, all non-key attributes are fully functionally dependent on the primary key.
Third Normal Form (3NF):-
A relation is in 3NF if it is in 2NF and does not contain any transitive dependency. 3NF is used to reduce data duplication and to achieve data integrity. If there is no transitive dependency for non-prime attributes, the relation is in third normal form. A relation is in third normal form if at least one of the following conditions holds for every non-trivial functional dependency X → Y: X is a super key, or Y is a prime attribute.
Fourth normal form (4NF):-
A relation is in 4NF if it is in Boyce Codd normal form and has no multi-valued dependency. For a dependency A → B, if multiple values of B exist for a single value of A, the dependency is multi-valued.
Fifth normal form (5NF):-
A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining is lossless. 5NF is satisfied when all the tables are broken into as many tables as possible in order to avoid redundancy.
What is Normalization?
Normalization is the process of organizing the data in the database. It is used to minimize redundancy in a relation or set of relations, and to eliminate undesirable characteristics such as insertion, update, and deletion anomalies. Normalization divides a larger table into smaller tables and links them using relationships. The normal forms are used to reduce redundancy in the database tables.
Prime attributes:-
Attributes that are part of some candidate key of the table are called prime attributes.
Non-prime attributes:-
Attributes that do not appear in any of the possible candidate keys of the table are called non-prime attributes.

Boyce Codd normal form (BCNF):-
BCNF is the advanced version of 3NF and is stricter than 3NF. A table is in BCNF if, for every functional dependency X → Y, X is a super key of the table. For BCNF, the table should be in 3NF, and for every FD the LHS must be a super key.
SELF JOIN:-
The SQL SELF JOIN is used to join a table to itself as if the table were two tables, temporarily renaming at least one table in the SQL statement.
Syntax:
The basic syntax of SELF JOIN is as follows:
SELECT a.column_name, b.column_name...
FROM table1 a, table1 b
WHERE a.common_field = b.common_field;
Transaction concept
A transaction is a set of logically related operations. It contains a group of tasks. A transaction is an action or series of actions performed by a single user to access the contents of the database.
Operations of Transaction:
Following are the main operations of a transaction:
Read(X): The read operation reads the value of X from the database and stores it in a buffer in main memory.
Write(X): The write operation writes the value back to the database from the buffer.
Functional dependency
A functional dependency is a relationship that exists between two attributes. It typically exists between the primary key and a non-key attribute within a table.
X → Y
The left side of the FD is known as the determinant, and the right side is known as the dependent.
States of Transaction
These are the states through which a transaction passes during its lifetime. They describe the current state of the transaction and how it will be processed further. These states govern the rules which decide the fate of the transaction, whether it will commit or abort.
The different types of transaction states are:
1. Active State:-When the instructions of the transaction are running, the transaction is in the active state. If all the read and write operations are performed without any error, it goes to the “partially committed state”; if any instruction fails, it goes to the “failed state”.
2. Partially Committed:-After completion of all the read and write operations, the changes are made in main memory or the local buffer. If the changes are made permanent on the database, the transaction moves to the committed state.
3. Failed State:-When any instruction of the transaction fails, or if a failure occurs while making a permanent change of data on the database, the transaction goes to the “failed state”.
4. Aborted State:-After any type of failure, the transaction goes from the “failed state” to the “aborted state”. Since the changes in the previous states were made only in main memory or the local buffer, they are rolled back.
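The transaction states described in this section can be observed with a short sketch. It uses Python's built-in sqlite3 module. The account table, the transfer helper and the CHECK constraint that forbids negative balances are assumptions made for the example, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; True if committed."""
    try:
        # Active state: the write operations of the transaction are running.
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        conn.commit()      # changes made permanent: committed state
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # failed, then aborted: the earlier write is undone
        return False


def balances(conn):
    """Current balances as a dict, e.g. {'A': 100, 'B': 50}."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

Calling transfer(conn, "A", "B", 500) credits B first and then fails on the debit of A, so the rollback removes the credit as well and the balances stay unchanged.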
ACID properties
• It is important to ensure that the database remains consistent before and after a transaction.
• To ensure the consistency of the database, certain properties are followed by all the transactions occurring in the system.
• These properties are called the ACID properties of a transaction.
1. Atomicity-This property ensures that either the transaction occurs completely or it does not occur at all. In other words, no transaction occurs partially. That is why it is also referred to as the “all or nothing rule”. It is the responsibility of the Transaction Control Manager to ensure atomicity of the transactions.
2. Consistency-This property ensures that integrity constraints are maintained. In other words, the database remains consistent before and after the transaction. It is the responsibility of the DBMS and the application programmer to ensure consistency of the database.
3. Isolation-This property ensures that multiple transactions can occur simultaneously without causing any inconsistency. During execution, each transaction behaves as if it were executing alone in the system.
4. Durability-This property ensures that all the changes made by a transaction after its successful execution are written to the disk. These changes exist permanently and are never lost, even if a failure of any kind occurs. It is the responsibility of the recovery manager to ensure durability in the database.

Lock based protocols
In this type of protocol, a transaction cannot read or write data until it acquires an appropriate lock on it. There are two types of lock:
1. Shared lock:-It is also known as a read-only lock. Under a shared lock, the data item can only be read by the transaction. The lock can be shared between transactions because a transaction that holds a shared lock cannot update the data item.
2. Exclusive lock:-Under an exclusive lock, the data item can be both read and written by the transaction. The lock is exclusive, so multiple transactions cannot modify the same data simultaneously.
There are four types of lock protocols available:
1. Simplistic lock protocol
It is the simplest way of locking data during a transaction. Simplistic lock-based protocols require every transaction to obtain a lock on the data before an insert, delete or update on it.
2. Pre-claiming Lock Protocol
• Pre-claiming lock protocols evaluate the transaction to list all the data items on which it needs locks.
• Before initiating execution of the transaction, it requests the DBMS for locks on all those data items.
3. Two-phase locking (2PL)
• The two-phase locking protocol divides the execution phase of the transaction into three parts.
• In the first part, when the execution of the transaction starts, it seeks permission for the locks it requires.
4. Strict Two-phase locking (Strict-2PL)
• The first phase of Strict-2PL is similar to 2PL. In the first phase, after acquiring all the locks, the transaction continues to execute normally.

Two-Phase Locking Protocol
The Two-Phase Locking Protocol, often known as the 2PL protocol, is a method of concurrency control in a DBMS that maintains serializability by securing transaction data with a lock that prevents other transactions from accessing the same data at the same time. The two-phase locking strategy helps eliminate the DBMS concurrency problem. It guarantees conflict serializability, which in turn guarantees serializability.
Deadlock Detection and Recovery Schemes
This is one method for dealing with deadlock: it allows the system to enter a deadlock state and then tries to recover using a deadlock detection and deadlock recovery scheme. If the probability that the system enters a deadlock state is relatively low, this method is efficient.
1. Deadlock Detection:
Deadlock detection is a periodic check by the DBMS to determine if the waiting line for some resource exceeds a predetermined limit.
Deadlocks can be detected using a directed graph called the wait-for graph.
The wait-for graph consists of a pair G = (V, E), where V is a set of vertices and E is a set of edges. The set of vertices consists of all transactions in the system. Each element in the set E of edges is an ordered pair Ti → Tj, which means that transaction Ti is waiting for Tj to release a data item. A deadlock exists if and only if the graph contains a cycle.
2. Recovery From Deadlock:
The most common solution to recover from deadlock is to roll back one or more transactions to break the deadlock.
Recovery involves the following three actions: selection of a victim, rollback, and starvation.
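Deadlock detection with a wait-for graph, as described in this section, amounts to looking for a cycle in a directed graph. The following is a minimal sketch, assuming the graph is given as a Python dict that maps each transaction name to the set of transactions it is waiting on. The function name find_deadlock and this representation are illustrative choices, not something prescribed by the notes.

```python
def find_deadlock(wait_for):
    """Return the transactions on one cycle of the wait-for graph, or None.

    wait_for maps each transaction to the set of transactions it waits on,
    i.e. there is an edge Ti -> Tj for every Tj in wait_for[Ti].
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(t):
        colour[t] = GREY
        path.append(t)
        for u in sorted(wait_for.get(t, ())):
            state = colour.get(u, WHITE)
            if state == GREY:      # back edge: u is already on the path
                return path[path.index(u):]
            if state == WHITE:
                cycle = visit(u)
                if cycle:
                    return cycle
        path.pop()
        colour[t] = BLACK
        return None

    # Depth-first search from every transaction not yet explored.
    for t in sorted(wait_for):
        if colour.get(t, WHITE) == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None
```

For example, find_deadlock({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}) reports the cycle T1, T2, T3, while a graph in which T3 waits on nothing yields None.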
Conflict & View serializability
Conflict Serializability-If a given non-serial schedule can be converted into a serial schedule by swapping its non-conflicting operations, it is called a conflict serializable schedule.
Conflicting Operations-
Two operations are called conflicting operations if all the following conditions hold true for them:
• Both operations belong to different transactions.
• Both operations are on the same data item.
• At least one of the two operations is a write operation.
View Serializability-If a given schedule is view equivalent to some serial schedule, it is called a view serializable schedule. Two schedules are view equivalent if the transactions in both schedules perform similar actions in a similar manner. View serializability can be defined as: "a concurrent execution of n transactions T1, T2, ..., Tn (call this execution S) is view serializable if the execution is view equivalent to a serial execution of the n transactions".

DATA FRAGMENTATION
In a distributed database, the technique of breaking (splitting or dividing) the database into logical units, which may be assigned for storage at the various sites, is called data fragmentation.
In data fragmentation, a relation (table) can be partitioned (or fragmented) into several fragments (pieces/parts) for physical storage purposes, and there may be several replicas of each fragment. These fragments contain sufficient information to allow reconstruction of the original relation.
Types of Fragmentation:
1. Horizontal Fragmentation:-A horizontal fragment of a table is a subset of the tuples (rows) with all attributes of that relation. Horizontal fragmentation splits (divides) the relation 'horizontally' by assigning each tuple or group (subset) of tuples of a relation to one or more fragments, where each tuple or subset has a certain logical meaning. These fragments can then be assigned to different sites in the distributed system.
2. Vertical Fragmentation:-Vertical fragmentation splits (divides) the relation 'vertically' by columns (attributes). A fragment of a relation keeps only certain attributes of the relation at a particular site, because each site may not need all the attributes of a relation.
3. Hybrid or Mixed Fragmentation:-In hybrid fragmentation, a combination of horizontal and vertical fragmentation techniques is used: horizontal or vertical fragmentation of a relation (table), followed by further vertical or horizontal fragmentation.

Replication and Allocation techniques
REPLICATION
Data replication is the process of storing separate copies (replicas) of the database at two or more sites. It is a popular fault tolerance technique for distributed databases.
Types of Data Replication:
1. Synchronous Replication: In synchronous replication, the replica is modified immediately after changes are made in the relation table, so there is no difference between the original data and the replica.
2. Asynchronous Replication: In asynchronous replication, the replica is modified after the commit is fired on the database.
DATA ALLOCATION
In a distributed database, data allocation involves deciding which fragments will be allocated to which nodes and how much replication will be carried out. These decisions must be based on the requirements of the system, including the query profiles, the response time required and the level of reliability needed.
1. Partitioned or Fragmented Strategy: The database is divided into several disjoint fragments, which are stored at several sites.
2. Replication Strategy: Copies (replicas) of one or more database fragments are stored at several sites.
3. Centralized Strategy: The entire single database and the DBMS are stored at one site. However, users are geographically distributed across the computer network.
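The horizontal and vertical fragmentation schemes described in this section, and the requirement that the fragments allow the original relation to be reconstructed, can be sketched in a few lines. The EMPLOYEE relation, its attributes and the helper names below are hypothetical and serve only as an illustration. Each vertical fragment keeps the primary key so that the fragments can be joined back together.

```python
# Hypothetical EMPLOYEE relation; emp_id is the primary key.
EMPLOYEE = [
    {"emp_id": 1, "name": "Asha",  "city": "Pune",  "salary": 50000},
    {"emp_id": 2, "name": "Ravi",  "city": "Delhi", "salary": 42000},
    {"emp_id": 3, "name": "Meena", "city": "Pune",  "salary": 61000},
]


def horizontal_fragment(relation, predicate):
    """Subset of the tuples (rows) satisfying predicate, with all attributes."""
    return [dict(row) for row in relation if predicate(row)]


def vertical_fragment(relation, key, attributes):
    """Projection onto the key plus the chosen attributes (columns)."""
    return [{a: row[a] for a in (key, *attributes)} for row in relation]


def rebuild_from_horizontal(*fragments):
    """Union of the fragments, ordered by emp_id for easy comparison."""
    rows = [row for frag in fragments for row in frag]
    return sorted(rows, key=lambda r: r["emp_id"])


def rebuild_from_vertical(key, left, right):
    """Join two vertical fragments on the shared key."""
    by_key = {row[key]: row for row in right}
    return [{**row, **by_key[row[key]]} for row in left]


# Horizontal fragments: rows split by city.
pune = horizontal_fragment(EMPLOYEE, lambda r: r["city"] == "Pune")
other = horizontal_fragment(EMPLOYEE, lambda r: r["city"] != "Pune")

# Vertical fragments: columns split, each fragment keeping the key.
public = vertical_fragment(EMPLOYEE, "emp_id", ["name", "city"])
payroll = vertical_fragment(EMPLOYEE, "emp_id", ["salary"])
```

Here rebuild_from_horizontal(pune, other) and rebuild_from_vertical("emp_id", public, payroll) both give back the original EMPLOYEE rows.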
Introduction to Client-Server Architecture
• As personal computers became faster, more powerful, and cheaper, database systems started to exploit the processing power available on the user's side, which led to the development of client-server architecture.
• The architecture of a system reflects the structure of the underlying system. It defines the different components of the system, the functions of these components and the overall interactions and relationships between these components.
• A distributed database system allows applications to access data from local and remote databases. Distributed database systems use client-server architecture to process information requests.
1. The back-end manages access structures, query evaluation and optimization, concurrency control and recovery.
2. The front-end consists of tools such as forms, report-writers and graphical user interface facilities.
Advantages of Client-server Architecture:
1. It provides a better user interface.
2. Client-server increases the overall performance of the DBMS.
3. In client-server architecture, a single copy of the DBMS is shared.
4. Client-server is used to develop highly complex applications. It also reduces cost.
Disadvantages of Client-server Architecture:
1. If the server fails, there is loss of data.
2. Too many requests from the clients may lead to network congestion.

Client-Server Architecture for DDBMS
• The architecture of a system reflects the structure of the underlying system. It defines the different components of the system, the functions of these components and the overall interactions and relationships between these components.
• A distributed database system allows applications to access data from local and remote databases. Distributed database systems use client-server architecture to process information requests.
• The computers in a distributed system may vary in size and function, ranging from workstations up to mainframe systems. The computers in a distributed database system are referred to by a number of different names, such as sites or nodes.
1. A Client is an individual computer, process or user's application that requests services from the server. A client is also known as a front-end application, as the end user usually interacts with the client process.
2. A Server consists of one or more computers, or is a computer process or application, that provides services to clients. A server is also known as a back-end application, as the server process provides the background services for the client processes.
3. Communications Middleware is any process(es) through which clients and servers communicate with each other. The communication middleware is usually associated with

Procedural language
In procedural languages, the program code is written as a sequence of instructions. The user has to specify “what to do” and also “how to do” it (a step by step procedure). These instructions are executed in sequential order and are written to solve specific problems.
Non-Procedural Language:
In non-procedural languages, the user has to specify only “what to do” and not “how to do” it. Such a language is also known as an applicative or functional language. It involves the development of functions from other functions to construct more complex functions.
Relational algebra operation
Relational algebra is a procedural query language. It gives a step by step process to obtain the result of the query. It uses operators to perform queries.
1. Select operation-The select operation selects tuples that satisfy a given predicate. It is denoted by sigma (σ).
Notation: σ p(r)
σ is used for the selection predicate.
r is used for the relation.
p is used as a propositional logic formula which may use connectors like AND, OR and NOT, and relational operators like =, ≠, ≥, <, >, ≤.
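The select operation σ p(r) of relational algebra, described in this section, can be illustrated with a small sketch. A relation is modelled here as a list of dicts and the predicate p as a Python function. The LOAN relation and its attribute names are assumptions introduced only for the example.

```python
def select(relation, predicate):
    """sigma_p(r): keep the tuples of relation r for which predicate p holds."""
    return [row for row in relation if predicate(row)]


# Hypothetical LOAN relation used only for illustration.
LOAN = [
    {"loan_no": "L-11", "branch": "Perryridge", "amount": 900},
    {"loan_no": "L-14", "branch": "Downtown",   "amount": 1500},
    {"loan_no": "L-15", "branch": "Perryridge", "amount": 1500},
]

# sigma_{branch = "Perryridge" AND amount > 1000}(LOAN)
result = select(
    LOAN,
    lambda t: t["branch"] == "Perryridge" and t["amount"] > 1000,
)
```

The predicate combines a relational operator (=, >) with the connector AND, so only the tuple for loan L-15 is selected.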
EXPLAIN DISTRIBUTED DATABASE SYSTEM AND ITS TYPES
Ans. A distributed database is a database that is not limited to one system; it is spread over different sites, i.e., on multiple computers or over a network of computers. A distributed database system is located on various sites that do not share physical components. This may be required when a particular database needs to be accessed by various users globally. It needs to be managed such that, for the users, it looks like one single database.
Its types
1. Homogeneous-In a homogeneous database, all the different sites store the database identically. The operating system, database management system, and the data structures used are the same at all sites. Hence, they are easy to manage.
2. Heterogeneous-In a heterogeneous distributed database, different sites can use different schemas and software, which can lead to problems in query processing and transactions. Also, a particular site might be completely unaware of the other sites. Different computers may use a different operating system and a different database application. They may even use different data models for the database. Hence, translations are required for different sites to communicate.

DEADLOCK HANDLING & Deadlock Prevention
• A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set.
• If {T0, T1, ..., Tn} is a set of transactions such that T0 is waiting for a data item that is held by T1, T1 is waiting for a data item that is held by T2, ..., Tn-1 is waiting for a data item that is held by Tn, and Tn is waiting for a data item that is held by T0, then none of the transactions can make progress. That is, the system is in a deadlock state.
Deadlock Prevention
Deadlock prevention methods guarantee that deadlocks cannot occur in the first place. The transaction manager checks a transaction when it is first initiated and does not permit it to proceed if it may cause a deadlock.
There are two approaches to deadlock prevention:
1. The first approach ensures that no cyclic waits can occur by ordering the requests for locks or by requiring all locks to be acquired together.
2. The second approach performs transaction rollbacks instead of waiting for a lock, whenever the wait could potentially result in a deadlock.
The schemes under the first approach: lock all the data items before a transaction begins its execution.

Deadlock detection and recovery scheme.
• This is one method for dealing with deadlock: it allows the system to enter a deadlock state and then tries to recover using a deadlock detection and deadlock recovery scheme. If the probability that the system enters a deadlock state is relatively low, this method is efficient.
• An algorithm that examines the state of the system is invoked periodically to determine whether a deadlock has occurred. If it has occurred, the system must attempt to recover from the deadlock.
1. Deadlock Detection:-Deadlock detection is a periodic check by the DBMS to determine if the waiting line for some resource exceeds a predetermined limit. Deadlocks can be detected using a directed graph called the wait-for graph.
2. Recovery From Deadlock:-The most common solution to recover from deadlock is to roll back one or more transactions to break the deadlock.
Recovery involves the following three actions:
(i) Selection of Victim:-Determine which transaction to roll back. The transactions that will incur the minimum cost are rolled back.
(ii) Rollback:-One method to roll back a transaction is to abort the transaction and restart it.
(iii) Starvation:-It may happen that the same transaction is always selected as the victim. This results in starvation. The most common solution is to include the number of rollbacks in the cost factor.
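The victim selection and starvation rules of deadlock recovery described in this section can be sketched as follows. The cost formula (work done so far plus a fixed penalty per earlier rollback), the penalty value and the function names are assumptions chosen for illustration. The notes only state that the minimum-cost transaction is rolled back and that the number of rollbacks should be part of the cost.

```python
def choose_victim(cycle, work_done, rollbacks, penalty=10):
    """Pick the transaction in the deadlock cycle with the lowest cost.

    Assumed cost model: work_done[t] + penalty * (times t was rolled back).
    Ties are broken by transaction name so the choice is deterministic.
    """
    def cost(t):
        return work_done[t] + penalty * rollbacks.get(t, 0)

    return min(sorted(cycle), key=cost)


def resolve(cycle, work_done, rollbacks, penalty=10):
    """Select a victim and record its rollback for future selections."""
    victim = choose_victim(cycle, work_done, rollbacks, penalty)
    rollbacks[victim] = rollbacks.get(victim, 0) + 1
    return victim


# The same three transactions deadlock repeatedly (hypothetical workload).
cycle = ["T1", "T2", "T3"]
work_done = {"T1": 5, "T2": 20, "T3": 30}

rollbacks = {}
with_penalty = [resolve(cycle, work_done, rollbacks) for _ in range(3)]

no_history = {}
without_penalty = [resolve(cycle, work_done, no_history, penalty=0)
                   for _ in range(3)]
```

With the rollback count included, T1 is chosen twice and then T2 becomes the cheaper victim. With the penalty set to zero, T1 is chosen every time, which is the starvation case.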