SQL Server Joins: Types & Best Practices

The document provides a comprehensive guide on SQL Server joins, detailing various types including INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER, CROSS, and SELF JOIN, along with their use cases and syntax. It also covers SQL Server indexing concepts, explaining the structures and benefits of clustered and non-clustered indexes, as well as unique indexes and their implications. Key performance considerations and best practices for using joins and indexes are highlighted for effective SQL query optimization.
join in SQL Server

SQL Server Joins - Complete Interview Guide


Join Categories
ANSI Format Joins (Standard & Recommended)
• Inner Join - Only matching rows from both tables
• Outer Joins - Left, Right, Full (includes unmatched rows with NULLs)
• Cross Join - Cartesian product (all combinations)

Non-ANSI Concepts
• Equi Join - Uses equality condition (=) - most common
• Non-Equi Join - Uses non-equality conditions (>, <, !=, BETWEEN)
• Self-Join - Table joined to itself using aliases
• Natural Join - Conceptual only (SQL Server has no NATURAL JOIN keyword)


Detailed Join Types
INNER JOIN
SELECT columns FROM TableA INNER JOIN TableB ON [Link] = [Link];
-- or simply: JOIN (defaults to INNER)
• Returns: Only rows with matches in BOTH tables
• Use when: Need data that exists in both related tables
• Key point: Filters out non-matching records

LEFT OUTER JOIN
SELECT columns FROM LeftTable LEFT OUTER JOIN RightTable ON condition;
-- or simply: LEFT JOIN
• Returns: ALL rows from left table + matching rows from right (NULLs for non-matches)
• Use when: Need all records from primary (left) table plus any related data
• Finding left-only records: Add WHERE [Link] IS NULL
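
The difference between INNER JOIN, LEFT JOIN, and the "left-only" filter is easiest to see on a tiny data set. The sketch below is only an illustration: the @Customers and @Orders table variables and their columns are hypothetical and do not come from the guide.

```sql
-- Hypothetical sample data; table and column names are illustrative only.
DECLARE @Customers TABLE (CustomerId INT PRIMARY KEY, Name VARCHAR(50));
DECLARE @Orders    TABLE (OrderId INT PRIMARY KEY, CustomerId INT, Amount DECIMAL(10,2));

INSERT INTO @Customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
INSERT INTO @Orders    VALUES (10, 1, 250.00), (11, 1, 75.50), (12, 2, 120.00);

-- INNER JOIN: only customers with at least one order (Chen is filtered out).
SELECT c.Name, o.OrderId, o.Amount
FROM @Customers AS c
INNER JOIN @Orders AS o ON o.CustomerId = c.CustomerId;

-- LEFT JOIN: every customer; Chen appears once with NULL order columns.
SELECT c.Name, o.OrderId, o.Amount
FROM @Customers AS c
LEFT JOIN @Orders AS o ON o.CustomerId = c.CustomerId;

-- Left-only rows: test a right-side column that cannot be NULL in a real match.
SELECT c.Name
FROM @Customers AS c
LEFT JOIN @Orders AS o ON o.CustomerId = c.CustomerId
WHERE o.OrderId IS NULL;   -- returns Chen
```

Testing the right table's key (here OrderId) is the safer choice for the IS NULL filter, because a nullable non-key column could be NULL even on a matched row.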
RIGHT OUTER JOIN
SELECT columns FROM LeftTable RIGHT OUTER JOIN RightTable ON condition;
-- or simply: RIGHT JOIN
• Returns: ALL rows from right table + matching rows from left (NULLs for non-matches)
• Note: Can always be rewritten as LEFT JOIN by swapping table order
• Use when: Need all records from primary (right) table

FULL OUTER JOIN
SELECT columns FROM TableA FULL OUTER JOIN TableB ON condition;
-- or simply: FULL JOIN
• Returns: ALL rows from BOTH tables (NULLs where no match)
• Use when: Need complete picture of both datasets
• Perfect for: Data reconciliation, finding discrepancies
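
As a rough sketch of the reconciliation use case, the query below compares two hypothetical copies of the same data (@SystemA and @SystemB are invented for illustration) and labels each row by what the FULL OUTER JOIN found.

```sql
-- Hypothetical data from two systems that should agree.
DECLARE @SystemA TABLE (Id INT PRIMARY KEY, Amount INT);
DECLARE @SystemB TABLE (Id INT PRIMARY KEY, Amount INT);

INSERT INTO @SystemA VALUES (1, 100), (2, 200), (3, 300);
INSERT INTO @SystemB VALUES (2, 200), (3, 350), (4, 400);

SELECT COALESCE(a.Id, b.Id) AS RecordId,   -- whichever side has the key
       a.Amount AS AmountA,
       b.Amount AS AmountB,
       CASE WHEN a.Id IS NULL          THEN 'Missing in A'
            WHEN b.Id IS NULL          THEN 'Missing in B'
            WHEN a.Amount <> b.Amount  THEN 'Amount differs'
            ELSE 'Match'
       END AS Status
FROM @SystemA AS a
FULL OUTER JOIN @SystemB AS b ON b.Id = a.Id
ORDER BY COALESCE(a.Id, b.Id);
-- 1: Missing in B, 2: Match, 3: Amount differs, 4: Missing in A
```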

CROSS JOIN
SELECT columns FROM TableA CROSS JOIN TableB;
• Returns: Cartesian product (every row from A × every row from B)
• No ON clause: SQL Server rejects ON after CROSS JOIN. Filtering a CROSS JOIN in the WHERE clause gives the same rows as an INNER JOIN with that condition
• Result size: TableA rows × TableB rows
• Use for: Test data generation, all possible combinations
• Caution: Can create huge result sets
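
A minimal sketch of the "all possible combinations" use case, using two made-up lookup tables (@Sizes and @Colors are hypothetical):

```sql
-- Hypothetical lookup tables.
DECLARE @Sizes  TABLE (Size  VARCHAR(5));
DECLARE @Colors TABLE (Color VARCHAR(10));

INSERT INTO @Sizes  VALUES ('S'), ('M'), ('L');
INSERT INTO @Colors VALUES ('Red'), ('Blue');

-- Every size paired with every color: 3 x 2 = 6 rows.
SELECT s.Size, c.Color
FROM @Sizes AS s
CROSS JOIN @Colors AS c;
```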
Advanced Concepts
Self-Join Example
SELECT [Link] AS Employee, [Link] AS Manager
FROM Employees e LEFT JOIN Employees m ON [Link] = [Link];
• Must use aliases to distinguish table instances
• Use cases: Hierarchical data, employee-manager relationships, comparing rows within same table
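
The self-join pattern can be tried with a small stand-in table. The column names below (EmployeeId, Name, ManagerId) are assumptions made for this sketch; the guide's own column names are not recoverable from the text.

```sql
-- Hypothetical employee hierarchy; ManagerId points at another row's EmployeeId.
DECLARE @Employees TABLE (EmployeeId INT PRIMARY KEY, Name VARCHAR(50), ManagerId INT NULL);

INSERT INTO @Employees VALUES
    (1, 'Priya', NULL),   -- top of the hierarchy, no manager
    (2, 'Sam',   1),
    (3, 'Lee',   1),
    (4, 'Ravi',  2);

-- e = the employee row, m = the same table read again as the manager row.
SELECT e.Name AS Employee, m.Name AS Manager
FROM @Employees AS e
LEFT JOIN @Employees AS m ON e.ManagerId = m.EmployeeId;
-- Priya is kept with Manager = NULL because of the LEFT JOIN.
```

With an INNER JOIN instead, Priya would disappear from the result, which is why LEFT JOIN is the usual choice for hierarchies that have a root row.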


Multi-Table Joins
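SELECT ... FROM TableA a
INNER JOIN TableB b ON [Link] = [Link]
INNER JOIN TableC c ON [Link] = [Link];
• Processing: Logically sequential (A+B first, then result+C); the optimizer may choose a different physical join order when the result is the same
• Can mix join types (INNER + LEFT, etc.)
• Order matters once outer joins are mixed in: an INNER JOIN placed after a LEFT JOIN can discard the NULL-extended rows the LEFT JOIN preserved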

Key Differences: JOIN vs UNION


JOINs
 Combine columns from different tables

 Result has more columns (TableA cols + TableB cols)

 Based on related data/conditions

UNION/UNION ALL
 Combine rows from multiple SELECT statements

 Same number of columns required

 Compatible data types required

 UNION removes duplicates, UNION ALL keeps all
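
To make the UNION versus UNION ALL distinction concrete, here is a small sketch with two hypothetical single-column tables:

```sql
-- Hypothetical city lists with one overlapping value.
DECLARE @BranchA TABLE (City VARCHAR(20));
DECLARE @BranchB TABLE (City VARCHAR(20));

INSERT INTO @BranchA VALUES ('Pune'), ('Delhi');
INSERT INTO @BranchB VALUES ('Delhi'), ('Mumbai');

-- UNION: duplicates removed, so Delhi appears once (3 rows).
SELECT City FROM @BranchA
UNION
SELECT City FROM @BranchB;

-- UNION ALL: every row kept, so Delhi appears twice (4 rows).
SELECT City FROM @BranchA
UNION ALL
SELECT City FROM @BranchB;
```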

Critical Points for Interviews


NULL Handling in Outer Joins
• Unmatched rows get NULL values for missing table's columns
• Use IS NULL to find unmatched records
• Understanding NULL behavior is crucial

Performance Considerations
• INNER JOIN is often the cheapest because it returns the fewest rows, but the actual cost depends on indexes and row counts
• CROSS JOIN can be dangerous (the row count is the product of both tables' row counts)
• LEFT JOIN often preferred over RIGHT JOIN for readability
Syntax Variations
 JOIN = INNER JOIN

 LEFT JOIN = LEFT OUTER JOIN

 RIGHT JOIN = RIGHT OUTER JOIN

 FULL JOIN = FULL OUTER JOIN

Common Interview Scenarios


1. Find unmatched records: LEFT JOIN + WHERE [Link] IS NULL
2. Hierarchical data: Self-join with aliases
3. Data reconciliation: FULL OUTER JOIN
4. All combinations: CROSS JOIN (use carefully)
5. Multiple tables: Sequential joins with mixed types
Best Practices
 Always use ANSI format joins (modern standard)

 Specify ON conditions clearly (except CROSS JOIN)

 Use table aliases for readability

 Understand when NULLs appear in results

 Consider performance implications of join types

V. Key Takeaways & When to Use What (Quick Summary)


 INNER JOIN: Only matching rows. Use when you need data that
has a confirmed relationship in both tables. (Most common)
 LEFT OUTER JOIN: All from left, matching from right (or NULLs).
Use when the left table is primary, and you want all its rows,
plus any related data from the right.
o To get only non-matching from left: WHERE [Link]
IS NULL.
 RIGHT OUTER JOIN: All from right, matching from left (or
NULLs). Use when the right table is primary. Often can be
rewritten as a LEFT JOIN.
 FULL OUTER JOIN: All rows from both. Use when you need a
complete picture of both datasets, including unmatched
records from either side. Good for data reconciliation.
 CROSS JOIN: Cartesian product (all combinations). Use
sparingly for specific scenarios like generating all possible
pairs or test data.
 SELF JOIN: Join a table to itself using aliases. Use for
hierarchical data or intra-table comparisons.
Remember to always specify the join condition clearly in
the ON clause for ANSI joins (except CROSS JOIN). Understanding
how NULL values are handled in OUTER JOINs is crucial for correct
interpretation of results.

SQL Server Indexes


Core Concepts & Search Mechanisms

1. The Goal of an Index:


 To make search operations faster.
 Avoids scanning the entire table row by row.
2. How Indexes Speed Up Searches: The B-Tree (Balanced Tree)
 Internally, SQL Server creates a B-Tree structure for an index.
 B-Tree Structure:
o Root Node: The top-level entry point for a search.
Contains range pointers.
o Non-Leaf Nodes (Intermediate Nodes): Branch nodes that
further narrow down the search based on key values. They
point to other non-leaf nodes or leaf nodes.
o Leaf Nodes: The bottom level of the tree. What they
contain depends on the index type:
 Clustered Index: Leaf nodes contain the actual data
rows.
 Non-Clustered Index: Leaf nodes contain index key
values and pointers (Row ID or Clustered Key) to the
actual data rows.
 Search Process with B-Tree:
1. Start at the Root Node.
2. Compare the search value with the key ranges in the root
node to determine which branch to follow.
3. Navigate through Non-Leaf Nodes, making comparisons at
each level to narrow down the path.
4. Reach a Leaf Node that contains (or points to) the desired
data.
 Benefit: This hierarchical traversal drastically reduces the
number of data pages to read compared to a full table scan.
For example, to find '50':
o Root: Check if 50 <= 30 (No). Check if 50 <= 50 (Yes).
o Follow path under '50' to relevant Non-Leaf Node.
o Non-Leaf: Check if 50 <= 40 (No). Check if 50 <= 50 (Yes).
o Follow path under '50' to relevant Leaf Node.
3. Data Retrieval Mechanisms:
 Table Scan (Heap Scan):
o Occurs when there is no suitable index for the query or if
the table is very small.
o The SQL Server Search Engine reads every row in the table
sequentially from beginning to end.
o Very inefficient for large tables.
o A table without a clustered index is called a Heap.
 Index Scan:
o The engine traverses all leaf pages of an index.
o This is better than a table scan if the index is narrower
(fewer columns, smaller data types) than the table itself,
or if the query needs all rows but in the index's order.
o Often used when the WHERE clause isn't selective enough
for a seek, or when no WHERE clause is present but
an ORDER BY matches the index.
 Index Seek:
o The most efficient way to retrieve data.
o The engine uses the B-Tree structure to navigate directly
to the rows that satisfy the WHERE clause conditions.
o Only reads the necessary index pages and data pages.
o Typically used for equality (=) or small range
(BETWEEN, <, >) predicates on indexed columns.
4. When SQL Server Uses Indexes:
 SELECT, UPDATE, DELETE statements with a WHERE clause on
an indexed column.
 SELECT statements with an ORDER BY clause on an indexed
column (can avoid a sort operation).
 JOIN operations on indexed columns.
 SQL Server's Query Optimizer decides the best execution plan
(Table Scan, Index Scan, or Index Seek).
5. General Index Creation Syntax:
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX
<INDEX_NAME>
ON <TABLE_NAME> (<COLUMN_LIST> [ASC|DESC], ...)
[INCLUDE (<NON_KEY_COLUMN_LIST>)]
[WITH (<OPTIONS>)]
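
The retrieval mechanisms and the creation syntax above can be tried together on a throwaway table. This is a sketch under assumptions: the #Employee temp table and its data are hypothetical, and on a table this small the optimizer may well choose a scan for both queries, so treat the comments as what each predicate makes possible rather than a guaranteed plan.

```sql
-- Hypothetical temp table; names and data are illustrative only.
CREATE TABLE #Employee (
    Id     INT         NOT NULL PRIMARY KEY CLUSTERED,  -- clustered index on Id
    Name   VARCHAR(50) NOT NULL,
    Salary INT         NOT NULL
);

CREATE NONCLUSTERED INDEX IX_Employee_Name ON #Employee (Name ASC);

INSERT INTO #Employee (Id, Name, Salary)
VALUES (1, 'Asha', 5000), (2, 'Ben', 4000), (3, 'Chen', 3000);

-- Seek candidate: equality predicate directly on the index key.
SELECT Id, Name FROM #Employee WHERE Name = 'Ben';

-- Scan candidate: wrapping the column in a function prevents B-Tree navigation on Name.
SELECT Id, Name FROM #Employee WHERE UPPER(Name) = 'BEN';

DROP TABLE #Employee;
```

Viewing the actual execution plan for each SELECT shows which operator (Index Seek, Index Scan, or Clustered Index Scan) the Query Optimizer picked.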

Clustered Indexes

1. Definition:
 A Clustered Index defines the physical order in which data
rows are stored in the table.
 The table data itself becomes the leaf level of the clustered
index.
 Think of it like a phone book sorted alphabetically by last
name; the names and numbers (the data) are sorted.
2. Key Properties:
 Only ONE Clustered Index per table: Because data can only be
physically sorted in one order.
 Leaf Nodes Contain Actual Data: The bottom level of the B-
Tree is the table data.
 Primary Key Constraint: By default, creating a Primary Key on
a table automatically creates a UNIQUE CLUSTERED INDEX on
the primary key column(s), if no other clustered index already
exists.
 Clustered Table: A table with a clustered index.
 Data Sorting: Records are physically sorted in the table based
on the clustered index key (e.g., Id ASC).
3. B-Tree Structure of a Clustered Index:
 Root Node & Non-Leaf Nodes: Contain clustered index key
values and pointers to the next level pages.
 Leaf Nodes: Contain the actual data rows of the table, sorted
by the clustered index key.
4. Example:
CREATE CLUSTERED INDEX IX_Employee_ID ON Employee(Id ASC);
 This physically sorts the Employee table by the Id column in
ascending order.
5. Why Only One?
 Attempting to create a second clustered index will result in an
error:
Cannot create more than one clustered index on table
'Employee'. Drop the existing clustered index
'PK__Employee__...' before creating another.
 The physical storage order is already dictated by the existing
clustered index.
6. Composite Clustered Index:
 A clustered index created on multiple columns.
 Data is physically sorted by the first column, then by the
second column within each group of the first, and so on.
 Example: CREATE CLUSTERED INDEX
IX_Employee_Gender_Salary ON Employee(Gender DESC, Salary
ASC)
o Employees will be sorted first by Gender in descending
order (e.g., Males then Females, if M > F).
o Within each gender group, employees will be sorted
by Salary in ascending order.
7. Impact on Non-Clustered Indexes:
 If a table has a clustered index, the leaf nodes of any non-
clustered indexes on that table will store the clustered index
key as their row locator (pointer) instead of a direct Row ID
(RID). This is important because if the clustered key is large, it
can make all non-clustered indexes larger.

Non-Clustered Indexes

1. Definition:
 A Non-Clustered Index is a separate B-Tree structure from the
data rows.
 The index contains the non-clustered index key values, and
each key value has a pointer to the actual data row.
 The physical order of data in the table is NOT affected by non-
clustered indexes.
 Think of it like the index at the back of a textbook: entries are
sorted alphabetically, and each entry points to a page number
(the data location).
2. Key Properties:
 Multiple Non-Clustered Indexes per table: A table can have up
to 999 non-clustered indexes.
 Separate Structure: Stored separately from the table data,
requiring additional disk space.
 Leaf Nodes Contain Pointers: The leaf nodes store the index
key values and a row locator.
o If table is a HEAP (no clustered index): The row locator is a
Row Identifier (RID) which points directly to the physical
location of the data row in the heap.
o If table has a CLUSTERED INDEX: The row locator is
the clustered index key of that row. To find the actual
data, SQL Server first seeks the non-clustered index, gets
the clustered key, and then uses the clustered key to seek
the clustered index (which contains the data). This is
called a "Key Lookup."
3. B-Tree Structure of a Non-Clustered Index:
 Root Node & Non-Leaf Nodes: Contain non-clustered index key
values and pointers to the next level pages.
 Leaf Nodes: Contain the non-clustered index key values and
a row locator (RID or Clustered Key) for each key. The leaf
nodes are sorted by the non-clustered index key.
4. Example:
CREATE NONCLUSTERED INDEX IX_Employee_Salary ON
Employee(Salary ASC);
 This creates a separate B-Tree sorted by Salary. The leaf nodes
will contain Salary values and pointers to the corresponding
rows in the Employee table.
5. Composite Non-Clustered Index:
 A non-clustered index created on multiple columns.
 Example: CREATE NONCLUSTERED INDEX
IX_tblOrder_CustomerId_ProductName ON tblOrder(CustomerId
ASC, ProductName DESC)
6. Covering Query & INCLUDE Clause:
 Covering Query: A query where all the columns requested in
the SELECT list, WHERE clause, and JOIN conditions are
available within the non-clustered index itself.
o SQL Server can satisfy the query entirely from the index
pages without accessing the table data (heap) or the
clustered index. This avoids Key Lookups and is very
efficient.
o Example: If IX_ProductSales_ProductID_QuantitySold exists
on (ProductID, QuantitySold), then:
SELECT ProductID, QuantitySold FROM ProductSales
WHERE ProductID = 5; is a covering query.
 INCLUDE Clause: Allows you to add non-key columns to the leaf
level of a non-clustered index.
o These INCLUDEd columns are not part of the index key (so
they don't affect sort order or B-Tree navigation) but are
stored in the leaf pages.
o Used primarily to create covering indexes without making
the index key itself too wide.
o Example: CREATE NONCLUSTERED INDEX
IX_tblOrder_Cust_Prod ON tblOrder(CustomerId,
ProductName) INCLUDE (Id, ProductId);
 For SELECT Id, ProductId FROM tblOrder WHERE
CustomerId = 3 AND ProductName = 'Pendrive';
 The index is keyed on CustomerId,
ProductName. Id and ProductId are available at the
leaf level for covering.
7. Missing Index Details:
 The Query Execution Plan can sometimes suggest creating
"missing indexes" that could improve query performance. This
is a helpful feature for performance tuning.
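
The covering-index idea from item 6 can be sketched on a temp-table version of tblOrder. The Quantity column and the sample rows are assumptions added for illustration. The sketch also relies on the point made earlier that a non-clustered index on a clustered table carries the clustered key (here Id) as its row locator, so Id does not need to appear in INCLUDE.

```sql
-- Hypothetical stand-in for tblOrder; Quantity is an invented extra column.
CREATE TABLE #tblOrder (
    Id          INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    CustomerId  INT         NOT NULL,
    ProductId   INT         NOT NULL,
    ProductName VARCHAR(50) NOT NULL,
    Quantity    INT         NOT NULL
);

CREATE NONCLUSTERED INDEX IX_tblOrder_Cust_Prod
    ON #tblOrder (CustomerId, ProductName)
    INCLUDE (ProductId);

INSERT INTO #tblOrder (CustomerId, ProductId, ProductName, Quantity)
VALUES (3, 101, 'Pendrive', 2), (3, 102, 'Mouse', 1), (4, 101, 'Pendrive', 5);

-- Covered: every referenced column is in the index (keys, INCLUDE, or clustered key).
SELECT Id, ProductId
FROM #tblOrder
WHERE CustomerId = 3 AND ProductName = 'Pendrive';

-- Not covered: Quantity is only in the clustered index, so a Key Lookup
-- (or a clustered index scan) is needed to fetch it.
SELECT Id, ProductId, Quantity
FROM #tblOrder
WHERE CustomerId = 3 AND ProductName = 'Pendrive';

DROP TABLE #tblOrder;
```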

Unique Indexes, DML Impact & GROUP BY

1. Unique Indexes:
 Enforce that all values in the indexed column(s) are unique. No
two rows can have the same value (or combination of values
for composite unique indexes) in the indexed columns.
 Can be Clustered or Non-Clustered.
 Primary Key: Automatically creates a UNIQUE CLUSTERED
INDEX by default (if no CI exists).
 Unique Constraint: Automatically creates a UNIQUE
NONCLUSTERED INDEX by default.
 NULLs: A unique index allows only ONE row to have
a NULL value in a single-column unique index. For composite
unique indexes, only one row can have NULLs in the same
combination of columns (e.g., (Val1, NULL) is distinct
from (Val2, NULL), but two (Val1, NULL) would violate
uniqueness).
 Creation: CREATE UNIQUE NONCLUSTERED INDEX
UIX_Employees_FirstName_LastName ON
Employees(FirstName, LastName);
 Dropping: Cannot DROP INDEX if it's enforcing a PRIMARY
KEY or UNIQUE constraint directly. You must ALTER TABLE ...
DROP CONSTRAINT ... first.
2. Unique Constraints vs. Unique Indexes:
 Functionally similar: Both enforce uniqueness using an
underlying unique index.
 Intent:
o Use a UNIQUE CONSTRAINT when the primary goal is data
integrity.
o Use CREATE UNIQUE INDEX when the primary goal
is performance, and uniqueness is a secondary benefit
(though it's still enforced).
 Query optimizer treats them the same.
 Creating a UNIQUE CONSTRAINT (e.g., ALTER TABLE Employees
ADD CONSTRAINT UQ_Employees_City UNIQUE (City)) will
create a unique non-clustered index behind the scenes.
3. IGNORE_DUP_KEY Option:
 Used with CREATE UNIQUE INDEX ... WITH (IGNORE_DUP_KEY =
ON).
 When inserting multiple rows and a duplicate key violation
occurs for some rows:
o IGNORE_DUP_KEY = OFF (default): The entire insert
operation fails, and no rows are inserted.
o IGNORE_DUP_KEY = ON: Only the rows that violate the
unique constraint are rejected. The non-duplicate rows are
inserted successfully. A warning is issued.
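
A side-by-side sketch of the two IGNORE_DUP_KEY settings, using two hypothetical temp tables that receive the same multi-row INSERT:

```sql
-- Hypothetical temp tables that differ only in the IGNORE_DUP_KEY setting.
CREATE TABLE #Strict  (Code INT NOT NULL);
CREATE TABLE #Lenient (Code INT NOT NULL);

CREATE UNIQUE INDEX UIX_Strict_Code  ON #Strict  (Code);   -- default: IGNORE_DUP_KEY = OFF
CREATE UNIQUE INDEX UIX_Lenient_Code ON #Lenient (Code) WITH (IGNORE_DUP_KEY = ON);

-- OFF: the duplicate 2 makes the whole statement fail, so no rows are kept.
BEGIN TRY
    INSERT INTO #Strict (Code) VALUES (1), (2), (2), (3);
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();
END CATCH;

-- ON: the duplicate 2 is discarded with a warning; 1, 2 and 3 are inserted.
INSERT INTO #Lenient (Code) VALUES (1), (2), (2), (3);

SELECT (SELECT COUNT(*) FROM #Strict)  AS StrictRows,    -- 0
       (SELECT COUNT(*) FROM #Lenient) AS LenientRows;   -- 3

DROP TABLE #Strict, #Lenient;
```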
4. Impact of Indexes on DML Operations (INSERT, UPDATE, DELETE):
 Slower DML: While indexes speed up reads, they can slow
down data modifications.
o INSERT: New rows must be added to the table AND to
every non-clustered index. The row's position in the
clustered index must be determined.
o DELETE: Rows must be removed from the table AND from
every non-clustered index.
o UPDATE:
 If an indexed column is updated: The old index entry
must be removed, and a new one inserted in all
relevant indexes.
 If a clustered index key is updated: The entire row
might physically move to a new location in the table,
which is very costly (like a DELETE then an INSERT).
This also means all non-clustered indexes pointing to
that row must update their clustered key pointer.
 Page Splits:
o Occur when a data page (or index page) is full, and a new
row needs to be inserted into that page (to maintain sort
order for clustered indexes, or index key order for non-
clustered).
o SQL Server allocates a new page, moves about half the
rows from the full page to the new page, and then inserts
the new row.
o Frequent page splits lead to fragmentation and degrade
performance.
o More impactful with clustered indexes due to physical
data movement.
o Fill Factor can be configured to leave empty space on
pages to reduce page splits.
5. Indexes and GROUP BY Clause:
 GROUP BY often requires sorting the data by the grouping
columns before aggregation.
 Benefit of Index: An index on the GROUP BY column(s) can
allow SQL Server to skip the explicit sort step, as the data can
be read in the required order directly from the index.
 Covering Index for GROUP BY: If an index contains all columns
used in the GROUP BY clause AND any columns used in
aggregate functions (e.g., SUM(QuantitySold)), it can be highly
beneficial.
 GROUP BY Algorithms:
o Hash Aggregate: Builds a hash table in memory to store
groups and their aggregate values. Does not require
sorted input but needs to materialize intermediate results.
o Stream Aggregate (Sort/Group): Requires input data to be
sorted by the grouping columns. If an index provides this
order, it's very efficient (pipelined). Otherwise, an explicit
Sort operator is added.

Practical Considerations & Summary Comparison

1. When to Create Indexes:


 Columns frequently used in WHERE clauses (especially with
high selectivity).
 Columns used in JOIN conditions (ON clause).
 Columns frequently used in ORDER BY clauses (can avoid
sorts).
 Columns frequently used in GROUP BY clauses (can avoid
sorts).
 Foreign Key columns are often good candidates.
2. When to be Cautious / Avoid Over-Indexing:
 Tables with high DML activity (many Inserts, Updates,
Deletes): Each index adds overhead.
 Columns with low cardinality (few unique values): E.g., a
Gender column with 'M', 'F', 'U'. An index might not be much
better than a table scan.
 Small tables: The overhead of index maintenance might
outweigh the benefit; a table scan can be faster.
 Avoid indexing every column: This leads to excessive DML
overhead and disk space usage.
3. Clustered vs. Non-Clustered Index – Which is Faster?
 Clustered Index is generally faster for:
o Range queries on the clustered key (e.g., WHERE ID
BETWEEN 100 AND 200). Data is physically contiguous.
o Queries retrieving large amounts of data sorted by the
clustered key.
o Queries that involve a direct lookup on the clustered key,
as the data is at the leaf level.
 Non-Clustered Index can be faster for:
o Queries that are "covered" by the non-clustered index (all
required columns are in the index).
o Exact matches on highly selective non-clustered index
keys where only a few columns are needed.
 Non-clustered indexes involve an extra step (Key Lookup or
RID Lookup) if the query is not covered.

4. Key Differences: Clustered vs. Non-Clustered Index


Feature | Clustered Index | Non-Clustered Index
--- | --- | ---
Number per Table | One | Up to 999
Data Storage | Physically orders data rows in the table. Leaf nodes are the data. | Separate logical structure. Leaf nodes contain index keys and pointers to data rows.
Disk Space | Does not require additional disk space for data (it is the data). B-Tree structure above leaf takes space. | Requires additional disk space for the index structure.
Row Locator in Leaf | N/A (Leaf contains the data row itself) | Row ID (RID) if table is a HEAP, or Clustered Index Key if table has a CI.
Primary Key Default | Creates a Unique Clustered Index by default (if none exists). | N/A (Unique Constraint creates a Unique Non-Clustered Index by default).
Effect on Table | Dictates physical sort order of the table. | Does not affect physical order of table data.
Pointer in other NCIs | Its key is used as the row locator in other Non-Clustered Indexes on the same table. | N/A

5. General Advantages of Indexes:
• Faster record searching: For SELECT, UPDATE, DELETE via WHERE clauses.
• Faster sorting: For ORDER BY clauses, potentially avoiding a sort operation.
• Faster grouping: For GROUP BY clauses, potentially avoiding a sort.
• Enforcing uniqueness: Via Unique Indexes (often with Primary Keys or Unique Constraints).
6. General Disadvantages of Indexes:
 Additional Disk Space: Non-clustered indexes consume disk
space.
 Slower DML Operations: INSERT, UPDATE, DELETE statements
become slower as indexes also need to be maintained.
 Maintenance Overhead: Indexes need to be maintained (e.g.,
rebuilding/reorganizing to reduce fragmentation).
 Clustered Index Key Update Cost: Updating a column that is
part of the clustered index key can be very expensive as the
row might need to physically move, and all non-clustered
indexes must update their pointers.
SQL Server Built-in Functions
Introduction & Core Character Functions

I. Overview of Functions in SQL Server


 Two Main Types:
1. Built-in Functions: Pre-defined code by SQL Server for
common tasks (e.g., string manipulation, calculations).
2. User-Defined Functions (UDFs): Functions created by users
for specific business logic.
 What are Built-in Functions?
o Pieces of code that take zero or more inputs (parameters).
o Always return a value.
o Can be used anywhere expressions are allowed
(e.g., SELECT list, WHERE clause).

II. Common Built-in String Functions


1. ASCII(Character_Expression)
o Purpose: Returns the ASCII (integer) code of the first
character in the expression.
o Example: SELECT ASCII('A')
 Output: 65
o Key Use: Comparing characters, case-sensitive
(e.g., ASCII('A') is 65, ASCII('a') is 97).
o Example (Case Sensitivity):
SELECT ASCII('A') AS UpperCase, ASCII('a') AS LowerCase
 Output: UpperCase: 65, LowerCase: 97
2. CHAR(Integer_Expression)
o Purpose: Converts an integer ASCII code to its
corresponding character. Opposite of ASCII().
o Constraint: Integer_Expression must be between 0 and
255.
o Example: SELECT CHAR(65)
 Output: A
3. LTRIM(Character_Expression)
o Purpose: Removes leading blanks (spaces on the left-hand
side).
o Syntax: LTRIM(Character_Expression)
o Example: SELECT LTRIM(' Hello')
 Output: Hello (leading spaces removed)
4. RTRIM(Character_Expression)
o Purpose: Removes trailing blanks (spaces on the right-
hand side).
o Syntax: RTRIM(Character_Expression)
o Example: SELECT RTRIM('Hello ')
 Output: Hello (trailing spaces removed)
5. Trimming Both Sides:
o Method: Nest LTRIM and RTRIM.
o Example: SELECT LTRIM(RTRIM(' Hello '))
 Output: Hello

More String Manipulation Functions

1. LOWER(Character_Expression)
o Purpose: Converts all uppercase characters in an
expression to lowercase.
o Example: SELECT LOWER('CONVERT This String Into Lower
Case')
 Output: convert this string into lower case
2. UPPER(Character_Expression)
o Purpose: Converts all lowercase characters in an
expression to uppercase.
o Example: SELECT UPPER('CONVERT This String Into
upperCase')
 Output: CONVERT THIS STRING INTO UPPERCASE
3. REVERSE(String_Expression)
o Purpose: Returns the character string in reverse order.
o Example: SELECT REVERSE('ABCDE')
 Output: EDCBA
4. LEN(String_Expression)
o Purpose: Returns the number of characters in the string
expression.
o Crucial Note: Excludes trailing blanks, but includes leading
blanks.
o Example: SELECT LEN(' Functions   ')
▪ Output: 10 (counts ' Functions', which is 1 leading space + 9 letters = 10 characters; the 3 trailing spaces are ignored)
o Example 2: SELECT LEN('    Functions')
▪ Output: 13 (the 4 leading spaces are counted: 4 + 9 = 13)
5. LEFT(Character_Expression, Integer_Expression)
o Purpose: Returns the left part of a character string with
the specified number of characters.
o Example: SELECT LEFT('ABCDEF', 3)
 Output: ABC
6. RIGHT(Character_Expression, Integer_Expression)
o Purpose: Returns the right part of a character string with
the specified number of characters.
o Example: SELECT RIGHT('ABCDEF', 3)
 Output: DEF
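
Because LEN ignores trailing blanks, it can help to compare it with DATALENGTH, which returns the number of bytes stored. The sketch below builds the padded string with SPACE() so the blanks are explicit; the variable name is arbitrary.

```sql
-- One leading space, the 9-letter word, then three trailing spaces.
DECLARE @s VARCHAR(20) = ' Functions' + SPACE(3);

SELECT LEN(@s)               AS CharCount,   -- 10: trailing spaces ignored
       DATALENGTH(@s)        AS ByteCount,   -- 13: every stored byte counted
       LEFT(LTRIM(@s), 4)    AS FirstFour,   -- Func
       RIGHT(RTRIM(@s), 4)   AS LastFour;    -- ions
```

For a VARCHAR value each character is one byte, so the gap between the two numbers is exactly the count of trailing spaces. On SQL Server 2017 and later, TRIM() removes both leading and trailing spaces in one call, as an alternative to nesting LTRIM and RTRIM.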

String Searching & Substring Functions, Intro to OVER


Clause

1. CHARINDEX(Expression_To_Find, Expression_To_Search [,
Start_Location])
o Purpose: Returns the starting position (1-based index) of
the Expression_To_Find within Expression_To_Search.
o Start_Location: Optional. Specifies the position
in Expression_To_Search where the search begins.
o Returns: Integer. 0 if not found.
o Example: SELECT CHARINDEX('@', 'sara@[Link]', 1)
 Output: 5
o Example (not found): SELECT CHARINDEX('Z',
'sara@[Link]')
 Output: 0
2. SUBSTRING(Expression, Start, Length)
o Purpose: Extracts a part of a string (substring)
from Expression.
o Start: Starting position (1-based index).
o Length: Number of characters to extract.
o All 3 parameters are mandatory.
o Example 1 (Simple): SELECT
SUBSTRING('info@[Link]', 6, 19)
 Output: [Link]
o Example 2 (Dynamic - Get domain from email):
SELECT SUBSTRING('info@[Link]',
    CHARINDEX('@', 'info@[Link]') + 1,
    LEN('info@[Link]') - CHARINDEX('@', 'info@[Link]')
)
 Output: [Link]
 Explanation:
 CHARINDEX('@', ...): Finds position of '@'.
 + 1: Start after the '@'.
 LEN(...) - CHARINDEX('@', ...): Calculates length
of the domain part.
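
The same CHARINDEX + SUBSTRING combination can be applied to a column instead of a literal. In this sketch the @Contacts table and its addresses are made up, and a CASE guard is added (an addition, not part of the original example) so that values without '@' return NULL instead of the whole string.

```sql
-- Hypothetical contact list; addresses are placeholders.
DECLARE @Contacts TABLE (Email VARCHAR(100));

INSERT INTO @Contacts VALUES ('sara@example.com'), ('info@example.org'), ('no-at-sign');

SELECT Email,
       CASE WHEN CHARINDEX('@', Email) > 0
            THEN SUBSTRING(Email,
                           CHARINDEX('@', Email) + 1,               -- start after '@'
                           LEN(Email) - CHARINDEX('@', Email))      -- length of the domain
       END AS Domain
FROM @Contacts;
-- example.com, example.org, NULL
```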

III. Window Functions: The OVER Clause


 Purpose: The OVER clause defines a "window" or a set of rows
within a query result set. Window functions then operate on
this set of rows.
 Key Component: PARTITION BY
o Divides the result set into partitions (groups).
o The window function is applied to each partition
independently.
o Example Concept: COUNT(Department) OVER (PARTITION
BY Department)
 This would create partitions for each
unique Department.
 COUNT() would then count rows within each
department partition.
• Common Functions Used with OVER: COUNT(), SUM(), AVG(), MIN(), MAX(), ROW_NUMBER(), RANK(), DENSE_RANK().

OVER Clause for Aggregations (vs. GROUP BY)

The Problem with GROUP BY for Mixed Detail & Aggregate Data:
 If you use GROUP BY to get aggregate values
(e.g., SUM(Salary) GROUP BY Department), you cannot directly
select non-aggregated columns (e.g., EmployeeName) unless
they are also in the GROUP BY clause (which often changes the
meaning of the aggregation).
Solution: OVER Clause with Aggregate Functions
 Allows you to display both aggregated values (calculated over
a partition) and non-aggregated (detail) row values in the
same result set.
 Scenario: Display each employee's details along with
department-level aggregates (Total Employees, Total Salary,
Avg Salary, Min Salary, Max Salary for their department).
 Achieving with GROUP BY (Less Ideal for this scenario):
o Requires a subquery for aggregates, then JOIN back to the
main table.
o Example (conceptual, based on text):
SELECT
    E.Name, E.Salary, E.Department,
    DeptAgg.TotalEmployees, DeptAgg.TotalSalary,
    DeptAgg.AvgSalary, ...
FROM Employees E
INNER JOIN (
    SELECT
        Department,
        COUNT(*) AS TotalEmployees,
        SUM(Salary) AS TotalSalary,
        AVG(Salary) AS AvgSalary,
        ...
    FROM Employees
    GROUP BY Department
) AS DeptAgg ON E.Department = DeptAgg.Department;
o This is more complex and often less performant.
 Achieving with OVER (PARTITION BY) (Preferred):
o More concise and often more efficient.
o Example:
SELECT
    Name,
    Salary,
    Department,
    COUNT(*) OVER (PARTITION BY Department) AS DeptTotalEmployees,
    SUM(Salary) OVER (PARTITION BY Department) AS DeptTotalSalary,
    AVG(Salary) OVER (PARTITION BY Department) AS DeptAvgSalary,
    MIN(Salary) OVER (PARTITION BY Department) AS DeptMinSalary,
    MAX(Salary) OVER (PARTITION BY Department) AS DeptMaxSalary
FROM Employees;
o How it works: For each employee row, the aggregate
functions calculate their values based on all rows
belonging to that employee's Department partition.
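
A runnable sketch of the same idea on sample data. The @Employees rows are hypothetical, and the PctOfDept column is an extra illustration of mixing a detail value with a partition aggregate in one expression.

```sql
-- Hypothetical sample data.
DECLARE @Employees TABLE (Name VARCHAR(50), Department VARCHAR(20), Salary INT);

INSERT INTO @Employees VALUES
    ('Asha', 'IT', 5000),
    ('Ben',  'IT', 4000),
    ('Chen', 'HR', 3000);

SELECT Name,
       Department,
       Salary,
       COUNT(*)    OVER (PARTITION BY Department) AS DeptTotalEmployees,  -- IT: 2, HR: 1
       SUM(Salary) OVER (PARTITION BY Department) AS DeptTotalSalary,     -- IT: 9000, HR: 3000
       -- Each row's share of its department total; 100.0 forces decimal division.
       CAST(Salary * 100.0 / SUM(Salary) OVER (PARTITION BY Department)
            AS DECIMAL(5, 2)) AS PctOfDept
FROM @Employees;
```

Every employee row is kept, which is the difference from GROUP BY: the aggregate is repeated on each row of its partition rather than collapsing the rows.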

ROW_NUMBER() Window Function

ROW_NUMBER() Function
 Purpose: Assigns a sequential integer to each row within its
partition, starting from 1.
 Syntax:
ROW_NUMBER() OVER ([PARTITION BY value_expression1
[, ...n]] ORDER BY order_by_clause)
 PARTITION BY value_expression (Optional):
o Divides the result set into partitions. ROW_NUMBER() is
applied independently to each partition (i.e., numbering
restarts at 1 for each new partition).
o If omitted, the entire result set is treated as a single
partition.
 ORDER BY order_by_clause (Mandatory):
o Defines the logical order of rows within each partition,
determining how the sequential numbers are assigned.
o Error if omitted: "The function 'ROW_NUMBER' must have
an OVER clause with ORDER BY."
 Example 1: ROW_NUMBER() without PARTITION BY
o Treats the whole result set as a single group. Assigns consecutive numbers based on the ORDER BY clause.
SELECT Name, Department, Salary,
    ROW_NUMBER() OVER (ORDER BY Department) AS RowNum
FROM Employees;
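
Adding PARTITION BY restarts the numbering for each group. A common use, sketched here on hypothetical data, is picking one row per group by filtering on the generated number:

```sql
-- Hypothetical sample data.
DECLARE @Employees TABLE (Name VARCHAR(50), Department VARCHAR(20), Salary INT);

INSERT INTO @Employees VALUES
    ('Asha', 'IT', 5000),
    ('Ben',  'IT', 4000),
    ('Chen', 'HR', 3000),
    ('Dara', 'HR', 3500);

WITH Numbered AS (
    SELECT Name, Department, Salary,
           -- Numbering restarts at 1 per department; Name breaks salary ties.
           ROW_NUMBER() OVER (PARTITION BY Department
                              ORDER BY Salary DESC, Name) AS RowNum
    FROM @Employees
)
SELECT Name, Department, Salary
FROM Numbered
WHERE RowNum = 1;   -- highest-paid row per department: Dara (HR), Asha (IT)
```

The window function cannot be referenced directly in WHERE, which is why the numbering is computed in a common table expression first.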
RANK(), DENSE_RANK(), and Their Differences

RANK() and DENSE_RANK() Functions


 Purpose: Both assign a rank (sequential number starting from
1) to each row within its partition, based on the ORDER
BY clause.
 Key Behavior with Ties: If two or more rows have the same
value in the ORDER BY columns, they receive the same rank.
The difference lies in how the next rank is assigned.
1. RANK() Function
 Syntax: RANK() OVER ([PARTITION BY value_expression] ORDER
BY order_by_clause)
 PARTITION BY (Optional): Divides data; ranking is per partition.
If omitted, entire set is one partition.
 ORDER BY (Mandatory): Determines ranking order.
 Tie Handling: Assigns the same rank to tied rows. Skips the
next rank(s).
o Example: 1, 1, 3, 4 (rank 2 is skipped).
 Example: RANK() without PARTITION BY (ranking by Salary
DESC)
SELECT Name, Department, Salary,
    RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
FROM Employees;
 RANK() with PARTITION BY: Ranking restarts for each partition,
still skipping on ties within the partition.

2. DENSE_RANK() Function
 Syntax: DENSE_RANK() OVER ([PARTITION BY value_expression]
ORDER BY order_by_clause)
 PARTITION BY (Optional): Divides data; ranking is per partition.
 ORDER BY (Mandatory): Determines ranking order.
 Tie Handling: Assigns the same rank to tied rows. Does NOT
skip the next rank. Ranks are consecutive.
o Example: 1, 1, 2, 3 (no ranks are skipped).
 Example: DENSE_RANK() without PARTITION BY (ranking by
Salary DESC)
SELECT Name, Department, Salary,
       DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseSalaryRank
FROM Employees;
 DENSE_RANK() with PARTITION BY: Ranking restarts for each
partition, no skipping on ties within the partition.

Key Difference: RANK() vs. DENSE_RANK()


 The ONLY difference is how they handle ranks after ties:
o RANK(): Skips ranks after ties. (e.g., 1, 1, 3)
o DENSE_RANK(): Does NOT skip ranks after ties; ranks are
always consecutive. (e.g., 1, 1, 2)
When to use which:
 Use RANK() if you need to see the "gap" created by ties: each rank
equals one plus the number of rows that sort strictly ahead of it.
 Use DENSE_RANK() if you want a continuous sequence of
ranks, regardless of ties (often preferred for top-N per group
when N counts distinct values, e.g. the top 3 distinct salaries).
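To see the three ranking functions side by side, here is a minimal sketch. The four rows are hypothetical sample data supplied inline through a VALUES list, so no table is needed.

```sql
-- Hypothetical sample data; Ann and Bob tie on Salary.
SELECT Name, Salary,
       ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum,
       RANK()       OVER (ORDER BY Salary DESC) AS SalaryRank,
       DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseSalaryRank
FROM (VALUES ('Ann', 9000), ('Bob', 9000), ('Cid', 8000), ('Dee', 7000))
     AS e(Name, Salary);
-- RANK gives 1, 1, 3, 4 (rank 2 is skipped after the tie).
-- DENSE_RANK gives 1, 1, 2, 3 (no gap).
-- ROW_NUMBER gives 1, 2, 3, 4, but which tied row gets 1 is not guaranteed.
```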
SQL Server Stored Procedures - Complete Interview Guide
Page 1: Fundamentals and Why We Need Stored Procedures
What Happens When SQL Executes (3 Steps)

1. Syntax Checked: Validates query syntax for errors


2. Plan Selected: Chooses optimal execution plan based on indexes and table structure
3. Query Execution: Executes the query and returns results

Why Stored Procedures Are Needed

 Performance Optimization: The first execution goes through all 3 steps; subsequent
executions can reuse the cached plan and skip steps 1-2
 Execution Plan Caching: Plan is stored in memory after first execution and reused for
future calls (until it is evicted from the cache or invalidated)
 Reduced Processing: No repeated syntax checking or plan generation

What is a Stored Procedure?

 Definition: Database object containing a saved group of T-SQL statements; it is
compiled on first execution and its plan is cached (often loosely called "pre-compiled")
 Structure: Block of code designed to perform specific tasks when called
 Storage: Physically stored on server as database object, accessible from anywhere

Basic Syntax Structure


CREATE PROCEDURE ProcedureName
@Parameter1 DataType,
@Parameter2 DataType OUTPUT
AS
BEGIN
-- Procedure body (T-SQL statements)
END

Two Main Parts

1. Procedure Header: Everything above "AS" keyword (name, parameters)


2. Procedure Body: Everything below "AS" keyword (actual T-SQL code)

Execution Methods

1. EXEC ProcedureName
2. EXECUTE ProcedureName
3. Right-click in Object Explorer → Execute Stored Procedure

Important Naming Convention

 Avoid "sp_" prefix: Reserved for system procedures


 Reason: Prevents conflicts with system procedures and ambiguity

Page 2: Parameters - Input, Output, and Default Values


Input Parameters

 Purpose: Bring values into procedure for execution


 Default Behavior: All parameters are input by default
 Example:
CREATE PROC spAddNumbers
@Num1 INT,
@Num2 INT
AS
BEGIN
PRINT @Num1 + @Num2
END

Parameter Passing Rules

1. Order Matters: Must pass values in declared order unless using parameter names
2. Named Parameters: Can pass in any order when specifying names
o EXEC spProc @Param2=Value2, @Param1=Value1
3. Error Prevention: Use parameter names to avoid type conversion errors

Output Parameters

 Declaration: Use OUT or OUTPUT keyword


 Purpose: Return values from procedure after execution
 Key Points:
o Must assign value inside procedure
o Can return any data type (unlike return values)
o Multiple output parameters allowed

Output Parameter Example


CREATE PROC spCalculate
@Num1 INT,
@Num2 INT,
@Result INT OUTPUT
AS
BEGIN
SET @Result = @Num1 + @Num2
END

-- Execution
DECLARE @Total INT
EXEC spCalculate 10, 20, @Total OUTPUT
PRINT @Total -- Prints 30

Default Values

 Syntax: Assign default value during parameter declaration


 Usage: Parameter becomes optional when default provided
 Example: @Parameter INT = 100
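
A minimal sketch of a default value in use, together with the positional and named calling styles from the parameter passing rules. spGreet and its parameters are hypothetical names; GO is the batch separator used by SSMS and sqlcmd, not a T-SQL statement.

```sql
-- spGreet is a hypothetical procedure used only for illustration.
CREATE PROC spGreet
    @Name  VARCHAR(50),
    @Times INT = 1              -- the default makes this parameter optional
AS
BEGIN
    DECLARE @i INT = 0;
    WHILE @i < @Times
    BEGIN
        PRINT 'Hello, ' + @Name;
        SET @i += 1;
    END
END
GO

EXEC spGreet 'Asha';                            -- @Times falls back to 1
EXEC spGreet 'Asha', 3;                         -- positional: declared order
EXEC spGreet @Times = 2, @Name = 'Asha';        -- named: any order
EXEC spGreet @Name = 'Asha', @Times = DEFAULT;  -- explicit DEFAULT keyword
```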

Parameter Execution Rules

1. Must declare variable first for output parameters


2. Must specify OUTPUT keyword when calling
3. Without the OUTPUT keyword in the call, the caller's variable is not updated (it stays NULL if it was never assigned)
4. Can mix input and output parameters
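
Rules 2 and 3 can be checked with a short sketch that reuses spCalculate from the output parameter example; the variable names are illustrative.

```sql
-- Assumes spCalculate (@Num1, @Num2, @Result OUTPUT) already exists.
DECLARE @WithKeyword INT, @WithoutKeyword INT;

EXEC spCalculate 10, 20, @WithKeyword OUTPUT;  -- value is copied back
EXEC spCalculate 10, 20, @WithoutKeyword;      -- passed as input only

SELECT @WithKeyword    AS WithKeyword,     -- 30
       @WithoutKeyword AS WithoutKeyword;  -- NULL
```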

Page 3: Return Values and Temporary Stored Procedures


Return Values vs Output Parameters

Return Values                   | Output Parameters
Only INTEGER data type          | Any data type
Only ONE value                  | Multiple values possible
Indicate success/failure        | Return actual data
0 = Success, Non-zero = Failure | Flexible usage

Return Value Example


CREATE PROC spCountEmployees
AS
BEGIN
DECLARE @Count INT
SELECT @Count = COUNT(*) FROM Employee
RETURN @Count
END

-- Execution
DECLARE @EmpCount INT
EXEC @EmpCount = spCountEmployees

Return Value Limitations


1. Cannot return non-integer values: Causes conversion errors
2. Single value only: Cannot return multiple values
3. Best Practice: Use for status indication only
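
One way to follow that practice is to return the data through an output parameter and keep the return value for status. This is a sketch under assumptions: spGetEmployeeCount is a hypothetical name, and Employee is the table from the return value example.

```sql
-- Hypothetical procedure: data goes out via @Count, status via RETURN.
CREATE PROC spGetEmployeeCount
    @Count INT OUTPUT
AS
BEGIN
    BEGIN TRY
        SELECT @Count = COUNT(*) FROM Employee;
        RETURN 0;        -- status: success
    END TRY
    BEGIN CATCH
        RETURN 1;        -- status: failure
    END CATCH
END
GO

DECLARE @Status INT, @EmpCount INT;
EXEC @Status = spGetEmployeeCount @Count = @EmpCount OUTPUT;

IF @Status = 0
    PRINT 'Employees: ' + CAST(@EmpCount AS VARCHAR(12));
ELSE
    PRINT 'Count failed';
```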

Temporary Stored Procedures

Local Temporary Procedures (#)

 Prefix: Single hash (#) before procedure name


 Scope: Only accessible by creating connection
 Lifetime: Automatically deleted when connection closes
 Usage: CREATE PROC #TempProc

Global Temporary Procedures (##)

 Prefix: Double hash (##) before procedure name


 Scope: Accessible by all connections
 Lifetime: Available until creating connection closes
 Behavior: Other connections can complete execution even after creator disconnects
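
A minimal sketch of a local temporary procedure; #TempGreeting is a hypothetical name. A global one would differ only in the ## prefix.

```sql
-- Visible only to the connection that creates it.
CREATE PROC #TempGreeting
AS
BEGIN
    PRINT 'Hello from a temporary procedure';
END
GO

EXEC #TempGreeting;

-- Optional: it would also be dropped automatically when the connection closes.
DROP PROC #TempGreeting;
```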

When to Use Temporary Procedures

 Earlier SQL Server Versions: When execution plan reuse not supported for ad-hoc
queries
 Session-Specific Logic: For connection-specific temporary operations
 Testing: During development and testing phases

Temporary Procedure Characteristics

1. Storage: Created in tempdb database


2. Performance: Less overhead for one-time operations
3. Security: Limited scope reduces security risks
4. Cleanup: Automatic removal prevents database clutter

Page 4: System Procedures and Procedure Management


Essential System Stored Procedures

sp_help

 Purpose: View information about database objects


 Usage: sp_help ProcedureName or sp_help TableName
 Information Provided: Parameter names, data types, object details
 Shortcut: ALT+F1 when object name is highlighted
sp_helptext

 Purpose: View text/source code of procedures, functions, views


 Usage: sp_helptext ProcedureName
 Limitation: Cannot view encrypted objects
 Storage: Definition text is exposed through sys.sql_modules (syscomments is the older compatibility view)

sp_depends

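 Purpose: Show dependency relationships (deprecated in current versions; sys.dm_sql_referencing_entities and sys.dm_sql_referenced_entities are the replacements)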


 Usage: sp_depends ObjectName
 Benefits:
o Check dependencies before dropping objects
o Understand impact of changes
o Works with tables, views, procedures

Viewing Procedure Text

1. System Procedure: sp_helptext ProcedureName
2. Object Explorer: Right-click → Script Stored Procedure as → CREATE To → New Query Editor Window
3. System Table: SELECT * FROM syscomments WHERE id = OBJECT_ID('ProcName') (legacy compatibility view; sys.sql_modules is the current catalog view)
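
On current versions the same text can be read through the catalog view or a built-in function. A sketch, where 'dbo.ProcName' is a placeholder for a real procedure name:

```sql
-- 'dbo.ProcName' is a placeholder, not an object from these notes.
SELECT OBJECT_DEFINITION(OBJECT_ID('dbo.ProcName')) AS ProcText;

SELECT m.definition
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID('dbo.ProcName');
-- Both return NULL for the text if the object was created WITH ENCRYPTION.
```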

Procedure Modification

 ALTER PROCEDURE: Modify existing procedure


 Benefits: Preserves permissions and dependencies
 Syntax: Same as CREATE but use ALTER keyword

Dropping Procedures

 Syntax: DROP PROCEDURE ProcedureName


 Alternative: DROP PROC ProcedureName
 Consideration: Check dependencies first using sp_depends

Procedure Management Best Practices

1. Documentation: Use comments within procedures


2. Version Control: Track changes with ALTER statements
3. Testing: Test procedures thoroughly before deployment
4. Security: Grant minimal necessary permissions
5. Performance: Monitor execution plans and performance

Error Handling in Procedures

 TRY-CATCH Blocks: Handle runtime errors gracefully


 RETURN Statements: Exit procedure with status codes
 RAISERROR: Generate custom error messages

Page 5: Advanced Features and Interview Key Points


Encryption and Recompile Attributes

WITH ENCRYPTION

 Purpose: Encrypt procedure source code


 Effect: Text becomes unreadable in syscomments
 Usage: CREATE PROC ProcName WITH ENCRYPTION
 Result: sp_helptext shows "The text for object is encrypted"
 Use Case: Protect intellectual property in client deployments

WITH RECOMPILE

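 Purpose: Force recompilation on every execution
 Effect: New execution plan generated each time; the plan is not cached
 When Needed:
o Parameter values vary widely, so a plan cached for one value performs poorly for others
o After significant structure, index, or statistics changes, a one-time recompile (sp_recompile) is usually enough, and SQL Server recompiles automatically in many of these cases
 Caution: Use sparingly due to performance overhead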
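
The options can be compared in a short sketch. spGetEmployeesByDept is a hypothetical procedure, and the Employee columns (Name, Salary, Department) are assumed.

```sql
-- Hypothetical procedure: recompiled on every call, no plan is cached.
CREATE PROC spGetEmployeesByDept
    @Department VARCHAR(50)
WITH RECOMPILE
AS
BEGIN
    SELECT Name, Salary FROM Employee WHERE Department = @Department;
END
GO

-- Lighter alternatives for a procedure created WITHOUT the option:
EXEC spGetEmployeesByDept 'IT' WITH RECOMPILE;  -- fresh plan for this call only
EXEC sp_recompile 'spGetEmployeesByDept';       -- recompile once, on next execution
```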

Key Advantages of Stored Procedures (Interview Focus)

1. Performance Benefits

 Execution Plan Caching: Plan reused for subsequent executions


 Reduced Compilation: No repeated syntax checking
 Network Traffic Reduction: Only procedure name and parameters sent
 Memory Efficiency: Shared execution plans

2. Security Advantages

 SQL Injection Prevention: Parameters are treated as data, not code, which blocks injection as long as the procedure does not build dynamic SQL by concatenating them


 Permission Control: Grant access to procedure without underlying table access
 Data Access Control: Limit operations through procedure logic

3. Code Reusability and Maintenance

 Centralized Logic: Business rules in one location


 Multiple Application Access: Same procedure used by different applications
 Easy Updates: Change logic once, affects all callers
 Reduced Code Duplication: Eliminates repeated SQL statements

4. Better Maintainability

 Single Point of Change: Modify procedure instead of multiple queries


 Version Control: Track changes systematically
 Testing: Isolated testing of business logic

Common Interview Questions & Answers

Q: What's the difference between stored procedures and functions? A: Procedures can have
output parameters and optionally return an integer status, but cannot be used inside a
SELECT. Functions must return a value and can be used in SELECT statements.

Q: Can stored procedures return multiple result sets? A: Yes, procedures can return
multiple SELECT statements as separate result sets.

Q: What happens if you don't specify OUTPUT keyword when calling a procedure with
output parameters? A: The variable will remain NULL because the output value isn't
captured.

Q: Why avoid sp_ prefix for user procedures? A: sp_ is the prefix used by system procedures. SQL
Server looks for sp_ procedures in the master database first, causing lookup overhead and
possible name clashes.

Q: When would you use temporary procedures? A: For session-specific logic, testing, or
when working with older SQL Server versions that don't cache execution plans for ad-hoc
queries.

Best Practices Summary

1. Use meaningful names without sp_ prefix


2. Include error handling with TRY-CATCH
3. Use output parameters for returning data, return values for status
4. Document parameters and functionality
5. Test thoroughly before deployment
6. Consider security implications
7. Monitor performance and execution plans
8. Use encryption for sensitive client deployments
9. Use recompile sparingly and only when necessary
10. Use system procedures for maintenance and troubleshooting

SQL Server Functions - Complete Interview Guide
What is a Function in SQL Server?
A function is a subprogram that performs an action (like complex calculations) and returns a
result as a value. Functions can take optional parameters but must always return a value.

Types of Functions
System-Level Classification

1. System Defined Functions - Pre-built functions (e.g., SQUARE(3), GETDATE())


2. User-Defined Functions - Created by developers

User-Defined Function Types

1. Scalar Valued Functions - Return single value


2. Inline Table-Valued Functions - Return table with single SELECT
3. Multi-Statement Table-Valued Functions - Return table with multiple statements

Scalar Valued Functions


Key Points

 Returns only a single (scalar) value


 May or may not have parameters
 Return value can be any data type except: text, ntext, image, cursor, timestamp
 Must use two-part name (schema.function) when calling: SELECT dbo.FunctionName(value)

Syntax
CREATE FUNCTION FunctionName(@param datatype)
RETURNS return_datatype
AS
BEGIN
-- Function body
RETURN value
END

Usage Examples

 Can be used in SELECT clause: SELECT dbo.FunctionName(DOB) FROM Employee
 Can be used in WHERE clause: WHERE dbo.FunctionName(DOB) > 31
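
A sketch of a scalar function that fits the usage lines above. dbo.fnCalculateAge is a hypothetical name, and the Employee columns (Name, DOB) are assumed.

```sql
-- Hypothetical scalar function: age in whole years.
CREATE FUNCTION dbo.fnCalculateAge (@DOB DATE)
RETURNS INT
AS
BEGIN
    DECLARE @Today DATE = GETDATE();
    -- DATEDIFF counts year boundaries crossed, so subtract 1
    -- when this year's birthday has not happened yet.
    RETURN DATEDIFF(YEAR, @DOB, @Today)
         - CASE WHEN DATEADD(YEAR, DATEDIFF(YEAR, @DOB, @Today), @DOB) > @Today
                THEN 1 ELSE 0 END;
END
GO

SELECT Name, dbo.fnCalculateAge(DOB) AS Age
FROM Employee
WHERE dbo.fnCalculateAge(DOB) > 31;
```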

Inline Table-Valued Functions


Key Characteristics
 Returns a table as output
 Function body contains only a single SELECT statement with RETURN
 Return type specified as TABLE
 No BEGIN/END blocks
 Structure determined by the SELECT statement
 Can be used like parameterized views
 Better performance than Multi-Statement functions

Syntax
CREATE FUNCTION FunctionName(@param datatype)
RETURNS TABLE
AS
RETURN (SELECT columns FROM table WHERE condition)

Usage

 Call like a table: SELECT * FROM FN_GetStudentsByBranch('CSE')


 Can be used in JOINs with other tables
 Underlying tables can be updated through the function (like an updatable view)
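
A sketch of the function behind the usage line above. The Student table and its columns (Id, Name, Branch) are assumptions; only the function name comes from the notes.

```sql
-- Assumed table: dbo.Student(Id, Name, Branch).
CREATE FUNCTION dbo.FN_GetStudentsByBranch (@Branch VARCHAR(50))
RETURNS TABLE
AS
RETURN (SELECT Id, Name, Branch FROM dbo.Student WHERE Branch = @Branch);
GO

SELECT * FROM dbo.FN_GetStudentsByBranch('CSE');

-- A single SELECT over one table is updatable, so this changes dbo.Student:
UPDATE dbo.FN_GetStudentsByBranch('CSE')
SET Name = 'Renamed'
WHERE Id = 1;
```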

Multi-Statement Table-Valued Functions


Key Characteristics

 Returns a table but can contain multiple statements


 Must define table structure in RETURNS clause
 Requires BEGIN/END blocks
 Gets data from table variable (not directly from base tables)
 Underlying tables cannot be updated through the function (it returns a copy held in a table variable)
 Lower performance compared to Inline functions

Syntax
CREATE FUNCTION FunctionName(@param datatype)
RETURNS @TableVariable TABLE (
Column1 datatype,
Column2 datatype
)
AS
BEGIN
-- Multiple statements
INSERT INTO @TableVariable...
RETURN
END
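
A filled-in version of the skeleton above, as a sketch. FN_GetStudentSummary is a hypothetical name and dbo.Student(Id, Name, Branch) is an assumed table.

```sql
-- Hypothetical multi-statement function over an assumed Student table.
CREATE FUNCTION dbo.FN_GetStudentSummary (@Branch VARCHAR(50))
RETURNS @Result TABLE (
    Id   INT,
    Name VARCHAR(50)
)
AS
BEGIN
    INSERT INTO @Result (Id, Name)
    SELECT Id, Name FROM dbo.Student WHERE Branch = @Branch;

    -- A second statement, which an inline function could not contain.
    UPDATE @Result SET Name = UPPER(Name);

    RETURN;
END
GO

SELECT * FROM dbo.FN_GetStudentSummary('CSE');
```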

Key Differences: Inline vs Multi-Statement Table-Valued Functions
Aspect            | Inline                     | Multi-Statement
Table Structure   | Defined by SELECT          | Must define explicitly
Code Blocks       | No BEGIN/END               | Requires BEGIN/END
Update Capability | Can update base tables     | Cannot update base tables
Performance       | Better (treated like view) | Lower (treated like stored procedure)
Data Source       | Direct from base tables    | From table variable

Advanced Options
WITH ENCRYPTION

 Encrypts function text


 Cannot view function definition using sp_helptext
 Syntax: CREATE FUNCTION ... WITH ENCRYPTION

WITH SCHEMABINDING

 Binds function to referenced database objects


 Prevents dropping the referenced objects, or altering them in ways that would affect the function
 Must use two-part names for tables
 Syntax: CREATE FUNCTION ... WITH SCHEMABINDING
 Can combine both: WITH ENCRYPTION, SCHEMABINDING

Deterministic vs Non-Deterministic Functions


Deterministic Functions

 Always return same result with same input values and database state
 Examples: SQUARE(), POWER(), SUM(), AVG(), COUNT()
 All aggregate functions are deterministic unless they are specified with the OVER and ORDER BY clauses
 RAND(seed) with seed value is deterministic

Non-Deterministic Functions

 May return different results even with same inputs


 Examples: GETDATE(), CURRENT_TIMESTAMP, RAND() without seed
 Results vary with each execution

Functions vs Stored Procedures - Critical Differences


Aspect                 | Functions                      | Stored Procedures
Return Value           | Mandatory                      | Optional
Parameters             | Input only                     | Input and Output
Operations             | SELECT only                    | SELECT, INSERT, UPDATE, DELETE
Transaction Management | Not possible                   | Possible
Error Handling         | Not possible                   | Possible
Calling Method         | SELECT statement               | EXECUTE/EXEC
Usage in SQL           | Can use in WHERE/HAVING/SELECT | Cannot use in SQL statements
Calling Other Objects  | Can call functions only        | Can call both procedures and functions

Function Management Commands


 Create: CREATE FUNCTION
 Modify: ALTER FUNCTION FunctionName
 Delete: DROP FUNCTION FunctionName
 View Text: sp_helptext FunctionName

Important Interview Points


1. Functions must always return a value - this is mandatory
2. Scalar functions can be used in SELECT and WHERE clauses - stored procedures
cannot
3. Inline Table-Valued functions perform better than Multi-Statement functions
4. Only Inline Table-Valued functions can be used to update underlying tables
5. SCHEMABINDING prevents changes to referenced objects that would break the function
6. Functions can only perform SELECT operations - no INSERT/UPDATE/DELETE on database tables
7. Use two-part naming convention when calling functions: schema.FunctionName (e.g. dbo.FunctionName)
8. All aggregate functions are deterministic (unless used with OVER and ORDER BY)
9. Functions cannot perform transaction management or error handling
10. Inline functions are treated like views, Multi-Statement like stored procedures
internally

SQL Server Transaction Management - Complete Interview Guide
What is a Transaction?
A transaction is a set of SQL statements executed as one unit following the "all or
nothing" principle. Either all commands succeed or all fail with rollback. Essential
for maintaining data integrity in operations like bank transfers where multiple
related updates must all succeed together.

Transaction Management Fundamentals


Transaction management combines related operations into a single unit with clear
beginning and ending boundaries. Every transaction has two phases: beginning and
ending, and controlling these boundaries is transaction management.

Transaction Control Language (TCL) Commands


1. BEGIN TRANSACTION - Starts a transaction
2. COMMIT TRANSACTION - Saves all changes permanently to database
3. ROLLBACK TRANSACTION - Undoes all changes back to transaction
start
4. SAVE TRANSACTION - Creates savepoints for partial rollbacks

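Important: In SQL Server, TCL commands apply to DML statements (INSERT, UPDATE,
DELETE) and also to most DDL: a CREATE TABLE or DROP TABLE inside an explicit
transaction can be rolled back. DDL that auto-commits is the behavior of some other
database systems (for example Oracle), not of SQL Server.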

Error Handling with @@ERROR


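System function that returns the error number of the most recently executed statement
(0 = no error, >0 = error occurred). It is reset after every statement, so read it
immediately. Used in conditional logic to decide whether to commit or roll back transactions.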
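
The @@ERROR pattern can be sketched with the classic transfer. Accounts(Id, Balance) is a hypothetical table.

```sql
-- Hypothetical table: Accounts(Id, Balance).
DECLARE @Err INT = 0;

BEGIN TRANSACTION;

UPDATE Accounts SET Balance = Balance - 100 WHERE Id = 1;
SET @Err = @@ERROR;      -- capture at once; the next statement resets @@ERROR

IF @Err = 0
BEGIN
    UPDATE Accounts SET Balance = Balance + 100 WHERE Id = 2;
    SET @Err = @@ERROR;
END

IF @Err = 0
    COMMIT TRANSACTION;
ELSE
    ROLLBACK TRANSACTION;
```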

Three Types of Transaction Modes


1. Auto Commit Transaction Mode (Default)

 Each SQL statement is a separate transaction


 SQL Server automatically begins and ends transactions
 Developer has no control over transaction boundaries
 Failed statements automatically rollback, successful ones auto-commit

2. Implicit Transaction Mode

 Enabled with SET IMPLICIT_TRANSACTIONS ON


 SQL Server automatically begins a transaction when a qualifying statement runs (DML, and also statements such as SELECT from a table or CREATE/ALTER/DROP)
 Developer must explicitly COMMIT or ROLLBACK
 New transaction automatically starts after current one ends
 Turn off with SET IMPLICIT_TRANSACTIONS OFF
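
A sketch of implicit mode in action, assuming no transaction is open beforehand and a hypothetical Accounts(Id, Balance) table.

```sql
SET IMPLICIT_TRANSACTIONS ON;

-- The UPDATE opens a transaction without any BEGIN TRANSACTION.
UPDATE Accounts SET Balance = Balance - 100 WHERE Id = 1;
SELECT @@TRANCOUNT AS OpenTransactions;   -- 1

-- Nothing is permanent until COMMIT; here the change is undone instead.
ROLLBACK TRANSACTION;
SELECT @@TRANCOUNT AS OpenTransactions;   -- 0

SET IMPLICIT_TRANSACTIONS OFF;
```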

3. Explicit Transaction Mode

 Developer controls both beginning and ending of transactions


 Most commonly used in stored procedures, triggers, and applications
 Requires explicit BEGIN TRANSACTION and COMMIT/ROLLBACK
statements
 Provides full control over transaction boundaries

Nested Transactions
 Transactions can be placed within other transactions
 Inner commits don't physically commit data (only outer commit does)
 @@TRANCOUNT global variable tracks number of open transactions
 Inner commits just decrement transaction count
 Only outer transaction commit actually saves data permanently
 Can assign names to transactions for better readability
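
The counter behavior can be observed directly. This sketch assumes no transaction is open when it starts; OuterTran and InnerTran are illustrative names.

```sql
BEGIN TRANSACTION OuterTran;
    SELECT @@TRANCOUNT AS AfterOuterBegin;       -- 1
    BEGIN TRANSACTION InnerTran;
        SELECT @@TRANCOUNT AS AfterInnerBegin;   -- 2
    COMMIT TRANSACTION InnerTran;                -- only decrements the counter
    SELECT @@TRANCOUNT AS AfterInnerCommit;      -- 1
COMMIT TRANSACTION OuterTran;                    -- work becomes permanent here
SELECT @@TRANCOUNT AS AfterOuterCommit;          -- 0
```

A plain ROLLBACK TRANSACTION at any nesting level undoes everything back to the outermost BEGIN and sets @@TRANCOUNT to 0.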

SavePoints (Partial Rollbacks)


 Created with SAVE TRANSACTION savepoint_name
 Allow rolling back to specific points within a transaction
 Enable partial rollbacks instead of full transaction rollback
 Savepoint names limited to 32 characters
 Multiple savepoints with same name possible (rollback goes to latest)
 Useful for complex transactions with multiple logical steps
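
A minimal savepoint sketch. Customer and Address are hypothetical tables, and AfterCustomer is an illustrative savepoint name.

```sql
-- Hypothetical tables: Customer(Id, Name), Address(CustomerId, City).
BEGIN TRANSACTION;

INSERT INTO Customer (Id, Name) VALUES (1, 'Asha');

SAVE TRANSACTION AfterCustomer;

INSERT INTO Address (CustomerId, City) VALUES (1, 'Pune');

-- Undo only the work done after the savepoint; the transaction stays open.
ROLLBACK TRANSACTION AfterCustomer;

COMMIT TRANSACTION;   -- the Customer row is saved, the Address row is not
```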

ACID Properties
Atomicity

 All DML statements in transaction succeed or all fail


 No partial execution allowed
 Database maintains consistent state by rolling back failed transactions

Consistency
 Database remains in consistent state before and after transaction
 Transaction must follow all database rules and constraints
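 A statement that violates a constraint is rolled back (the whole transaction as well when XACT_ABORT is ON or the error handler issues ROLLBACK)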

Isolation

 Intermediate transaction states invisible to other transactions


 Data modifications isolated from other concurrent transactions
 Implemented through locking mechanisms
 Prevents dirty reads and maintains data integrity

Durability

 Committed changes are permanent and survive system failures


 Data persists through power failures, crashes, or other system issues
 Transaction logs ensure recovery capability

Key Interview Points to Remember


1. Transaction Purpose: Data integrity, consistency, and handling database
errors
2. Thumb Rule: Either all statements execute successfully or none execute
3. TCL Scope: Works with DML and, in SQL Server, with most DDL as well
4. Error Handling: Check @@ERROR immediately after each statement (or use TRY-CATCH) for transaction control decisions
5. Nested Behavior: Only outer commits are physical, inner commits just
decrement counters
6. SavePoint Usage: Enables granular rollback control within transactions
7. ACID Compliance: SQL Server follows ACID properties by default
8. Real-world Applications: Banking systems, inventory management, any
multi-step operations requiring consistency

Common Interview Scenarios


 Money transfer between accounts (classic example)
 Customer and address insertion (both must succeed)
 Inventory updates with sales recording
 Multi-table updates requiring consistency
 Error recovery in stored procedures

Best Practices
 Always include error handling in explicit transactions
 Use meaningful savepoint names
 Keep transactions as short as possible
 Avoid user interaction within transactions
 Use appropriate transaction isolation levels
 Test rollback scenarios thoroughly

SQL Server Exception Handling - Complete Interview Guide
Why Exception Handling is Needed
Key Problem: In SQL Server, for many statement-level errors the engine displays the error
message but continues executing the subsequent statements in the batch. This can confuse
users because they might see both error messages and incorrect results.

Example: A division by zero error still shows "RESULT IS: 0" after the error,
which shouldn't happen.
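
The behavior can be reproduced with a short batch. This is a sketch that assumes default SSMS session settings; the variable names are illustrative.

```sql
DECLARE @No1 INT = 10, @No2 INT = 0, @Result INT = 0;

-- The division fails (Msg 8134, divide by zero) and @Result keeps its old value.
SET @Result = @No1 / @No2;

-- Without TRY-CATCH this statement typically still runs after the error.
PRINT 'RESULT IS: ' + CAST(@Result AS VARCHAR(12));
```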

SQL Server vs Programming Languages:

 Programming Languages (C#, Java): Program terminates abnormally when
exception occurs
 SQL Server: Displays error but continues execution (problematic behavior)
 Goal: Stop execution of error-related statements while allowing unrelated
statements to continue

Pre-2005 Exception Handling Methods


RAISERROR System Function
RAISERROR('Error Message', ErrorSeverity, ErrorState)

Three Parameters:
1. Error Message: Custom message to display (max 2047 characters)
2. Error Severity: Set to 16 for general user-correctable errors
3. Error State: Integer from 0 through 255

@@ERROR System Function

 Returns NON-ZERO value if error exists


 Returns ZERO if previous statement executed successfully
 Used in SQL Server 2000 for error detection

Error Attributes in SQL Server


Every error has four attributes:

1. Error Number: Unique identifier (<50,000 for predefined, ≥50,000 for user-defined)
2. Error Message: Brief description of the error
3. Severity Level: Importance level (0-24)
4. Error State: Arbitrary value (0-255)

Severity Level Categories:

 0-10: Informational/status messages


 11-16: User-correctable errors
 17-19: Software errors (report to system admin)
 20-24: Fatal errors (connection terminates immediately)

Explicit Error Raising Methods


1. RAISERROR Statement
RAISERROR (errorid/errormsg, SEVERITY, state) [WITH LOG]

2. THROW Statement (SQL Server 2012+)


THROW errorid, errormsg, state

Key Differences Between RAISERROR and THROW:

Aspect                 | RAISERROR                                 | THROW
Execution Continuation | Continues after error (without try-catch) | Terminates abnormally
Severity Level         | Can specify custom severity               | Default severity 16
Logging                | Can use WITH LOG option                   | Cannot log to server log
Parameters             | Can specify either ID or message          | Must specify both ID and message
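
The first row of the comparison can be demonstrated with two batches run outside any TRY-CATCH. A sketch; the message texts and error number 50001 are illustrative.

```sql
-- RAISERROR with severity 16: the next statement still runs.
RAISERROR('Raised with RAISERROR', 16, 1);
PRINT 'Runs after RAISERROR';
GO

-- THROW: the batch ends at the THROW, so the last PRINT never runs.
-- The statement before THROW must end with a semicolon.
PRINT 'About to throw';
THROW 50001, 'Raised with THROW', 1;
PRINT 'Never reached';
```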

RAISERROR Advanced Options


WITH LOG Option

 Records error in SQL Server log file


 Mandatory when severity is 19 or higher
 Useful for database administrators to track fatal errors

Substitutional Parameters
RAISERROR ('THE NUMBER %d CANNOT BE DIVIDED BY %d', 16, 1, @No1, @No2) WITH LOG

Using the sys.messages Catalog View

 Store custom error messages using SP_ADDMESSAGE


 Reference by error ID in RAISERROR
 Delete messages using SP_DROPMESSAGE
EXEC sp_Addmessage 51000, 16, 'DIVIDE BY ONE ERROR ENCOUNTERED'
RAISERROR (51000, 16, 1) WITH LOG
EXEC sp_dropMessage 51000

TRY-CATCH Blocks (SQL Server 2005+)


Structure
BEGIN TRY
-- Statements that might throw exceptions
END TRY
BEGIN CATCH
-- Error handling code
END CATCH

Behavior:

 No Error: CATCH block skipped, execution continues after CATCH


 Error Occurs: Control immediately jumps to CATCH block from error line
 Important: Errors trapped by CATCH are NOT returned to calling
application unless explicitly raised

System Functions in CATCH Block:

 ERROR_MESSAGE(): Returns the error message text


 ERROR_NUMBER(): Returns the error number
 ERROR_SEVERITY(): Returns the severity level
 ERROR_STATE(): Returns the error state
 ERROR_LINE(): Returns the line number where error occurred
 ERROR_PROCEDURE(): Returns the name of the procedure where error
occurred
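
A sketch that puts the structure and the error functions together, using the division-by-zero case from earlier in this guide. The variable names are illustrative.

```sql
BEGIN TRY
    DECLARE @No1 INT = 10, @No2 INT = 0;
    SELECT @No1 / @No2 AS Result;          -- raises error 8134
    PRINT 'Skipped: control has already moved to CATCH';
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()    AS ErrorNumber,
           ERROR_SEVERITY()  AS ErrorSeverity,
           ERROR_STATE()     AS ErrorState,
           ERROR_LINE()      AS ErrorLine,
           ERROR_PROCEDURE() AS ErrorProcedure,  -- NULL outside a procedure or trigger
           ERROR_MESSAGE()   AS ErrorMessage;

    -- Uncomment to pass the original error on to the caller (SQL Server 2012+):
    -- THROW;
END CATCH
```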

Interview Key Points


When to Use Each Method:

 Pre-2005: Use RAISERROR and @@ERROR


 2005+: Prefer TRY-CATCH blocks (similar to modern programming
languages)
 2012+: THROW statement available as alternative to RAISERROR

Best Practices:

1. Always handle division by zero scenarios


2. Use appropriate severity levels
3. Provide meaningful error messages
4. Consider logging critical errors
5. Don't let error-related statements execute after exceptions
6. Use TRY-CATCH for structured error handling

Common Interview Scenarios:

 Explain difference between SQL Server and programming language error
handling
 Demonstrate RAISERROR vs THROW differences
 Show TRY-CATCH implementation
 Explain severity levels and their significance
 Discuss error logging strategies

System Tables:
 sys.messages: Contains all predefined error information
 Use for reference and understanding error structures

SQL Server Views - Complete Interview Guide


What is a View?
 Definition: A stored SQL query that acts as a virtual table
 Nature: Logical/virtual object (not physical like tables)
 Data Storage: Does not store data physically by default (except indexed views)
 Function: Acts as an interface between tables and users
 Dependency: Views are dependent objects - they rely on underlying tables
Key Differences: Tables vs Views
 Physical vs Virtual: Tables are physical, views are logical
 Independence: Tables are independent, views depend on base tables
 Data Synchronization: Changes in tables reflect in views and vice versa
 Storage: Tables store data, views store queries

Types of Views
1. Simple Views (Updatable Views)
 Based on: Single table
 DML Operations: All operations allowed (SELECT, INSERT, UPDATE, DELETE)
 Also called: Updatable views or dynamic views
 Example: CREATE VIEW vwAllEmployees AS SELECT * FROM Employee
2. Complex Views
 Based on: Multiple tables OR single table with special conditions
 DML Limitations: May not perform DML operations correctly
 What makes a view complex:
o Multiple tables (JOINs)
o DISTINCT clause
o Aggregate functions
o GROUP BY clause
o HAVING clause
o Calculated columns
o Set operations

DML Operations on Views


Simple Views
 SELECT: SELECT * FROM viewName
 INSERT: INSERT INTO viewName VALUES(...)
 UPDATE: UPDATE viewName SET column = value WHERE condition
 DELETE: DELETE FROM viewName WHERE condition

Complex Views
 Single table update: May succeed, but the change lands in the base table and can affect more view rows than intended (for example, renaming a department shared by many employees)
 Multiple table update: Fails with error "View or function 'viewName' is not updatable because the modification affects multiple base tables"
 Solution: Use INSTEAD OF triggers for proper updates

View Options and Features


WITH CHECK OPTION
 Purpose: Prevents DML operations that violate the view's WHERE condition
 Usage: ALTER VIEW viewName AS SELECT ... WHERE condition WITH CHECK OPTION
 Effect: Ensures all inserted/updated records satisfy the view's filter condition
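
A sketch of the option in use. vwITEmployees is a hypothetical view, and the Employee columns (Id, Name, Department) are assumed.

```sql
-- Hypothetical view over an assumed Employee(Id, Name, Department) table.
CREATE VIEW vwITEmployees
AS
SELECT Id, Name, Department
FROM Employee
WHERE Department = 'IT'
WITH CHECK OPTION;
GO

INSERT INTO vwITEmployees (Id, Name, Department)
VALUES (101, 'Asha', 'IT');   -- allowed: the row satisfies the filter

INSERT INTO vwITEmployees (Id, Name, Department)
VALUES (102, 'Ravi', 'HR');   -- fails: the row would not be visible through the view
```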
WITH ENCRYPTION
 Purpose: Hides the view definition text

 Effect:

o Text becomes NULL in syscomments table

o sp_helptext shows "text is encrypted"

 Usage: CREATE VIEW viewName WITH ENCRYPTION AS ...

WITH SCHEMABINDING
 Purpose: Binds view to underlying database objects
 Restrictions:
o Cannot drop referenced tables, or alter them in ways that affect the view definition
o Must specify column names (no *)
o Must use two-part naming (schema.object, e.g. dbo.Employee)
 Usage: CREATE VIEW viewName WITH SCHEMABINDING AS ...
 Combined: Can use ENCRYPTION and SCHEMABINDING together (WITH ENCRYPTION, SCHEMABINDING)
Indexed Views
Definition and Purpose
 What: Views with physical data storage through indexes
 First Index: Must be a unique clustered index
 Data Storage: Result set is persisted on disk
 Performance: Significantly improves query performance for JOINs and aggregations


Rules for Creating Indexed Views
1. Must use WITH SCHEMABINDING
2. Handle NULL values with ISNULL() for aggregate expressions
3. Must include COUNT_BIG(*) if GROUP BY is used
4. Use two-part naming for base tables
5. First index must be unique clustered index
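
The rules can be seen together in one sketch. dbo.Sales(ProductId, Amount) is an assumed table with both columns declared NOT NULL, which is why no ISNULL() wrapper is needed here; the view and index names are hypothetical.

```sql
-- Assumed table: dbo.Sales(ProductId INT NOT NULL, Amount MONEY NOT NULL).
CREATE VIEW dbo.vwSalesByProduct
WITH SCHEMABINDING                     -- rule 1
AS
SELECT ProductId,
       SUM(Amount)  AS TotalAmount,
       COUNT_BIG(*) AS RowCnt          -- rule 3: required with GROUP BY
FROM dbo.Sales                         -- rule 4: two-part name
GROUP BY ProductId;
GO

-- Rule 5: the first index must be unique and clustered; it materializes the view.
CREATE UNIQUE CLUSTERED INDEX IX_vwSalesByProduct
ON dbo.vwSalesByProduct (ProductId);
GO

-- NOEXPAND asks the optimizer to read the view's index directly.
SELECT ProductId, TotalAmount
FROM dbo.vwSalesByProduct WITH (NOEXPAND);
```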
Performance Considerations
 SELECT Performance: Dramatically improved (logical reads reduced significantly)
 DML Performance: Significantly decreased (higher logical reads for INSERT/UPDATE/DELETE)
 Maintenance Cost: Much higher than regular table indexes

When to Use Indexed Views
 Ideal for: OLAP systems (reporting and analysis)
 Avoid for: OLTP systems (frequent data changes)
 Best scenario: Infrequently changed underlying data
 Enterprise vs Standard: Enterprise edition can use the view's index automatically; Standard needs the WITH (NOEXPAND) hint


Advantages of Views
1. Security Implementation
 Row-level security: Create views with WHERE conditions to limit data access
 Column-level security: Exclude sensitive columns (like salary) from views
 Example: Create IT department-only view, hide salary columns
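
The IT-only example can be sketched as follows. vwITEmployeesPublic and ReportingUser are hypothetical names, and the Employee columns are assumed.

```sql
-- Hypothetical view: exposes only IT rows and leaves Salary out.
CREATE VIEW vwITEmployeesPublic
AS
SELECT Id, Name, Department      -- column-level: no Salary column
FROM Employee
WHERE Department = 'IT';         -- row-level: IT department only
GO

-- ReportingUser is a hypothetical database user with no rights on Employee itself.
GRANT SELECT ON vwITEmployeesPublic TO ReportingUser;
```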
2. Complexity Hiding
 Simplify JOINs: Hide complex JOIN logic from end users
 User-friendly: Non-IT users can query simplified views instead of complex tables
3. Data Presentation
 Aggregated data: Present summary information hiding detailed data
 Consistent interface: Provide standardized data access patterns
Limitations and Disadvantages
Major Limitations
1. No Parameters: Cannot pass parameters to views (use Table-
Valued Functions instead)
2. ORDER BY Restrictions: Cannot use ORDER BY unless with
TOP, OFFSET, or FOR XML
3. Temporary Tables: Cannot create views based on temporary
tables
4. No Rules/Defaults: Cannot associate rules and defaults with
views
Performance Concerns
 Complex views: May have poor performance with multiple JOINs
 Indexed views: High maintenance overhead for frequently changing data
Advanced Concepts
Views on Views
 Possible: Can create views based on other views
 Considerations: Performance may degrade with multiple layers
Dropping Tables with Dependent Views
 Allowed: Can drop tables even with dependent views (unless the view was created WITH SCHEMABINDING)
 Effect: Views become inactive but remain in database
 Recovery: Views become active when table is recreated with same structure
Table-Valued Functions as View Alternatives
 Purpose: Replacement for parameterized views
 Usage: CREATE FUNCTION fnName(@param datatype) RETURNS TABLE
 Advantage: Can accept parameters unlike views

Interview Tips and Key Points


Must Remember
1. View Definition: Virtual table defined by a stored SQL query
2. Two Types: Simple (single table, fully updatable) vs Complex
(multiple tables/conditions, limited DML)
3. Security: Row-level and column-level security
implementation
4. Indexed Views: Physical storage for performance, high
maintenance cost
5. Limitations: No parameters, ORDER BY restrictions, no temp
tables
Common Interview Questions
 Difference between simple and complex views

 When to use indexed views vs regular views

 Security implementation using views

 Performance implications of indexed views

 View limitations and workarounds

 DML operations on different view types

Practical Knowledge
 Creating views with various options (CHECK OPTION,
ENCRYPTION, SCHEMABINDING)
 Understanding when DML operations fail on complex views
 Performance tuning with indexed views
 Using views for security implementation
 Troubleshooting view-related issues
