SQL Server Joins - Complete Interview Guide
Join Categories
ANSI Format Joins (Standard & Recommended)
Inner Join - Only matching rows from both tables
Outer Joins - Left, Right, Full (includes unmatched
rows with NULLs)
Cross Join - Cartesian product (all combinations)
Non-ANSI Concepts
Equi Join - Uses equality condition (=) - most common
Non-Equi Join - Uses non-equality conditions (>, <, !=, BETWEEN)
Self-Join - Table joined to itself using aliases
Natural Join - Conceptual only (SQL Server has no
NATURAL JOIN keyword)
Detailed Join Types
INNER JOIN
SELECT columns FROM TableA INNER JOIN TableB ON
TableA.Key = TableB.Key;
-- or simply: JOIN (defaults to INNER)
Returns: Only rows with matches in BOTH tables
Use when: Need data that exists in both related tables
Key point: Filters out non-matching records
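A minimal sketch of this behavior, assuming hypothetical Customers and Orders tables related by a CustomerId column:

```sql
-- Returns only customers that have at least one order
SELECT c.CustomerName, o.OrderId
FROM Customers c
INNER JOIN Orders o ON c.CustomerId = o.CustomerId;
```

A customer with no orders contributes no rows, and an order whose CustomerId has no match in Customers is likewise excluded.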
LEFT OUTER JOIN
SELECT columns FROM LeftTable LEFT OUTER JOIN
RightTable ON condition;
-- or simply: LEFT JOIN
Returns: ALL rows from left table + matching rows
from right (NULLs for non-matches)
Use when: Need all records from primary (left) table
plus any related data
Finding left-only records: Add WHERE
RightTable.Key IS NULL
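The "left-only" pattern can be sketched as follows, again assuming hypothetical Customers and Orders tables:

```sql
-- Customers that have never placed an order
SELECT c.CustomerName
FROM Customers c
LEFT JOIN Orders o ON c.CustomerId = o.CustomerId
WHERE o.OrderId IS NULL;  -- NULL here means "no matching order row"
```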
RIGHT OUTER JOIN
SELECT columns FROM LeftTable RIGHT OUTER JOIN
RightTable ON condition;
-- or simply: RIGHT JOIN
Returns: ALL rows from right table + matching rows
from left (NULLs for non-matches)
Note: Can always be rewritten as LEFT JOIN by
swapping table order
Use when: Need all records from primary (right) table
FULL OUTER JOIN
SELECT columns FROM TableA FULL OUTER JOIN TableB ON
condition;
-- or simply: FULL JOIN
Returns: ALL rows from BOTH tables (NULLs where no
match)
Use when: Need complete picture of both datasets
Perfect for: Data reconciliation, finding discrepancies
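A reconciliation sketch, assuming two hypothetical staging tables keyed on Id:

```sql
-- Rows present in only one of the two tables
SELECT s.Id AS SourceId, t.Id AS TargetId
FROM SourceTable s
FULL OUTER JOIN TargetTable t ON s.Id = t.Id
WHERE s.Id IS NULL OR t.Id IS NULL;  -- keep only the discrepancies
```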
CROSS JOIN
SELECT columns FROM TableA CROSS JOIN TableB;
Returns: Cartesian product (every row from A × every
row from B)
No ON clause - CROSS JOIN does not accept one; filtering the result (e.g., in a WHERE clause) effectively turns it into an INNER JOIN
Result size: TableA rows × TableB rows
Use for: Test data generation, all possible
combinations
Caution: Can create huge result sets
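A sketch of the combinations use case, assuming small hypothetical Sizes and Colors tables:

```sql
-- Every size/color pairing
SELECT s.SizeName, c.ColorName
FROM Sizes s
CROSS JOIN Colors c;
-- Result row count = (rows in Sizes) * (rows in Colors)
```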
Advanced Concepts
Self-Join Example
SELECT e.Name AS Employee, m.Name AS Manager
FROM Employees e LEFT JOIN Employees m ON
e.ManagerId = m.EmployeeId;
Must use aliases to distinguish table instances
Use cases: Hierarchical data, employee-manager
relationships, comparing rows within same table
Multi-Table Joins
SELECT ... FROM TableA a
INNER JOIN TableB b ON a.Key = b.Key
INNER JOIN TableC c ON b.Key = c.Key;
Processing: Sequential (A+B first, then result+C)
Can mix join types (INNER + LEFT, etc.)
Order matters when mixing join types (a LEFT JOIN followed by an INNER JOIN can filter out the NULL rows the LEFT JOIN preserved)
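A sketch of mixing join types, using hypothetical Customers, Orders, and Products tables:

```sql
-- All customers, their orders (if any), and the product names for those orders
SELECT c.CustomerName, o.OrderId, p.ProductName
FROM Customers c
LEFT JOIN Orders o ON c.CustomerId = o.CustomerId
LEFT JOIN Products p ON o.ProductId = p.ProductId;
```

The second join is also LEFT on purpose: an INNER JOIN to Products would silently drop the customers with no orders that the first LEFT JOIN preserved.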
Key Differences: JOIN vs UNION
JOINs
Combine columns from different tables
Result has more columns (TableA cols + TableB cols)
Based on related data/conditions
UNION/UNION ALL
Combine rows from multiple SELECT statements
Same number of columns required
Compatible data types required
UNION removes duplicates, UNION ALL keeps all
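A minimal contrast sketch, assuming hypothetical Customers and Suppliers tables that both have a City column:

```sql
-- JOIN widens the result (more columns); UNION lengthens it (more rows)
SELECT City FROM Customers
UNION            -- removes duplicate cities
SELECT City FROM Suppliers;

SELECT City FROM Customers
UNION ALL        -- keeps all rows, including duplicates
SELECT City FROM Suppliers;
```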
Critical Points for Interviews
NULL Handling in Outer Joins
Unmatched rows get NULL values for missing table's
columns
Use IS NULL to find unmatched records
Understanding NULL behavior is crucial
Performance Considerations
INNER JOIN typically fastest
CROSS JOIN can be dangerous (result size is the product of the two tables' row counts)
LEFT JOIN often preferred over RIGHT JOIN for
readability
Syntax Variations
JOIN = INNER JOIN
LEFT JOIN = LEFT OUTER JOIN
RIGHT JOIN = RIGHT OUTER JOIN
FULL JOIN = FULL OUTER JOIN
Common Interview Scenarios
1. Finding unmatched records: LEFT JOIN + WHERE
RightTable.Key IS NULL
2. Hierarchical data: Self-join with aliases
3. Data reconciliation: FULL OUTER JOIN
4. All possible combinations: CROSS JOIN (use carefully)
5. Multiple tables: Sequential joins with mixed types
Best Practices
Always use ANSI format joins (modern standard)
Specify ON conditions clearly (except CROSS JOIN)
Use table aliases for readability
Understand when NULLs appear in results
Consider performance implications of join types
Key Takeaways & When to Use What (Quick Summary)
INNER JOIN: Only matching rows. Use when you need data that
has a confirmed relationship in both tables. (Most common)
LEFT OUTER JOIN: All from left, matching from right (or NULLs).
Use when the left table is primary, and you want all its rows,
plus any related data from the right.
o To get only non-matching rows from the left: WHERE RightTable.Key
IS NULL.
RIGHT OUTER JOIN: All from right, matching from left (or
NULLs). Use when the right table is primary. Often can be
rewritten as a LEFT JOIN.
FULL OUTER JOIN: All rows from both. Use when you need a
complete picture of both datasets, including unmatched
records from either side. Good for data reconciliation.
CROSS JOIN: Cartesian product (all combinations). Use
sparingly for specific scenarios like generating all possible
pairs or test data.
SELF JOIN: Join a table to itself using aliases. Use for
hierarchical data or intra-table comparisons.
Remember to always specify the join condition clearly in
the ON clause for ANSI joins (except CROSS JOIN). Understanding
how NULL values are handled in OUTER JOINs is crucial for correct
interpretation of results.
SQL Server Indexes
Core Concepts & Search Mechanisms
1. The Goal of an Index:
To make search operations faster.
Avoids scanning the entire table row by row.
2. How Indexes Speed Up Searches: The B-Tree (Balanced Tree)
Internally, SQL Server creates a B-Tree structure for an index.
B-Tree Structure:
o Root Node: The top-level entry point for a search.
Contains range pointers.
o Non-Leaf Nodes (Intermediate Nodes): Branch nodes that
further narrow down the search based on key values. They
point to other non-leaf nodes or leaf nodes.
o Leaf Nodes: The bottom level of the tree. What they
contain depends on the index type:
Clustered Index: Leaf nodes contain the actual data
rows.
Non-Clustered Index: Leaf nodes contain index key
values and pointers (Row ID or Clustered Key) to the
actual data rows.
Search Process with B-Tree:
1. Start at the Root Node.
2. Compare the search value with the key ranges in the root
node to determine which branch to follow.
3. Navigate through Non-Leaf Nodes, making comparisons at
each level to narrow down the path.
4. Reach a Leaf Node that contains (or points to) the desired
data.
Benefit: This hierarchical traversal drastically reduces the
number of data pages to read compared to a full table scan.
For example, to find '50':
o Root: Check if 50 <= 30 (No). Check if 50 <= 50 (Yes).
o Follow path under '50' to relevant Non-Leaf Node.
o Non-Leaf: Check if 50 <= 40 (No). Check if 50 <= 50 (Yes).
o Follow path under '50' to relevant Leaf Node.
3. Data Retrieval Mechanisms:
Table Scan (Heap Scan):
o Occurs when there is no suitable index for the query or if
the table is very small.
o The SQL Server Search Engine reads every row in the table
sequentially from beginning to end.
o Very inefficient for large tables.
o A table without a clustered index is called a Heap.
Index Scan:
o The engine traverses all leaf pages of an index.
o This is better than a table scan if the index is narrower
(fewer columns, smaller data types) than the table itself,
or if the query needs all rows but in the index's order.
o Often used when the WHERE clause isn't selective enough
for a seek, or when no WHERE clause is present but
an ORDER BY matches the index.
Index Seek:
o The most efficient way to retrieve data.
o The engine uses the B-Tree structure to navigate directly
to the rows that satisfy the WHERE clause conditions.
o Only reads the necessary index pages and data pages.
o Typically used for equality (=) or small range
(BETWEEN, <, >) predicates on indexed columns.
4. When SQL Server Uses Indexes:
SELECT, UPDATE, DELETE statements with a WHERE clause on
an indexed column.
SELECT statements with an ORDER BY clause on an indexed
column (can avoid a sort operation).
JOIN operations on indexed columns.
SQL Server's Query Optimizer decides the best execution plan
(Table Scan, Index Scan, or Index Seek).
5. General Index Creation Syntax:
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX
<INDEX_NAME>
ON <TABLE_NAME> (<COLUMN_LIST> [ASC|DESC], ...)
[INCLUDE (<NON_KEY_COLUMN_LIST>)]
[WITH (<OPTIONS>)]
Clustered Indexes
1. Definition:
A Clustered Index defines the physical order in which data
rows are stored in the table.
The table data itself becomes the leaf level of the clustered
index.
Think of it like a phone book sorted alphabetically by last
name; the names and numbers (the data) are sorted.
2. Key Properties:
Only ONE Clustered Index per table: Because data can only be
physically sorted in one order.
Leaf Nodes Contain Actual Data: The bottom level of the B-
Tree is the table data.
Primary Key Constraint: By default, creating a Primary Key on
a table automatically creates a UNIQUE CLUSTERED INDEX on
the primary key column(s), if no other clustered index already
exists.
Clustered Table: A table with a clustered index.
Data Sorting: Records are physically sorted in the table based
on the clustered index key (e.g., Id ASC).
3. B-Tree Structure of a Clustered Index:
Root Node & Non-Leaf Nodes: Contain clustered index key
values and pointers to the next level pages.
Leaf Nodes: Contain the actual data rows of the table, sorted
by the clustered index key.
4. Example:
CREATE CLUSTERED INDEX IX_Employee_ID ON Employee(Id ASC);
This physically sorts the Employee table by the Id column in
ascending order.
5. Why Only One?
Attempting to create a second clustered index will result in an
error:
Cannot create more than one clustered index on table
'Employee'. Drop the existing clustered index
'PK__Employee__...' before creating another.
The physical storage order is already dictated by the existing
clustered index.
6. Composite Clustered Index:
A clustered index created on multiple columns.
Data is physically sorted by the first column, then by the
second column within each group of the first, and so on.
Example: CREATE CLUSTERED INDEX
IX_Employee_Gender_Salary ON Employee(Gender DESC, Salary
ASC)
o Employees will be sorted first by Gender in descending
order (e.g., Males then Females, if M > F).
o Within each gender group, employees will be sorted
by Salary in ascending order.
7. Impact on Non-Clustered Indexes:
If a table has a clustered index, the leaf nodes of any non-
clustered indexes on that table will store the clustered index
key as their row locator (pointer) instead of a direct Row ID
(RID). This is important because if the clustered key is large, it
can make all non-clustered indexes larger.
Non-Clustered Indexes
1. Definition:
A Non-Clustered Index is a separate B-Tree structure from the
data rows.
The index contains the non-clustered index key values, and
each key value has a pointer to the actual data row.
The physical order of data in the table is NOT affected by non-
clustered indexes.
Think of it like the index at the back of a textbook: entries are
sorted alphabetically, and each entry points to a page number
(the data location).
2. Key Properties:
Multiple Non-Clustered Indexes per table: A table can have up
to 999 non-clustered indexes.
Separate Structure: Stored separately from the table data,
requiring additional disk space.
Leaf Nodes Contain Pointers: The leaf nodes store the index
key values and a row locator.
o If table is a HEAP (no clustered index): The row locator is a
Row Identifier (RID) which points directly to the physical
location of the data row in the heap.
o If table has a CLUSTERED INDEX: The row locator is
the clustered index key of that row. To find the actual
data, SQL Server first seeks the non-clustered index, gets
the clustered key, and then uses the clustered key to seek
the clustered index (which contains the data). This is
called a "Key Lookup."
3. B-Tree Structure of a Non-Clustered Index:
Root Node & Non-Leaf Nodes: Contain non-clustered index key
values and pointers to the next level pages.
Leaf Nodes: Contain the non-clustered index key values and
a row locator (RID or Clustered Key) for each key. The leaf
nodes are sorted by the non-clustered index key.
4. Example:
CREATE NONCLUSTERED INDEX IX_Employee_Salary ON
Employee(Salary ASC);
This creates a separate B-Tree sorted by Salary. The leaf nodes
will contain Salary values and pointers to the corresponding
rows in the Employee table.
5. Composite Non-Clustered Index:
A non-clustered index created on multiple columns.
Example: CREATE NONCLUSTERED INDEX
IX_tblOrder_CustomerId_ProductName ON tblOrder(CustomerId
ASC, ProductName DESC)
6. Covering Query & INCLUDE Clause:
Covering Query: A query where all the columns requested in
the SELECT list, WHERE clause, and JOIN conditions are
available within the non-clustered index itself.
o SQL Server can satisfy the query entirely from the index
pages without accessing the table data (heap) or the
clustered index. This avoids Key Lookups and is very
efficient.
o Example: If IX_ProductSales_ProductID_QuantitySold exists
on (ProductID, QuantitySold), then:
SELECT ProductID, QuantitySold FROM ProductSales
WHERE ProductID = 5; is a covering query.
INCLUDE Clause: Allows you to add non-key columns to the leaf
level of a non-clustered index.
o These INCLUDEd columns are not part of the index key (so
they don't affect sort order or B-Tree navigation) but are
stored in the leaf pages.
o Used primarily to create covering indexes without making
the index key itself too wide.
o Example: CREATE NONCLUSTERED INDEX
IX_tblOrder_Cust_Prod ON tblOrder(CustomerId,
ProductName) INCLUDE (Id, ProductId);
For SELECT Id, ProductId FROM tblOrder WHERE
CustomerId = 3 AND ProductName = 'Pendrive';
The index is keyed on CustomerId,
ProductName. Id and ProductId are available at the
leaf level for covering.
7. Missing Index Details:
The Query Execution Plan can sometimes suggest creating
"missing indexes" that could improve query performance. This
is a helpful feature for performance tuning.
Unique Indexes, DML Impact & GROUP BY
1. Unique Indexes:
Enforce that all values in the indexed column(s) are unique. No
two rows can have the same value (or combination of values
for composite unique indexes) in the indexed columns.
Can be Clustered or Non-Clustered.
Primary Key: Automatically creates a UNIQUE CLUSTERED
INDEX by default (if no CI exists).
Unique Constraint: Automatically creates a UNIQUE
NONCLUSTERED INDEX by default.
NULLs: SQL Server treats NULLs as equal for uniqueness, so a
single-column unique index allows only ONE row with a NULL value.
For composite unique indexes, the combination must be unique
(e.g., (Val1, NULL) is distinct from (Val2, NULL), but a second
(Val1, NULL) would violate uniqueness).
Creation: CREATE UNIQUE NONCLUSTERED INDEX
UIX_Employees_FirstName_LastName ON
Employees(FirstName, LastName);
Dropping: Cannot DROP INDEX if it's enforcing a PRIMARY
KEY or UNIQUE constraint directly. You must ALTER TABLE ...
DROP CONSTRAINT ... first.
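A sketch of the dropping rule, reusing the UQ_Employees_City constraint named below:

```sql
-- Fails if the index is enforcing a UNIQUE constraint:
-- DROP INDEX UQ_Employees_City ON Employees;

-- Drop the constraint instead; the backing index is removed with it
ALTER TABLE Employees DROP CONSTRAINT UQ_Employees_City;
```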
2. Unique Constraints vs. Unique Indexes:
Functionally similar: Both enforce uniqueness using an
underlying unique index.
Intent:
o Use a UNIQUE CONSTRAINT when the primary goal is data
integrity.
o Use CREATE UNIQUE INDEX when the primary goal
is performance, and uniqueness is a secondary benefit
(though it's still enforced).
Query optimizer treats them the same.
Creating a UNIQUE CONSTRAINT (e.g., ALTER TABLE Employees
ADD CONSTRAINT UQ_Employees_City UNIQUE (City)) will
create a unique non-clustered index behind the scenes.
3. IGNORE_DUP_KEY Option:
Used with CREATE UNIQUE INDEX ... WITH (IGNORE_DUP_KEY =
ON).
When inserting multiple rows and a duplicate key violation
occurs for some rows:
o IGNORE_DUP_KEY = OFF (default): The entire insert
operation fails, and no rows are inserted.
o IGNORE_DUP_KEY = ON: Only the rows that violate the
unique constraint are rejected. The non-duplicate rows are
inserted successfully. A warning is issued.
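A sketch of the option, assuming a hypothetical Products table with a ProductCode column:

```sql
CREATE UNIQUE NONCLUSTERED INDEX UIX_Products_Code
ON Products (ProductCode)
WITH (IGNORE_DUP_KEY = ON);

-- With IGNORE_DUP_KEY = ON, a multi-row INSERT containing a duplicate
-- ProductCode inserts the non-duplicate rows and raises only a warning
-- ("Duplicate key was ignored.") instead of failing the whole statement.
```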
4. Impact of Indexes on DML Operations (INSERT, UPDATE, DELETE):
Slower DML: While indexes speed up reads, they can slow
down data modifications.
o INSERT: New rows must be added to the table AND to
every non-clustered index. The row's position in the
clustered index must be determined.
o DELETE: Rows must be removed from the table AND from
every non-clustered index.
o UPDATE:
If an indexed column is updated: The old index entry
must be removed, and a new one inserted in all
relevant indexes.
If a clustered index key is updated: The entire row
might physically move to a new location in the table,
which is very costly (like a DELETE then an INSERT).
This also means all non-clustered indexes pointing to
that row must update their clustered key pointer.
Page Splits:
o Occur when a data page (or index page) is full, and a new
row needs to be inserted into that page (to maintain sort
order for clustered indexes, or index key order for non-
clustered).
o SQL Server allocates a new page, moves about half the
rows from the full page to the new page, and then inserts
the new row.
o Frequent page splits lead to fragmentation and degrade
performance.
o More impactful with clustered indexes due to physical
data movement.
o Fill Factor can be configured to leave empty space on
pages to reduce page splits.
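A fill factor sketch, using a hypothetical LastName index on the Employee table from earlier examples:

```sql
-- Leave 20% free space on leaf pages to absorb inserts between rebuilds
CREATE NONCLUSTERED INDEX IX_Employee_LastName
ON Employee (LastName)
WITH (FILLFACTOR = 80);

-- Rebuilding reapplies (or changes) the fill factor
ALTER INDEX IX_Employee_LastName ON Employee
REBUILD WITH (FILLFACTOR = 80);
```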
5. Indexes and GROUP BY Clause:
GROUP BY often requires sorting the data by the grouping
columns before aggregation.
Benefit of Index: An index on the GROUP BY column(s) can
allow SQL Server to skip the explicit sort step, as the data can
be read in the required order directly from the index.
Covering Index for GROUP BY: If an index contains all columns
used in the GROUP BY clause AND any columns used in
aggregate functions (e.g., SUM(QuantitySold)), it can be highly
beneficial.
GROUP BY Algorithms:
o Hash Aggregate: Builds a hash table in memory to store
groups and their aggregate values. Does not require
sorted input but needs to materialize intermediate results.
o Stream Aggregate (Sort/Group): Requires input data to be
sorted by the grouping columns. If an index provides this
order, it's very efficient (pipelined). Otherwise, an explicit
Sort operator is added.
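A sketch of an index that serves a GROUP BY, reusing the ProductSales table from the covering-query example (the index name is an assumption):

```sql
-- Key column provides the grouping order; INCLUDE covers the aggregate input
CREATE NONCLUSTERED INDEX IX_ProductSales_ProductID
ON ProductSales (ProductID)
INCLUDE (QuantitySold);

-- Can be satisfied by an ordered read of the index
-- (Stream Aggregate, no explicit Sort operator)
SELECT ProductID, SUM(QuantitySold) AS TotalSold
FROM ProductSales
GROUP BY ProductID;
```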
Practical Considerations & Summary Comparison
1. When to Create Indexes:
Columns frequently used in WHERE clauses (especially with
high selectivity).
Columns used in JOIN conditions (ON clause).
Columns frequently used in ORDER BY clauses (can avoid
sorts).
Columns frequently used in GROUP BY clauses (can avoid
sorts).
Foreign Key columns are often good candidates.
2. When to be Cautious / Avoid Over-Indexing:
Tables with high DML activity (many Inserts, Updates,
Deletes): Each index adds overhead.
Columns with low cardinality (few unique values): E.g., a
Gender column with 'M', 'F', 'U'. An index might not be much
better than a table scan.
Small tables: The overhead of index maintenance might
outweigh the benefit; a table scan can be faster.
Avoid indexing every column: This leads to excessive DML
overhead and disk space usage.
3. Clustered vs. Non-Clustered Index – Which is Faster?
Clustered Index is generally faster for:
o Range queries on the clustered key (e.g., WHERE ID
BETWEEN 100 AND 200). Data is physically contiguous.
o Queries retrieving large amounts of data sorted by the
clustered key.
o Queries that involve a direct lookup on the clustered key,
as the data is at the leaf level.
Non-Clustered Index can be faster for:
o Queries that are "covered" by the non-clustered index (all
required columns are in the index).
o Exact matches on highly selective non-clustered index
keys where only a few columns are needed.
Non-clustered indexes involve an extra step (Key Lookup or
RID Lookup) if the query is not covered.
4. Key Differences: Clustered vs. Non-Clustered Index
Number per Table:
o Clustered: One
o Non-Clustered: Up to 999
Data Storage:
o Clustered: Physically orders data rows in the table; leaf nodes are the data.
o Non-Clustered: Separate logical structure; leaf nodes contain index keys and pointers to data rows.
Disk Space:
o Clustered: Does not require additional disk space for data (it is the data); only the B-Tree structure above the leaf takes space.
o Non-Clustered: Requires additional disk space for the index structure.
Row Locator in Leaf:
o Clustered: N/A (leaf contains the data row itself).
o Non-Clustered: Row ID (RID) if the table is a HEAP, or the Clustered Index Key if the table has a CI.
Primary Key Default:
o Clustered: Primary Key creates a Unique Clustered Index by default (if none exists).
o Non-Clustered: Unique Constraint creates a Unique Non-Clustered Index by default.
Effect on Table:
o Clustered: Dictates the physical sort order of the table.
o Non-Clustered: Does not affect the physical order of table data.
Pointer in Other NCIs:
o Clustered: Its key is used as the row locator in other Non-Clustered Indexes on the same table.
o Non-Clustered: N/A.
5. General Advantages of Indexes:
Faster record
searching: For SELECT, UPDATE, DELETE via WHERE clauses.
Faster sorting: For ORDER BY clauses, potentially avoiding a
sort operation.
Faster grouping: For GROUP BY clauses, potentially avoiding a
sort.
Enforcing uniqueness: Via Unique Indexes (often with Primary
Keys or Unique Constraints).
6. General Disadvantages of Indexes:
Additional Disk Space: Non-clustered indexes consume disk
space.
Slower DML Operations: INSERT, UPDATE, DELETE statements
become slower as indexes also need to be maintained.
Maintenance Overhead: Indexes need to be maintained (e.g.,
rebuilding/reorganizing to reduce fragmentation).
Clustered Index Key Update Cost: Updating a column that is
part of the clustered index key can be very expensive as the
row might need to physically move, and all non-clustered
indexes must update their pointers.
SQL Server Built-in Functions
Introduction & Core Character Functions
I. Overview of Functions in SQL Server
Two Main Types:
1. Built-in Functions: Pre-defined code by SQL Server for
common tasks (e.g., string manipulation, calculations).
2. User-Defined Functions (UDFs): Functions created by users
for specific business logic.
What are Built-in Functions?
o Pieces of code that take zero or more inputs (parameters).
o Always return a value.
o Can be used anywhere expressions are allowed
(e.g., SELECT list, WHERE clause).
II. Common Built-in String Functions
1. ASCII(Character_Expression)
o Purpose: Returns the ASCII (integer) code of the first
character in the expression.
o Example: SELECT ASCII('A')
Output: 65
o Key Use: Comparing characters, case-sensitive
(e.g., ASCII('A') is 65, ASCII('a') is 97).
o Example (Case Sensitivity):
SELECT ASCII('A') AS UpperCase, ASCII('a') AS LowerCase
Output: UpperCase: 65, LowerCase: 97
2. CHAR(Integer_Expression)
o Purpose: Converts an integer ASCII code to its
corresponding character. Opposite of ASCII().
o Constraint: Integer_Expression must be between 0 and
255.
o Example: SELECT CHAR(65)
Output: A
3. LTRIM(Character_Expression)
o Purpose: Removes leading blanks (spaces on the left-hand
side).
o Syntax: LTRIM(Character_Expression)
o Example: SELECT LTRIM(' Hello')
Output: Hello (leading spaces removed)
4. RTRIM(Character_Expression)
o Purpose: Removes trailing blanks (spaces on the right-
hand side).
o Syntax: RTRIM(Character_Expression)
o Example: SELECT RTRIM('Hello ')
Output: Hello (trailing spaces removed)
5. Trimming Both Sides:
o Method: Nest LTRIM and RTRIM.
o Example: SELECT LTRIM(RTRIM(' Hello '))
Output: Hello
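On SQL Server 2017 and later, the nested call can be replaced by the built-in TRIM() function:

```sql
SELECT TRIM('   Hello   ');
-- Output: Hello (both leading and trailing blanks removed)
```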
More String Manipulation Functions
1. LOWER(Character_Expression)
o Purpose: Converts all uppercase characters in an
expression to lowercase.
o Example: SELECT LOWER('CONVERT This String Into Lower
Case')
Output: convert this string into lower case
2. UPPER(Character_Expression)
o Purpose: Converts all lowercase characters in an
expression to uppercase.
o Example: SELECT UPPER('CONVERT This String Into
upperCase')
Output: CONVERT THIS STRING INTO UPPERCASE
3. REVERSE(String_Expression)
o Purpose: Returns the character string in reverse order.
o Example: SELECT REVERSE('ABCDE')
Output: EDCBA
4. LEN(String_Expression)
o Purpose: Returns the number of characters in the string
expression.
o Crucial Note: Excludes trailing blanks, but includes leading
blanks.
o Example: SELECT LEN(' Functions   ') -- one leading, three trailing spaces
Output: 10 (counts ' Functions', which is 10 chars; the 3
trailing spaces are ignored)
o Example 2: SELECT LEN('    Functions') -- four leading spaces
Output: 13 (leading spaces are counted)
5. LEFT(Character_Expression, Integer_Expression)
o Purpose: Returns the left part of a character string with
the specified number of characters.
o Example: SELECT LEFT('ABCDEF', 3)
Output: ABC
6. RIGHT(Character_Expression, Integer_Expression)
o Purpose: Returns the right part of a character string with
the specified number of characters.
o Example: SELECT RIGHT('ABCDEF', 3)
Output: DEF
String Searching & Substring Functions, Intro to OVER
Clause
1. CHARINDEX(Expression_To_Find, Expression_To_Search [,
Start_Location])
o Purpose: Returns the starting position (1-based index) of
the Expression_To_Find within Expression_To_Search.
o Start_Location: Optional. Specifies the position
in Expression_To_Search where the search begins.
o Returns: Integer. 0 if not found.
o Example: SELECT CHARINDEX('@', 'sara@example.com', 1)
Output: 5
o Example (not found): SELECT CHARINDEX('Z',
'sara@example.com')
Output: 0
2. SUBSTRING(Expression, Start, Length)
o Purpose: Extracts a part of a string (substring)
from Expression.
o Start: Starting position (1-based index).
o Length: Number of characters to extract.
o All 3 parameters are mandatory.
o Example 1 (Simple): SELECT
SUBSTRING('info@example.com', 6, 19)
Output: example.com (a Length beyond the end of the
string simply returns the rest of the string)
o Example 2 (Dynamic - Get domain from email):
SELECT SUBSTRING('info@example.com',
       CHARINDEX('@', 'info@example.com') + 1,
       LEN('info@example.com') -
       CHARINDEX('@', 'info@example.com'))
Output: example.com
Explanation:
CHARINDEX('@', ...): Finds position of '@'.
+ 1: Start after the '@'.
LEN(...) - CHARINDEX('@', ...): Calculates length
of the domain part.
III. Window Functions: The OVER Clause
Purpose: The OVER clause defines a "window" or a set of rows
within a query result set. Window functions then operate on
this set of rows.
Key Component: PARTITION BY
o Divides the result set into partitions (groups).
o The window function is applied to each partition
independently.
o Example Concept: COUNT(Department) OVER (PARTITION
BY Department)
This would create partitions for each
unique Department.
COUNT() would then count rows within each
department partition.
Common Functions Used
with OVER: COUNT(), SUM(), AVG(), MIN(), MAX(), ROW_NUMBE
R(), RANK(), DENSE_RANK().
OVER Clause for Aggregations (vs. GROUP BY)
The Problem with GROUP BY for Mixed Detail & Aggregate Data:
If you use GROUP BY to get aggregate values
(e.g., SUM(Salary) GROUP BY Department), you cannot directly
select non-aggregated columns (e.g., EmployeeName) unless
they are also in the GROUP BY clause (which often changes the
meaning of the aggregation).
Solution: OVER Clause with Aggregate Functions
Allows you to display both aggregated values (calculated over
a partition) and non-aggregated (detail) row values in the
same result set.
Scenario: Display each employee's details along with
department-level aggregates (Total Employees, Total Salary,
Avg Salary, Min Salary, Max Salary for their department).
Achieving with GROUP BY (Less Ideal for this scenario):
o Requires a subquery for aggregates, then JOIN back to the
main table.
o Example (conceptual, based on text):
SELECT E.Name, E.Salary, E.Department,
       DeptAgg.TotalEmployees, DeptAgg.TotalSalary,
       DeptAgg.AvgSalary, ...
FROM Employees E
INNER JOIN (
    SELECT Department,
           COUNT(*) AS TotalEmployees,
           SUM(Salary) AS TotalSalary,
           AVG(Salary) AS AvgSalary,
           ...
    FROM Employees
    GROUP BY Department
) AS DeptAgg ON E.Department = DeptAgg.Department;
o This is more complex and often less performant.
Achieving with OVER (PARTITION BY) (Preferred):
o More concise and often more efficient.
o Example:
SELECT
    Name,
    Salary,
    Department,
    COUNT(*)    OVER (PARTITION BY Department) AS DeptTotalEmployees,
    SUM(Salary) OVER (PARTITION BY Department) AS DeptTotalSalary,
    AVG(Salary) OVER (PARTITION BY Department) AS DeptAvgSalary,
    MIN(Salary) OVER (PARTITION BY Department) AS DeptMinSalary,
    MAX(Salary) OVER (PARTITION BY Department) AS DeptMaxSalary
FROM Employees;
o How it works: For each employee row, the aggregate
functions calculate their values based on all rows
belonging to that employee's Department partition.
ROW_NUMBER() Window Function
ROW_NUMBER() Function
Purpose: Assigns a sequential integer to each row within its
partition, starting from 1.
Syntax:
ROW_NUMBER() OVER ([PARTITION BY value_expression1
[, ...n]] ORDER BY order_by_clause)
PARTITION BY value_expression (Optional):
o Divides the result set into partitions. ROW_NUMBER() is
applied independently to each partition (i.e., numbering
restarts at 1 for each new partition).
o If omitted, the entire result set is treated as a single
partition.
ORDER BY order_by_clause (Mandatory):
o Defines the logical order of rows within each partition,
determining how the sequential numbers are assigned.
o Error if omitted: "The function 'ROW_NUMBER' must have
an OVER clause with ORDER BY."
Example 1: ROW_NUMBER() without PARTITION BY
o Treats the whole result set as a single group. Assigns
consecutive numbers based on the ORDER BY clause.
SELECT Name, Department, Salary,
ROW_NUMBER() OVER (ORDER BY Department) AS
RowNum
FROM Employees;
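The partitioned form can be sketched against the same Employees table; numbering restarts at 1 for each department:

```sql
SELECT Name, Department, Salary,
       ROW_NUMBER() OVER (PARTITION BY Department
                          ORDER BY Salary DESC) AS RowNumInDept
FROM Employees;
```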
RANK(), DENSE_RANK(), and Their Differences
RANK() and DENSE_RANK() Functions
Purpose: Both assign a rank (sequential number starting from
1) to each row within its partition, based on the ORDER
BY clause.
Key Behavior with Ties: If two or more rows have the same
value in the ORDER BY columns, they receive the same rank.
The difference lies in how the next rank is assigned.
1. RANK() Function
Syntax: RANK() OVER ([PARTITION BY value_expression] ORDER
BY order_by_clause)
PARTITION BY (Optional): Divides data; ranking is per partition.
If omitted, entire set is one partition.
ORDER BY (Mandatory): Determines ranking order.
Tie Handling: Assigns the same rank to tied rows. Skips the
next rank(s).
o Example: 1, 1, 3, 4 (rank 2 is skipped).
Example: RANK() without PARTITION BY (ranking by Salary
DESC)
SELECT Name, Department, Salary,
RANK() OVER (ORDER BY Salary DESC) AS
SalaryRank
FROM Employees;
RANK() with PARTITION BY: Ranking restarts for each partition,
still skipping on ties within the partition.
DENSE_RANK() Function
Syntax: DENSE_RANK() OVER ([PARTITION BY value_expression]
ORDER BY order_by_clause)
PARTITION BY (Optional): Divides data; ranking is per partition.
ORDER BY (Mandatory): Determines ranking order.
Tie Handling: Assigns the same rank to tied rows. Does NOT
skip the next rank. Ranks are consecutive.
o Example: 1, 1, 2, 3 (no ranks are skipped).
Example: DENSE_RANK() without PARTITION BY (ranking by
Salary DESC)
SELECT Name, Department, Salary,
DENSE_RANK() OVER (ORDER BY Salary DESC) AS
DenseSalaryRank
FROM Employees;
DENSE_RANK() with PARTITION BY: Ranking restarts for each
partition, no skipping on ties within the partition.
Key Difference: RANK() vs. DENSE_RANK()
The ONLY difference is how they handle ranks after ties:
o RANK(): Skips ranks after ties. (e.g., 1, 1, 3)
o DENSE_RANK(): Does NOT skip ranks after ties; ranks are
always consecutive. (e.g., 1, 1, 2)
When to use which:
Use RANK() if you need to see the "gap" created by ties,
reflecting the actual number of preceding rows if ties were
broken.
Use DENSE_RANK() if you want a continuous sequence of
ranks, regardless of ties (often preferred for top-N per group
where N is strict).
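The "top-N per group" use case can be sketched as follows, reusing the Employees table from the examples above (window functions cannot appear in WHERE, hence the derived table):

```sql
-- Top 2 salary values per department; DENSE_RANK keeps ties together
SELECT Name, Department, Salary
FROM (
    SELECT Name, Department, Salary,
           DENSE_RANK() OVER (PARTITION BY Department
                              ORDER BY Salary DESC) AS rnk
    FROM Employees
) AS ranked
WHERE rnk <= 2;
```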
SQL Server Stored Procedures - Complete
Interview Guide
Page 1: Fundamentals and Why We Need Stored Procedures
What Happens When SQL Executes (3 Steps)
1. Syntax Checked: Validates query syntax for errors
2. Plan Selected: Chooses optimal execution plan based on indexes and table structure
3. Query Execution: Executes the query and returns results
Why Stored Procedures Are Needed
Performance Optimization: First execution goes through all 3 steps, subsequent
executions skip steps 1-2
Execution Plan Caching: Plan is stored in memory after first execution, reused for
future calls
Reduced Processing: No repeated syntax checking or plan generation
What is a Stored Procedure?
Definition: Database object containing pre-compiled queries (group of T-SQL
statements)
Structure: Block of code designed to perform specific tasks when called
Storage: Physically stored on server as database object, accessible from anywhere
Basic Syntax Structure
CREATE PROCEDURE ProcedureName
@Parameter1 DataType,
@Parameter2 DataType OUTPUT
AS
BEGIN
-- Procedure body (T-SQL statements)
END
Two Main Parts
1. Procedure Header: Everything above "AS" keyword (name, parameters)
2. Procedure Body: Everything below "AS" keyword (actual T-SQL code)
Execution Methods
1. EXEC ProcedureName
2. EXECUTE ProcedureName
3. Right-click in Object Explorer → Execute Stored Procedure
Important Naming Convention
Avoid "sp_" prefix: Reserved for system procedures
Reason: Prevents conflicts with system procedures and ambiguity
Page 2: Parameters - Input, Output, and Default Values
Input Parameters
Purpose: Bring values into procedure for execution
Default Behavior: All parameters are input by default
Example:
CREATE PROC spAddNumbers
@Num1 INT,
@Num2 INT
AS
BEGIN
PRINT @Num1 + @Num2
END
Parameter Passing Rules
1. Order Matters: Must pass values in declared order unless using parameter names
2. Named Parameters: Can pass in any order when specifying names
o EXEC spProc @Param2=Value2, @Param1=Value1
3. Error Prevention: Use parameter names to avoid type conversion errors
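Using the spAddNumbers procedure above, the two calling styles look like this:

```sql
-- Positional: values must follow the declared parameter order
EXEC spAddNumbers 10, 20;

-- Named: order no longer matters, and the call is self-documenting
EXEC spAddNumbers @Num2 = 20, @Num1 = 10;
```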
Output Parameters
Declaration: Use OUT or OUTPUT keyword
Purpose: Return values from procedure after execution
Key Points:
o Must assign value inside procedure
o Can return any data type (unlike return values)
o Multiple output parameters allowed
Output Parameter Example
CREATE PROC spCalculate
@Num1 INT,
@Num2 INT,
@Result INT OUTPUT
AS
BEGIN
SET @Result = @Num1 + @Num2
END
-- Execution
DECLARE @Total INT
EXEC spCalculate 10, 20, @Total OUTPUT
PRINT @Total -- Prints 30
Default Values
Syntax: Assign default value during parameter declaration
Usage: Parameter becomes optional when default provided
Example: @Parameter INT = 100
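A minimal sketch of a default value in use (spGetTopEmployees is a hypothetical procedure over the Employee table):

```sql
CREATE PROC spGetTopEmployees
    @TopN INT = 10          -- default makes the parameter optional
AS
BEGIN
    SELECT TOP (@TopN) Name, Salary
    FROM Employee
    ORDER BY Salary DESC
END
GO
EXEC spGetTopEmployees      -- uses the default of 10
EXEC spGetTopEmployees 5    -- overrides the default
```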
Parameter Execution Rules
1. Must declare variable first for output parameters
2. Must specify OUTPUT keyword when calling
3. Without OUTPUT keyword, variable remains NULL
4. Can mix input and output parameters
Page 3: Return Values and Temporary Stored Procedures
Return Values vs Output Parameters
Return Values:
o Only INTEGER data type
o Only ONE value
o Indicate success/failure: 0 = success, non-zero = failure
Output Parameters:
o Any data type
o Multiple values possible
o Return actual data; flexible usage
Return Value Example
CREATE PROC spCountEmployees
AS
BEGIN
DECLARE @Count INT
SELECT @Count = COUNT(*) FROM Employee
RETURN @Count
END
-- Execution
DECLARE @EmpCount INT
EXEC @EmpCount = spCountEmployees
Return Value Limitations
1. Cannot return non-integer values: Causes conversion errors
2. Single value only: Cannot return multiple values
3. Best Practice: Use for status indication only
Temporary Stored Procedures
Local Temporary Procedures (#)
Prefix: Single hash (#) before procedure name
Scope: Only accessible by creating connection
Lifetime: Automatically deleted when connection closes
Usage: CREATE PROC #TempProc
Global Temporary Procedures (##)
Prefix: Double hash (##) before procedure name
Scope: Accessible by all connections
Lifetime: Available until creating connection closes
Behavior: Other connections can complete execution even after creator disconnects
When to Use Temporary Procedures
Earlier SQL Server Versions: When execution plan reuse not supported for ad-hoc
queries
Session-Specific Logic: For connection-specific temporary operations
Testing: During development and testing phases
Temporary Procedure Characteristics
1. Storage: Created in tempdb database
2. Performance: Less overhead for one-time operations
3. Security: Limited scope reduces security risks
4. Cleanup: Automatic removal prevents database clutter
Page 4: System Procedures and Procedure Management
Essential System Stored Procedures
sp_help
Purpose: View information about database objects
Usage: sp_help ProcedureName or sp_help TableName
Information Provided: Parameter names, data types, object details
Shortcut: ALT+F1 when object name is highlighted
sp_helptext
Purpose: View text/source code of procedures, functions, views
Usage: sp_helptext ProcedureName
Limitation: Cannot view encrypted objects
Storage: Retrieves from syscomments system table
sp_depends
Purpose: Show dependency relationships
Usage: sp_depends ObjectName
Benefits:
o Check dependencies before dropping objects
o Understand impact of changes
o Works with tables, views, procedures
Viewing Procedure Text
1. System Procedure: sp_helptext ProcedureName
2. Object Explorer: Right-click → Script Procedure As → Create To New Query Window
3. System Table: SELECT * FROM syscomments WHERE id = OBJECT_ID('ProcName')
Procedure Modification
ALTER PROCEDURE: Modify existing procedure
Benefits: Preserves permissions and dependencies
Syntax: Same as CREATE but use ALTER keyword
Dropping Procedures
Syntax: DROP PROCEDURE ProcedureName
Alternative: DROP PROC ProcedureName
Consideration: Check dependencies first using sp_depends
Procedure Management Best Practices
1. Documentation: Use comments within procedures
2. Version Control: Track changes with ALTER statements
3. Testing: Test procedures thoroughly before deployment
4. Security: Grant minimal necessary permissions
5. Performance: Monitor execution plans and performance
Error Handling in Procedures
TRY-CATCH Blocks: Handle runtime errors gracefully
RETURN Statements: Exit procedure with status codes
RAISERROR: Generate custom error messages
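The three techniques above can be combined in one procedure; spSafeDivide is a hypothetical sketch:

```sql
CREATE PROC spSafeDivide
    @A INT,
    @B INT
AS
BEGIN
    BEGIN TRY
        SELECT @A / @B AS Result
    END TRY
    BEGIN CATCH
        RAISERROR('Division failed - check the divisor', 16, 1)
        RETURN 1   -- non-zero return value signals failure
    END CATCH
    RETURN 0       -- zero signals success
END
```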
Page 5: Advanced Features and Interview Key Points
Encryption and Recompile Attributes
WITH ENCRYPTION
Purpose: Encrypt procedure source code
Effect: Text becomes unreadable in syscomments
Usage: CREATE PROC ProcName WITH ENCRYPTION
Result: sp_helptext shows "The text for object is encrypted"
Use Case: Protect intellectual property in client deployments
WITH RECOMPILE
Purpose: Force recompilation on every execution
Effect: New execution plan generated each time
When Needed:
o Significant database structure changes
o New indexes added that could benefit procedure
o Database statistics changed dramatically
Caution: Use sparingly due to performance overhead
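Both attributes attach in the procedure header; a sketch combining them (the procedure name is illustrative, and RECOMPILE should only be kept when plan reuse genuinely hurts):

```sql
CREATE PROC spProtectedReport
WITH ENCRYPTION, RECOMPILE   -- source hidden; fresh plan on every call
AS
BEGIN
    SELECT COUNT(*) AS EmployeeCount FROM Employee
END
```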
Key Advantages of Stored Procedures (Interview Focus)
1. Performance Benefits
Execution Plan Caching: Plan reused for subsequent executions
Reduced Compilation: No repeated syntax checking
Network Traffic Reduction: Only procedure name and parameters sent
Memory Efficiency: Shared execution plans
2. Security Advantages
SQL Injection Prevention: Parameters prevent malicious code injection
Permission Control: Grant access to procedure without underlying table access
Data Access Control: Limit operations through procedure logic
3. Code Reusability and Maintenance
Centralized Logic: Business rules in one location
Multiple Application Access: Same procedure used by different applications
Easy Updates: Change logic once, affects all callers
Reduced Code Duplication: Eliminates repeated SQL statements
4. Better Maintainability
Single Point of Change: Modify procedure instead of multiple queries
Version Control: Track changes systematically
Testing: Isolated testing of business logic
Common Interview Questions & Answers
Q: What's the difference between stored procedures and functions? A: Procedures can have
output parameters and return at most an integer status code via RETURN. Functions must
return a value and can be used inside SELECT statements; procedures cannot.
Q: Can stored procedures return multiple result sets? A: Yes, procedures can return
multiple SELECT statements as separate result sets.
Q: What happens if you don't specify OUTPUT keyword when calling a procedure with
output parameters? A: The variable will remain NULL because the output value isn't
captured.
Q: Why avoid sp_ prefix for user procedures? A: sp_ is reserved for system procedures. SQL
Server checks system procedures first, causing performance overhead.
Q: When would you use temporary procedures? A: For session-specific logic, testing, or
when working with older SQL Server versions that don't cache execution plans for ad-hoc
queries.
Best Practices Summary
1. Use meaningful names without sp_ prefix
2. Include error handling with TRY-CATCH
3. Use output parameters for returning data, return values for status
4. Document parameters and functionality
5. Test thoroughly before deployment
6. Consider security implications
7. Monitor performance and execution plans
8. Use encryption for sensitive client deployments
9. Use recompile sparingly and only when necessary
10. Use system procedures for maintenance and troubleshooting
SQL Server Functions - Complete Interview
Guide
What is a Function in SQL Server?
A function is a subprogram that performs an action (like complex calculations) and returns a
result as a value. Functions can take optional parameters but must always return a value.
Types of Functions
System-Level Classification
1. System Defined Functions - Pre-built functions (e.g., SQUARE(3), GETDATE())
2. User-Defined Functions - Created by developers
User-Defined Function Types
1. Scalar Valued Functions - Return single value
2. Inline Table-Valued Functions - Return table with single SELECT
3. Multi-Statement Table-Valued Functions - Return table with multiple statements
Scalar Valued Functions
Key Points
Returns only a single (scalar) value
May or may not have parameters
Return value can be any data type except: text, ntext, image, cursor, timestamp
Must use two-part name when calling: SELECT dbo.FunctionName(value)
Syntax
CREATE FUNCTION FunctionName(@param datatype)
RETURNS return_datatype
AS
BEGIN
-- Function body
RETURN value
END
Usage Examples
Can be used in SELECT clause: SELECT dbo.CalculateAge(DOB) FROM Employee
Can be used in WHERE clause: WHERE dbo.CalculateAge(DOB) > 31
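A concrete scalar function along these lines (dbo.CalculateAge is a hypothetical name, and the age arithmetic is deliberately simplified):

```sql
CREATE FUNCTION dbo.CalculateAge(@DOB DATE)
RETURNS INT
AS
BEGIN
    -- Rough age in years; does not adjust for a birthday not yet reached
    RETURN DATEDIFF(YEAR, @DOB, GETDATE())
END
GO
SELECT Name, dbo.CalculateAge(DOB) AS Age FROM Employee;
```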
Inline Table-Valued Functions
Key Characteristics
Returns a table as output
Function body contains only a single SELECT statement with RETURN
Return type specified as TABLE
No BEGIN/END blocks
Structure determined by the SELECT statement
Can be used like parameterized views
Better performance than Multi-Statement functions
Syntax
CREATE FUNCTION FunctionName(@param datatype)
RETURNS TABLE
AS
RETURN (SELECT columns FROM table WHERE condition)
Usage
Call like a table: SELECT * FROM FN_GetStudentsByBranch('CSE')
Can be used in JOINs with other tables
Can update underlying database tables
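A sketch of the FN_GetStudentsByBranch function referenced above (the Students table and its columns are assumed):

```sql
CREATE FUNCTION dbo.FN_GetStudentsByBranch(@Branch VARCHAR(20))
RETURNS TABLE
AS
RETURN (SELECT StudentID, Name, Branch
        FROM dbo.Students
        WHERE Branch = @Branch)
GO
-- Called like a table, including in JOINs
SELECT * FROM dbo.FN_GetStudentsByBranch('CSE');
```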
Multi-Statement Table-Valued Functions
Key Characteristics
Returns a table but can contain multiple statements
Must define table structure in RETURNS clause
Requires BEGIN/END blocks
Gets data from table variable (not directly from base tables)
Cannot update underlying database tables
Lower performance compared to Inline functions
Syntax
CREATE FUNCTION FunctionName(@param datatype)
RETURNS @TableVariable TABLE (
Column1 datatype,
Column2 datatype
)
AS
BEGIN
-- Multiple statements
INSERT INTO @TableVariable...
RETURN
END
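Filling in that skeleton with the same assumed Students table gives a sketch like:

```sql
CREATE FUNCTION dbo.FN_StudentSummary(@Branch VARCHAR(20))
RETURNS @Summary TABLE (
    Name   VARCHAR(50),
    Branch VARCHAR(20)
)
AS
BEGIN
    -- Results flow through the table variable, not the base table
    INSERT INTO @Summary (Name, Branch)
    SELECT Name, Branch FROM dbo.Students WHERE Branch = @Branch

    RETURN
END
```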
Key Differences: Inline vs Multi-Statement Table-Valued
Functions
Table Structure: Inline - defined by the SELECT; Multi-Statement - must be defined explicitly
Code Blocks: Inline - no BEGIN/END; Multi-Statement - requires BEGIN/END
Update Capability: Inline - can update base tables; Multi-Statement - cannot update base tables
Performance: Inline - better (treated like a view); Multi-Statement - lower (treated like a stored procedure)
Data Source: Inline - direct from base tables; Multi-Statement - from a table variable
Advanced Options
WITH ENCRYPTION
Encrypts function text
Cannot view function definition using sp_helptext
Syntax: CREATE FUNCTION ... WITH ENCRYPTION
WITH SCHEMABINDING
Binds function to referenced database objects
Prevents modification/deletion of dependent objects
Must use two-part names for tables
Syntax: CREATE FUNCTION ... WITH SCHEMABINDING
Can combine both: WITH ENCRYPTION, SCHEMABINDING
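A schema-bound scalar function sketch (the function name is illustrative; note the mandatory two-part table name):

```sql
CREATE FUNCTION dbo.FN_EmployeeCount()
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    -- dbo.Employee can no longer be dropped or altered while this function exists
    RETURN (SELECT COUNT(*) FROM dbo.Employee)
END
```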
Deterministic vs Non-Deterministic Functions
Deterministic Functions
Always return same result with same input values and database state
Examples: SQUARE(), POWER(), SUM(), AVG(), COUNT()
All aggregate functions are deterministic
RAND(seed) with seed value is deterministic
Non-Deterministic Functions
May return different results even with same inputs
Examples: GETDATE(), CURRENT_TIMESTAMP, RAND() without seed
Results vary with each execution
Functions vs Stored Procedures - Critical Differences
Return Value: Functions - mandatory; Procedures - optional
Parameters: Functions - input only; Procedures - input and output
Operations: Functions - SELECT only; Procedures - SELECT, INSERT, UPDATE, DELETE
Transaction Management: Functions - not possible; Procedures - possible
Error Handling: Functions - not possible; Procedures - possible
Calling Method: Functions - called in a SELECT statement; Procedures - EXECUTE/EXEC
Usage in SQL: Functions - can be used in WHERE/HAVING/SELECT; Procedures - cannot be used inside SQL statements
Calling Other Objects: Functions - can call functions only; Procedures - can call both procedures and functions
Function Management Commands
Create: CREATE FUNCTION
Modify: ALTER FUNCTION FunctionName
Delete: DROP FUNCTION FunctionName
View Text: sp_helptext FunctionName
Important Interview Points
1. Functions must always return a value - this is mandatory
2. Scalar functions can be used in SELECT and WHERE clauses - stored procedures
cannot
3. Inline Table-Valued functions perform better than Multi-Statement functions
4. Only Inline Table-Valued functions can update underlying tables
5. SCHEMABINDING prevents dependent object modifications
6. Functions can only perform SELECT operations - no INSERT/UPDATE/DELETE
7. Use two-part naming convention when calling functions: SchemaName.FunctionName
8. All aggregate functions are deterministic
9. Functions cannot perform transaction management or error handling
10. Inline functions are treated like views, Multi-Statement functions like stored procedures
internally
SQL Server Transaction Management -
Complete Interview Guide
What is a Transaction?
A transaction is a set of SQL statements executed as one unit following the "all or
nothing" principle. Either all commands succeed or all fail with rollback. Essential
for maintaining data integrity in operations like bank transfers where multiple
related updates must all succeed together.
Transaction Management Fundamentals
Transaction management combines related operations into a single unit with clear
beginning and ending boundaries. Every transaction has two phases: beginning and
ending, and controlling these boundaries is transaction management.
Transaction Control Language (TCL) Commands
1. BEGIN TRANSACTION - Starts a transaction
2. COMMIT TRANSACTION - Saves all changes permanently to database
3. ROLLBACK TRANSACTION - Undoes all changes back to transaction
start
4. SAVE TRANSACTION - Creates savepoints for partial rollbacks
Important: TCL commands are typically used with DML statements (INSERT, UPDATE,
DELETE). Note that in SQL Server most DDL statements (e.g., CREATE/DROP TABLE) can
also be rolled back inside an explicit transaction, unlike in some other database systems.
Error Handling with @@ERROR
Global variable that returns error number (0 = no error, >0 = error occurred). Used
in conditional logic to determine whether to commit or rollback transactions.
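Putting the TCL commands and @@ERROR together in the classic transfer pattern (table and column names are illustrative; @@ERROR reflects only the immediately preceding statement, so it is checked after each DML):

```sql
BEGIN TRANSACTION
    UPDATE Accounts SET Balance = Balance - 500 WHERE AccountID = 1

    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRANSACTION
        RETURN
    END

    UPDATE Accounts SET Balance = Balance + 500 WHERE AccountID = 2

    IF @@ERROR <> 0
        ROLLBACK TRANSACTION
    ELSE
        COMMIT TRANSACTION
```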
Three Types of Transaction Modes
1. Auto Commit Transaction Mode (Default)
Each SQL statement is a separate transaction
SQL Server automatically begins and ends transactions
Developer has no control over transaction boundaries
Failed statements automatically rollback, successful ones auto-commit
2. Implicit Transaction Mode
Enabled with SET IMPLICIT_TRANSACTIONS ON
SQL Server automatically begins transactions before DML statements
Developer must explicitly COMMIT or ROLLBACK
New transaction automatically starts after current one ends
Turn off with SET IMPLICIT_TRANSACTIONS OFF
3. Explicit Transaction Mode
Developer controls both beginning and ending of transactions
Most commonly used in stored procedures, triggers, and applications
Requires explicit BEGIN TRANSACTION and COMMIT/ROLLBACK
statements
Provides full control over transaction boundaries
Nested Transactions
Transactions can be placed within other transactions
Inner commits don't physically commit data (only outer commit does)
@@TRANCOUNT global variable tracks number of open transactions
Inner commits just decrement transaction count
Only outer transaction commit actually saves data permanently
Can assign names to transactions for better readability
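The counter behavior can be observed directly with @@TRANCOUNT:

```sql
BEGIN TRANSACTION OuterTran
    PRINT @@TRANCOUNT            -- 1
    BEGIN TRANSACTION InnerTran
        PRINT @@TRANCOUNT        -- 2
    COMMIT TRANSACTION InnerTran -- only decrements the counter
    PRINT @@TRANCOUNT            -- 1
COMMIT TRANSACTION OuterTran     -- this commit actually persists the work
```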
SavePoints (Partial Rollbacks)
Created with SAVE TRANSACTION savepoint_name
Allow rolling back to specific points within a transaction
Enable partial rollbacks instead of full transaction rollback
Savepoint names limited to 32 characters
Multiple savepoints with same name possible (rollback goes to latest)
Useful for complex transactions with multiple logical steps
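A savepoint sketch with hypothetical Inventory and Sales tables:

```sql
BEGIN TRANSACTION
    UPDATE Inventory SET Qty = Qty - 1 WHERE ItemID = 10

    SAVE TRANSACTION AfterInventory

    UPDATE Sales SET Total = Total + 100 WHERE SaleID = 7

    -- Undo only the sales update; the inventory change is kept
    ROLLBACK TRANSACTION AfterInventory
COMMIT TRANSACTION
```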
ACID Properties
Atomicity
All DML statements in transaction succeed or all fail
No partial execution allowed
Database maintains consistent state by rolling back failed transactions
Consistency
Database remains in consistent state before and after transaction
Transaction must follow all database rules and constraints
Violations cause automatic rollback
Isolation
Intermediate transaction states invisible to other transactions
Data modifications isolated from other concurrent transactions
Implemented through locking mechanisms
Prevents dirty reads and maintains data integrity
Durability
Committed changes are permanent and survive system failures
Data persists through power failures, crashes, or other system issues
Transaction logs ensure recovery capability
Key Interview Points to Remember
1. Transaction Purpose: Data integrity, consistency, and handling database
errors
2. Thumb Rule: Either all statements execute successfully or none execute
3. TCL Scope: Only works with DML operations, not DDL
4. Error Handling: Always check @@ERROR for transaction control decisions
5. Nested Behavior: Only outer commits are physical, inner commits just
decrement counters
6. SavePoint Usage: Enables granular rollback control within transactions
7. ACID Compliance: SQL Server follows ACID properties by default
8. Real-world Applications: Banking systems, inventory management, any
multi-step operations requiring consistency
Common Interview Scenarios
Money transfer between accounts (classic example)
Customer and address insertion (both must succeed)
Inventory updates with sales recording
Multi-table updates requiring consistency
Error recovery in stored procedures
Best Practices
Always include error handling in explicit transactions
Use meaningful savepoint names
Keep transactions as short as possible
Avoid user interaction within transactions
Use appropriate transaction isolation levels
Test rollback scenarios thoroughly
SQL Server Exception Handling - Complete
Interview Guide
Why Exception Handling is Needed
Key Problem: In SQL Server, when an error occurs, it displays the error message
but continues executing subsequent statements. This can confuse users because they
might see both error messages and incorrect results.
Example: A division by zero error still shows "RESULT IS: 0" after the error,
which shouldn't happen.
SQL Server vs Programming Languages:
Programming Languages (C#, Java): Program terminates abnormally when
exception occurs
SQL Server: Displays error but continues execution (problematic behavior)
Goal: Stop execution of error-related statements while allowing unrelated
statements to continue
Pre-2005 Exception Handling Methods
RAISERROR System Function
RAISERROR('Error Message', ErrorSeverity, ErrorState)
Three Parameters:
1. Error Message: Custom message to display (max 2047 characters)
2. Error Severity: Set to 16 for general user-correctable errors
3. Error State: Integer between 1-255 (1-127 for custom errors)
@@ERROR System Function
Returns NON-ZERO value if error exists
Returns ZERO if previous statement executed successfully
Used in SQL Server 2000 for error detection
Error Attributes in SQL Server
Every error has four attributes:
1. Error Number: Unique identifier (below 50,000 for predefined errors; user-defined
messages start at 50,001, with 50,000 reserved for ad-hoc RAISERROR messages)
2. Error Message: Brief description of the error
3. Severity Level: Importance level (0-24)
4. Error State: Arbitrary value (0-255)
Severity Level Categories:
0-9: Informational/status messages
11-16: User-correctable errors
17-19: Software errors (report to system admin)
20-24: Fatal errors (connection terminates immediately)
Explicit Error Raising Methods
1. RAISERROR Statement
RAISERROR (errorid/errormsg, SEVERITY, state) [WITH LOG]
2. THROW Statement (SQL Server 2012+)
THROW error_number, error_message, state
Key Differences Between RAISERROR and THROW:
Execution Continuation: RAISERROR - continues after the error (without TRY-CATCH); THROW - terminates the batch abnormally
Severity Level: RAISERROR - can specify custom severity; THROW - always severity 16
Logging: RAISERROR - can use the WITH LOG option; THROW - cannot log to the server log
Parameters: RAISERROR - can specify either an error ID or a message; THROW - must specify both the error number and the message
RAISERROR Advanced Options
WITH LOG Option
Records error in SQL Server log file
Mandatory when severity is 19 or higher
Useful for database administrators to track fatal errors
Substitutional Parameters
RAISERROR ('THE NUMBER %d CANNOT BE DIVIDED BY %d', 16, 1, @No1, @No2) WITH
LOG
Using SysMessage Table
Store custom error messages using SP_ADDMESSAGE
Reference by error ID in RAISERROR
Delete messages using SP_DROPMESSAGE
EXEC sp_addmessage 51000, 16, 'DIVIDE BY ONE ERROR ENCOUNTERED'
RAISERROR (51000, 16, 1) WITH LOG
EXEC sp_dropmessage 51000
TRY-CATCH Blocks (SQL Server 2005+)
Structure
BEGIN TRY
-- Statements that might throw exceptions
END TRY
BEGIN CATCH
-- Error handling code
END CATCH
Behavior:
No Error: CATCH block skipped, execution continues after CATCH
Error Occurs: Control immediately jumps to CATCH block from error line
Important: Errors trapped by CATCH are NOT returned to calling
application unless explicitly raised
System Functions in CATCH Block:
ERROR_MESSAGE(): Returns the error message text
ERROR_NUMBER(): Returns the error number
ERROR_SEVERITY(): Returns the severity level
ERROR_STATE(): Returns the error state
ERROR_LINE(): Returns the line number where error occurred
ERROR_PROCEDURE(): Returns the name of the procedure where error
occurred
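A complete sketch tying the structure and the error functions together (the divide-by-zero is deliberate):

```sql
BEGIN TRY
    DECLARE @Result INT
    SET @Result = 10 / 0    -- raises a divide-by-zero error
    PRINT 'RESULT IS: ' + CAST(@Result AS VARCHAR(10))  -- never reached
END TRY
BEGIN CATCH
    PRINT 'Error '    + CAST(ERROR_NUMBER() AS VARCHAR(10))
        + ' at line ' + CAST(ERROR_LINE() AS VARCHAR(10))
        + ': '        + ERROR_MESSAGE()
END CATCH
```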
Interview Key Points
When to Use Each Method:
Pre-2005: Use RAISERROR and @@ERROR
2005+: Prefer TRY-CATCH blocks (similar to modern programming
languages)
2012+: THROW statement available as alternative to RAISERROR
Best Practices:
1. Always handle division by zero scenarios
2. Use appropriate severity levels
3. Provide meaningful error messages
4. Consider logging critical errors
5. Don't let error-related statements execute after exceptions
6. Use TRY-CATCH for structured error handling
Common Interview Scenarios:
Explain difference between SQL Server and programming language error
handling
Demonstrate RAISERROR vs THROW differences
Show TRY-CATCH implementation
Explain severity levels and their significance
Discuss error logging strategies
System Tables:
sys.messages: Contains all predefined error information
Use for reference and understanding error structures
SQL Server Views - Complete Interview Guide
What is a View?
Definition: A compiled SQL query that acts as a virtual table
Nature: Logical/virtual object (not physical like tables)
Data Storage: Does not store data physically by default
(except indexed views)
Function: Acts as an interface between tables and users
Dependency: Views are dependent objects - they rely on
underlying tables
Key Differences: Tables vs Views
Physical vs Virtual: Tables are physical, views are logical
Independence: Tables are independent, views depend on
base tables
Data Synchronization: Changes in tables reflect in views
and vice versa
Storage: Tables store data, views store queries
Types of Views
1. Simple Views (Updatable Views)
Based on: Single table
DML Operations: All operations allowed (SELECT, INSERT,
UPDATE, DELETE)
Also called: Updatable views or dynamic views
Example: CREATE VIEW vwAllEmployees AS SELECT * FROM
Employee
2. Complex Views
Based on: Multiple tables OR single table with special
conditions
DML Limitations: May not perform DML operations correctly
Makes a view complex:
o Multiple tables (JOINs)
o DISTINCT clause
o Aggregate functions
o GROUP BY clause
o HAVING clause
o Calculated columns
o Set operations
DML Operations on Views
Simple Views
SELECT: SELECT * FROM viewName
INSERT: INSERT INTO viewName VALUES(...)
UPDATE: UPDATE viewName SET column = value WHERE
condition
DELETE: DELETE FROM viewName WHERE condition
Complex Views
Single table update: May succeed but might not update
correctly
Multiple table update: Fails with error "View is not
updatable because modification affects multiple base tables"
Solution: Use INSTEAD OF triggers for proper updates
View Options and Features
WITH CHECK OPTION
Purpose: Prevents DML operations that violate the view's
WHERE condition
Usage: ALTER VIEW viewName AS SELECT ... WHERE
condition WITH CHECK OPTION
Effect: Ensures all inserted/updated records satisfy the view's
filter condition
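A sketch of the option in action (the view, table, and column names are illustrative):

```sql
CREATE VIEW vwITEmployees
AS
SELECT EmployeeID, Name, Department
FROM dbo.Employee
WHERE Department = 'IT'
WITH CHECK OPTION
GO
-- Fails: the new row would fall outside the view's WHERE condition
INSERT INTO vwITEmployees (EmployeeID, Name, Department)
VALUES (101, 'Sam', 'HR')
```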
WITH ENCRYPTION
Purpose: Hides the view definition text
Effect:
o Text becomes NULL in syscomments table
o sp_helptext shows "text is encrypted"
Usage: CREATE VIEW viewName WITH ENCRYPTION AS ...
WITH SCHEMABINDING
Purpose: Binds view to underlying database objects
Restrictions:
o Cannot drop or alter referenced tables
o Must specify column names (no *)
o Must use two-part naming (SchemaName.TableName)
Usage: CREATE VIEW viewName WITH SCHEMABINDING AS ...
Combined: Can use WITH ENCRYPTION and WITH
SCHEMABINDING together
Indexed Views
Definition and Purpose
What: Views with physical data storage through indexes
First Index: Must be a unique clustered index
Data Storage: Result set is persisted on disk
Performance: Significantly improves query performance for
JOINs and aggregations
Rules for Creating Indexed Views
1. Must use WITH SCHEMABINDING
2. Handle NULL values with ISNULL() for aggregate expressions
3. Must include COUNT_BIG(*) if GROUP BY is used
4. Use two-part naming for base tables
5. First index must be unique clustered index
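Applying those rules, a minimal indexed-view sketch over a hypothetical Sales table:

```sql
CREATE VIEW dbo.vwSalesSummary
WITH SCHEMABINDING                               -- rule 1
AS
SELECT ProductID,
       SUM(ISNULL(Amount, 0)) AS TotalAmount,    -- rule 2: guard aggregates against NULLs
       COUNT_BIG(*)           AS RowCnt          -- rule 3: required with GROUP BY
FROM dbo.Sales                                   -- rule 4: two-part name
GROUP BY ProductID
GO
-- Rule 5: the first index must be unique and clustered
CREATE UNIQUE CLUSTERED INDEX IX_vwSalesSummary
ON dbo.vwSalesSummary (ProductID)
```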
Performance Considerations
SELECT Performance: Dramatically improved (logical reads
reduced significantly)
DML Performance: Significantly decreased (higher logical
reads for INSERT/UPDATE/DELETE)
Maintenance Cost: Much higher than regular table indexes
When to Use Indexed Views
Ideal for: OLAP systems (reporting and analysis)
Avoid for: OLTP systems (frequent data changes)
Best scenario: Infrequently changed underlying data
Enterprise vs Standard: Enterprise edition uses indexed views
automatically; Standard edition requires the WITH (NOEXPAND) hint
Advantages of Views
1. Security Implementation
Row-level security: Create views with WHERE conditions to
limit data access
Column-level security: Exclude sensitive columns (like
salary) from views
Example: Create IT department-only view, hide salary
columns
2. Complexity Hiding
Simplify JOINs: Hide complex JOIN logic from end users
User-friendly: Non-IT users can query simplified views
instead of complex tables
3. Data Presentation
Aggregated data: Present summary information hiding
detailed data
Consistent interface: Provide standardized data access
patterns
Limitations and Disadvantages
Major Limitations
1. No Parameters: Cannot pass parameters to views (use Table-
Valued Functions instead)
2. ORDER BY Restrictions: Cannot use ORDER BY unless with
TOP, OFFSET, or FOR XML
3. Temporary Tables: Cannot create views based on temporary
tables
4. No Rules/Defaults: Cannot associate rules and defaults with
views
Performance Concerns
Complex views: May have poor performance with multiple
JOINs
Indexed views: High maintenance overhead for frequently
changing data
Advanced Concepts
Views on Views
Possible: Can create views based on other views
Considerations: Performance may degrade with multiple
layers
Dropping Tables with Dependent Views
Allowed: Can drop tables even with dependent views
Effect: Views become inactive but remain in database
Recovery: Views become active when table is recreated with
same structure
Table-Valued Functions as View Alternatives
Purpose: Replacement for parameterized views
Usage: CREATE FUNCTION fnName(@param) RETURNS TABLE
Advantage: Can accept parameters unlike views
Interview Tips and Key Points
Must Remember
1. View Definition: Virtual table storing compiled SQL query
2. Two Types: Simple (single table, fully updatable) vs Complex
(multiple tables/conditions, limited DML)
3. Security: Row-level and column-level security
implementation
4. Indexed Views: Physical storage for performance, high
maintenance cost
5. Limitations: No parameters, ORDER BY restrictions, no temp
tables
Common Interview Questions
Difference between simple and complex views
When to use indexed views vs regular views
Security implementation using views
Performance implications of indexed views
View limitations and workarounds
DML operations on different view types
Practical Knowledge
Creating views with various options (CHECK OPTION,
ENCRYPTION, SCHEMABINDING)
Understanding when DML operations fail on complex views
Performance tuning with indexed views
Using views for security implementation
Troubleshooting view-related issues