SQL Interview Questions
1. What is SQL?
Ans. SQL (Structured Query Language) is a programming language used for managing
relational databases. It allows users to store, manipulate, and retrieve data from
databases.
2. What are the different types of SQL statements?
Ans. SQL statements can be categorized into three main types:
i. Data Definition Language (DDL): Used for creating, altering, and dropping
database objects.
ii. Data Manipulation Language (DML): Used for querying, inserting, updating, and
deleting data.
iii. Data Control Language (DCL): Used for controlling access to the database,
granting or revoking privileges.
Many references also list Transaction Control Language (TCL), which covers COMMIT,
ROLLBACK, and SAVEPOINT, as a fourth type.
3. What is a primary key?
Ans. A primary key is a column or a set of columns that uniquely identifies each
record in a table. It ensures data integrity and allows efficient retrieval of data.
4. What is a foreign key?
Ans. A foreign key is a column or a set of columns in a table that refers to the
primary key of another table. It establishes a relationship between the two tables.
5. What is a composite key?
Ans. A composite key is a primary key composed of two or more columns. Together,
these columns uniquely identify each record in a table.
6. What is the difference between DELETE and TRUNCATE?
Ans. DELETE is a DML statement used to remove specific rows from a table, whereas
TRUNCATE is a DDL statement used to remove all rows from a table, effectively
resetting the table.
7. What is a subquery?
Ans. A subquery is a query nested within another query. It can be used to retrieve
data from one table based on values from another table or perform complex
calculations.
8. What is the difference between a subquery and a join?
Ans. A subquery is a query nested within another query, whereas a join is used to
combine rows from two or more tables based on related columns.
9. What is a self-join?
Ans. A self-join is a join operation where a table is joined with itself. It is useful
when
you want to compare rows within the same table.
10. What are the different types of JOIN operations?
Ans. The different types of JOIN operations are:
i. INNER JOIN: Returns only the matching rows from both tables.
ii. LEFT JOIN: Returns all rows from the left table and matching rows from the
right
table.
iii. RIGHT JOIN: Returns all rows from the right table and matching rows from the
left table.
iv. FULL JOIN: Returns all rows from both tables.
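The difference between INNER JOIN and LEFT JOIN can be sketched with a small in-memory example. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in database, and the tables and rows are invented sample data. RIGHT JOIN and FULL JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

# Hypothetical sample data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Engineering');
    INSERT INTO employees VALUES (1, 'Asha', 1), (2, 'Ben', 2), (3, 'Chen', NULL);
""")

# INNER JOIN keeps only employees that have a matching department.
inner_rows = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees e
    INNER JOIN departments d ON e.department_id = d.department_id
    ORDER BY e.employee_id
""").fetchall()

# LEFT JOIN keeps every employee; unmatched rows get NULL (None in Python).
left_rows = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees e
    LEFT JOIN departments d ON e.department_id = d.department_id
    ORDER BY e.employee_id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Here the employee without a department appears only in the LEFT JOIN result, paired with a NULL department name.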
11. What is normalization in SQL?
Ans. Normalization is the process of organizing data in a database to eliminate
redundancy and dependency issues. It involves splitting tables into smaller, more
manageable entities.
12. What are the different normal forms in database normalization?
Ans. The different normal forms are:
i. First Normal Form (1NF): Ensures atomicity of values (one value per column per
row) and eliminates repeating groups and duplicate rows.
ii. Second Normal Form (2NF): Ensures that each non-key column depends on the
entire primary key.
iii. Third Normal Form (3NF): Ensures that each non-key column depends only on
the primary key and not on other non-key columns.
iv. Fourth Normal Form (4NF): Eliminates multi-valued dependencies.
v. Fifth Normal Form (5NF): Eliminates join dependencies.
13. What is an index?
Ans. An index is a database structure that improves the speed of data retrieval
operations on database tables. It allows faster searching, sorting, and filtering of
data.
14. What is a clustered index?
Ans. A clustered index determines the physical order of data in a table. Each table
can
have only one clustered index, and it is generally created on the primary key
column(s).
15. What is a non-clustered index?
Ans. A non-clustered index is a separate structure from the table that contains a
sorted list of selected columns. It enhances the performance of searching and
filtering operations.
16. What is the difference between a primary key and a unique key?
Ans. A primary key is a column or a set of columns that uniquely identifies each
record
in a table and cannot contain NULL values. A unique key, on the other hand,
allows
NULL values and enforces uniqueness but does not automatically define the
primary identifier of a table.
17. What is ACID in database transactions?
Ans. ACID stands for Atomicity, Consistency, Isolation, and Durability. It is a set of
properties that ensure reliability and integrity in database transactions.
18. What is the difference between UNION and UNION ALL?
Ans. UNION combines the result sets of two or more SELECT statements and
removes
duplicates, whereas UNION ALL combines the result sets without removing
duplicates.
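A minimal sketch of the difference, assuming two invented customer tables and using Python's built-in sqlite3 module with an in-memory database:

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE online_customers (name TEXT);
    CREATE TABLE store_customers (name TEXT);
    INSERT INTO online_customers VALUES ('Asha'), ('Ben');
    INSERT INTO store_customers VALUES ('Ben'), ('Chen');
""")

# UNION removes the duplicate 'Ben'.
union_rows = conn.execute("""
    SELECT name FROM online_customers
    UNION
    SELECT name FROM store_customers
    ORDER BY name
""").fetchall()

# UNION ALL keeps every row from both inputs, duplicates included.
union_all_rows = conn.execute("""
    SELECT name FROM online_customers
    UNION ALL
    SELECT name FROM store_customers
    ORDER BY name
""").fetchall()

print(union_rows)
print(union_all_rows)
```

Because UNION has to detect duplicates, UNION ALL is usually the cheaper choice when duplicates are impossible or acceptable.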
19. What is a view?
Ans. A view is a virtual table derived from one or more tables. It does not store
data but
provides a way to present data in a customized or simplified manner.
20. What is a stored procedure?
Ans. A stored procedure is a precompiled set of SQL statements that performs a
specific task. It can be called and executed multiple times with different
parameters.
21. What is a trigger?
Ans. A trigger is a set of SQL statements that are automatically executed in
response to
a specific event, such as INSERT, UPDATE, or DELETE operations on a table.
22. What is a transaction?
Ans. A transaction is a logical unit of work that consists of one or more database
operations. It ensures that all operations within the transaction are treated as a
single unit, either all succeeding or all failing.
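The all-or-nothing behaviour can be sketched as follows. This is an illustration only: the accounts table, its CHECK constraint, and the amounts are invented, and Python's sqlite3 module is used as a stand-in for a real database server.

```python
import sqlite3

# Hypothetical accounts table; the CHECK constraint forbids negative balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()

try:
    # Used as a context manager, the connection commits if the block succeeds
    # and rolls back if an exception escapes from it.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + 500 WHERE id = 2")
        # This debit would make the balance negative, so it raises IntegrityError.
        conn.execute("UPDATE accounts SET balance = balance - 500 WHERE id = 1")
except sqlite3.IntegrityError:
    pass  # the credit to account 2 has been rolled back as well

balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(balances)
```

Although the first UPDATE succeeded on its own, the failed second UPDATE causes the whole transaction to be undone, so both balances keep their original values.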
23. What is a deadlock?
Ans. A deadlock is a situation where two or more transactions are unable to
proceed
because each is waiting for a resource held by another transaction. This can
result in a perpetual wait state.
24. What is the difference between CHAR and VARCHAR data types?
Ans. CHAR is a fixed-length character data type that stores a specific number of
characters, while VARCHAR is a variable-length character data type that stores a
varying number of characters.
25. What is the difference between a function and a stored procedure?
Ans. A function returns a value and can be used in SQL statements, whereas a
stored
procedure does not return a value directly but can perform various actions.
26. What is the difference between GROUP BY and HAVING clauses?
Ans. GROUP BY is used to group rows based on one or more columns, while
HAVING is
used to filter grouped rows based on specific conditions.
27. What is the difference between a database and a schema?
Ans. A database is a collection of related data that is stored and organized. A
schema,
on the other hand, is a logical container within a database that holds objects like
tables, views, and procedures.
28. What is a data warehouse?
Ans. A data warehouse is a large repository of data collected from various
sources,
structured and organized to support business intelligence and reporting.
29. What is the difference between OLTP and OLAP?
Ans. OLTP (Online Transaction Processing) is used for day-to-day transactional
operations and focuses on real-time processing. OLAP (Online Analytical
Processing) is used for complex analytical queries and focuses on historical data
analysis.
30. What is a correlated subquery?
Ans. A correlated subquery is a subquery that references columns from the outer
query. It is executed for each row of the outer query, making it dependent on the
outer query's results.
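As a rough sketch, the query below finds employees who earn more than the average of their own department. The inner query refers to e.department_id from the outer query, which is what makes it correlated. The table and rows are invented sample data, run through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical sample data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT,
                            department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        (1, 'Asha', 10, 90000),
        (2, 'Ben',  10, 60000),
        (3, 'Chen', 20, 70000),
        (4, 'Dana', 20, 70000);
""")

# The subquery is logically re-evaluated for each outer row's department.
above_dept_avg = conn.execute("""
    SELECT e.name
    FROM employees e
    WHERE e.salary > (
        SELECT AVG(salary)
        FROM employees
        WHERE department_id = e.department_id
    )
    ORDER BY e.employee_id
""").fetchall()

print(above_dept_avg)
```

Department 10 averages 75,000, so only Asha qualifies; in department 20 both salaries equal the average, so neither row passes the strict comparison.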
31. What is the difference between a temporary table and a table variable?
Ans. A temporary table is a physical table that is created and used temporarily
within a session or a specific scope, whereas a table variable (a SQL Server
feature) is a variable with a structure similar to a table that exists only within
the batch, user-defined function, or stored procedure in which it is declared.
32. What is the difference between UNION and JOIN?
Ans. UNION combines rows from two or more tables vertically, while JOIN
combines
columns from two or more tables horizontally based on related columns.
33. What is the difference between WHERE and HAVING clauses?
Ans. WHERE is used to filter rows before grouping in a query, while HAVING is
used to
filter grouped rows after grouping.
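Both clauses can appear in the same query, which makes the order of evaluation visible. The sketch below uses invented data and Python's sqlite3 module: WHERE first discards low salaries, then HAVING discards groups that are too small.

```python
import sqlite3

# Hypothetical sample data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Asha', 'Sales', 60000),
        ('Ben',  'Sales', 70000),
        ('Chen', 'Sales', 40000),
        ('Dana', 'HR',    80000),
        ('Eli',  'HR',    30000);
""")

# WHERE filters individual rows before grouping;
# HAVING filters the groups produced by GROUP BY.
well_paid_groups = conn.execute("""
    SELECT department, COUNT(*) AS emp_count
    FROM employees
    WHERE salary > 50000
    GROUP BY department
    HAVING COUNT(*) > 1
    ORDER BY department
""").fetchall()

print(well_paid_groups)
```

After the WHERE filter, Sales has two rows and HR has one, so only Sales survives the HAVING condition.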
34. What is the difference between a database and a data warehouse?
Ans. A database is a collection of related data organized for transactional
purposes,
while a data warehouse is a large repository of data organized for analytical
purposes.
35. What is the difference between a primary key and a candidate key?
Ans. A candidate key is a column or a set of columns that can uniquely identify
each
record in a table. A primary key is a chosen candidate key that becomes the main
identifier for the table.
36. What is the difference between a schema and a database?
Ans. A database is a collection of related data, while a schema is a logical
container
within a database that holds objects like tables, views, and procedures.
37. What is a self-join?
Ans. A self-join is a join operation where a table is joined with itself. It is used
when you
want to compare rows within the same table.
38. What is a recursive SQL query?
Ans. A recursive SQL query is a query that refers to its own output in order to
perform
additional operations. It is commonly used for hierarchical or tree-like data
structures.
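One common way to write such a query is a recursive common table expression (WITH RECURSIVE). The sketch below walks an invented manager hierarchy using Python's sqlite3 module; the table, names, and the level column are assumptions made for the example.

```python
import sqlite3

# Hypothetical employee hierarchy, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Asha', NULL),
        (2, 'Ben',  1),
        (3, 'Chen', 1),
        (4, 'Dana', 2);
""")

# The anchor member selects the root (no manager); the recursive member
# repeatedly joins employees to the rows already produced by the CTE.
hierarchy = conn.execute("""
    WITH RECURSIVE chain(employee_id, name, level) AS (
        SELECT employee_id, name, 0
        FROM employees
        WHERE manager_id IS NULL
        UNION ALL
        SELECT e.employee_id, e.name, c.level + 1
        FROM employees e
        JOIN chain c ON e.manager_id = c.employee_id
    )
    SELECT name, level
    FROM chain
    ORDER BY level, name
""").fetchall()

print(hierarchy)
```

The recursion stops on its own once a step produces no new rows, which here happens after Dana at level 2.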
39. What is the difference between a correlated subquery and a nested
subquery?
Ans. A correlated subquery is a subquery that references columns from the outer
query, while a nested subquery is a subquery that is independent of the outer
query.
40. What is the difference between a natural join and an equijoin?
Ans. A natural join is a join operation that automatically matches columns with
the
same name from both tables, whereas an equijoin is a join operation that
explicitly specifies the join condition using equality operators.
41. What is the difference between an outer join and an inner join?
Ans. An inner join returns only the matching rows from both tables, whereas an
outer
join returns all rows from one table and matching rows from the other table(s).
42. What is the difference between a left join and a right join?
Ans. A left join returns all rows from the left table and matching rows from the
right
table, whereas a right join returns all rows from the right table and matching rows
from the left table.
43. What is a full outer join?
Ans. A full outer join returns all rows from both tables, including unmatched
rows, and
combines them based on the join condition.
44. What is a self-referencing foreign key?
Ans. A self-referencing foreign key is a foreign key that references the primary key
of
the same table. It is used to establish hierarchical relationships within a single
table.
45. What is the purpose of the GROUP BY clause?
Ans. The GROUP BY clause is used to group rows based on one or more columns.
It is
typically used with aggregate functions to perform calculations on each group.
46. What is the purpose of the HAVING clause?
Ans. The HAVING clause is used to filter grouped rows based on specific
conditions. It
operates on the results of the GROUP BY clause.
47. What is the purpose of the ORDER BY clause?
Ans. The ORDER BY clause is used to sort the result set based on one or more
columns
in ascending or descending order.
48. What is the purpose of the DISTINCT keyword?
Ans. The DISTINCT keyword is used to retrieve unique values from a column in a
result
set, eliminating duplicate rows.
49. What is the purpose of the LIKE operator?
Ans. The LIKE operator is used in a WHERE clause to search for a specified pattern
in a
column. It allows wildcard characters like % (matches any sequence of
characters) and _ (matches any single character).
50. What is the purpose of the IN operator?
Ans. The IN operator is used in a WHERE clause to check if a value matches any
value in
a list or a subquery.
51. What is the purpose of the BETWEEN operator?
Ans. The BETWEEN operator is used in a WHERE clause to check if a value lies
within a
specified range of values, inclusive of the endpoints.
52. What is the purpose of the EXISTS operator?
Ans. The EXISTS operator is used in a WHERE clause to check if a subquery returns
any
rows. It returns true if the subquery result set is not empty.
53. What is the purpose of the COUNT() function?
Ans. The COUNT() function is used to count the number of rows or non-null
values in a
column.
54. What is the purpose of the SUM() function?
Ans. The SUM() function is used to calculate the sum of values in a column.
55. What is the purpose of the AVG() function?
Ans. The AVG() function is used to calculate the average value of a column.
56. What is the purpose of the MAX() function?
Ans. The MAX() function is used to retrieve the maximum value from a column.
57. What is the purpose of the MIN() function?
Ans. The MIN() function is used to retrieve the minimum value from a column.
58. What is the purpose of the GROUP_CONCAT() function?
Ans. The GROUP_CONCAT() function is used to concatenate values from multiple
rows
into a single string, grouped by a specific column.
59. What is the purpose of the JOIN keyword?
Ans. The JOIN keyword is used to combine rows from two or more tables based
on
related columns.
60. What is a self-referencing table?
Ans. A self-referencing table is a table that has a foreign key column referencing its
own primary key. It is used to represent hierarchical relationships within a single
table.
61. What is the difference between UNION and UNION ALL?
Ans. UNION combines the result sets of two or more SELECT statements and
removes
duplicate rows, whereas UNION ALL combines the result sets without removing
duplicates.
62. What is the purpose of the ROW_NUMBER() function?
Ans. The ROW_NUMBER() function assigns a unique sequential number to each
row
within a result set. It is often used for pagination or ranking purposes.
63. What is the purpose of the RANK() function?
Ans. The RANK() function assigns a rank to each row within a result set based on a
specified criteria, such as ordering by a column. It allows you to identify the
ranking of each row.
64. What is the purpose of the DENSE_RANK() function?
Ans. The DENSE_RANK() function is similar to the RANK() function but assigns
consecutive ranks to rows without gaps. If two rows tie for the same rank, the next
distinct value receives the next consecutive rank; no rank is skipped.
65. What is the purpose of the LAG() function?
Ans. The LAG() function is used to access the value of a previous row within a
result set
based on a specified column. It allows you to compare values across adjacent
rows.
66. What is the purpose of the LEAD() function?
Ans. The LEAD() function is used to access the value of a subsequent row within a
result set based on a specified column. It allows you to compare values across
adjacent rows.
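A small sketch of LAG() and LEAD() side by side, assuming an invented sales table and using Python's sqlite3 module (window functions need SQLite 3.25 or newer):

```python
import sqlite3

# Hypothetical daily sales, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_date TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('2024-01-01', 100),
        ('2024-01-02', 150),
        ('2024-01-03', 120);
""")

# LAG looks one row back and LEAD one row ahead in the window ordering;
# where no such row exists the result is NULL (None in Python).
neighbours = conn.execute("""
    SELECT sale_date, amount,
           LAG(amount)  OVER (ORDER BY sale_date) AS prev_amount,
           LEAD(amount) OVER (ORDER BY sale_date) AS next_amount
    FROM sales
    ORDER BY sale_date
""").fetchall()

for row in neighbours:
    print(row)
```

The first row has no previous value and the last row has no next value, so those positions come back as NULL.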
67. What is the purpose of the COALESCE() function?
Ans. The COALESCE() function is used to return the first non-null value from a list
of
expressions. It is often used to provide a default value when a column value is
null.
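For instance, COALESCE() can pick the first available contact detail and fall back to a default. The contacts table and its rows below are invented for the example, which runs through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical contact data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (name TEXT, phone TEXT, email TEXT);
    INSERT INTO contacts VALUES
        ('Asha', '555-0100', NULL),
        ('Ben',  NULL,       'ben@example.com'),
        ('Chen', NULL,       NULL);
""")

# COALESCE returns its first non-NULL argument, checked left to right.
best_contact = conn.execute("""
    SELECT name, COALESCE(phone, email, 'n/a') AS contact
    FROM contacts
    ORDER BY name
""").fetchall()

print(best_contact)
```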
68. What is the purpose of the CASE statement?
Ans. The CASE statement is used to perform conditional logic within a SQL
statement. It
allows you to evaluate multiple conditions and return different values based on
the result.
69. What is the purpose of the TRUNCATE TABLE statement?
Ans. The TRUNCATE TABLE statement is used to remove all rows from a table,
while
keeping the table structure intact. It is faster than deleting all rows using the
DELETE statement.
70. What is the purpose of the CONSTRAINT keyword?
Ans. The CONSTRAINT keyword is used to define rules and relationships on
columns
within a table. It ensures data integrity and enforces business rules.
71. What is the purpose of the PRIMARY KEY constraint?
Ans. The PRIMARY KEY constraint is used to uniquely identify each record in a
table. It
ensures that the primary key column(s) have unique values and cannot contain
null values.
72. What is the purpose of the FOREIGN KEY constraint?
Ans. The FOREIGN KEY constraint is used to establish a relationship between two
tables
based on a common column. It ensures referential integrity by enforcing that
values in the foreign key column exist in the referenced table's primary key.
73. What is the purpose of the INDEX keyword?
Ans. The INDEX keyword is used to create an index on one or more columns of a
table. It
improves query performance by allowing faster data retrieval based on the
indexed columns.
74. What is the purpose of the CASCADE keyword in a FOREIGN KEY constraint?
Ans. The CASCADE keyword specifies that a change to a referenced row in the parent
table is propagated to the matching rows in the referencing table. With ON UPDATE
CASCADE, changed primary key values are copied to the foreign key values; with ON
DELETE CASCADE, the referencing rows are deleted. This ensures that the
relationship remains valid.
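A minimal sketch of both cascade actions, assuming invented departments and employees tables and using Python's sqlite3 module. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; other databases enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement

# Hypothetical schema and rows, invented for illustration only.
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name TEXT,
        department_id INTEGER REFERENCES departments(department_id)
            ON UPDATE CASCADE ON DELETE CASCADE
    );
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Engineering');
    INSERT INTO employees VALUES (1, 'Asha', 1), (2, 'Ben', 2);
""")

# ON UPDATE CASCADE: changing the parent key rewrites the child foreign key.
conn.execute("UPDATE departments SET department_id = 10 WHERE department_id = 1")
after_update = conn.execute(
    "SELECT name, department_id FROM employees ORDER BY employee_id"
).fetchall()

# ON DELETE CASCADE: deleting the parent row deletes its child rows.
conn.execute("DELETE FROM departments WHERE department_id = 2")
after_delete = conn.execute(
    "SELECT name, department_id FROM employees ORDER BY employee_id"
).fetchall()

print(after_update)
print(after_delete)
```

ON DELETE CASCADE removes data silently, so it is worth choosing deliberately rather than by default.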
75. What is the purpose of the UPDATE statement?
Ans. The UPDATE statement is used to modify existing records in a table. It allows
you to
change the values of one or more columns based on specified conditions.
76. What is the purpose of the DELETE statement?
Ans. The DELETE statement is used to remove one or more records from a table.
It
allows you to delete rows based on specified conditions.
77. What is the purpose of the COMMIT statement?
Ans. The COMMIT statement is used to permanently save all changes made
within a
transaction to the database. Once committed, the changes are visible to other
users.
78. What is the purpose of the ROLLBACK statement?
Ans. The ROLLBACK statement is used to undo all changes made within a
transaction
and restore the database to its previous state. It is typically used when an error
occurs or when the transaction needs to be canceled.
79. What is the purpose of the SAVEPOINT statement?
Ans. The SAVEPOINT statement is used to define a specific point within a
transaction to
which you can roll back. It allows you to undo changes up to a specific savepoint
without rolling back the entire transaction.
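The partial rollback can be sketched like this, with an invented orders table and a hypothetical savepoint name. The connection is opened with isolation_level=None so that Python's sqlite3 module leaves transaction control to the explicit BEGIN, SAVEPOINT, and COMMIT statements.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the statements we issue.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, item TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO orders VALUES (1, 'keyboard')")
conn.execute("SAVEPOINT before_second_order")  # hypothetical savepoint name
conn.execute("INSERT INTO orders VALUES (2, 'monitor')")
# Undo only the work done after the savepoint; the first insert survives.
conn.execute("ROLLBACK TO SAVEPOINT before_second_order")
conn.execute("COMMIT")

items = conn.execute("SELECT item FROM orders ORDER BY order_id").fetchall()
print(items)
```

Only the first order is committed, because the second insert was rolled back to the savepoint while the surrounding transaction stayed open.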
80. What is the purpose of the CONSTRAINT keyword in the ALTER TABLE
statement?
Ans. The CONSTRAINT keyword in the ALTER TABLE statement is used to add,
modify, or
drop constraints on columns within an existing table.
81. What is the purpose of the DISTINCT keyword in the SELECT statement?
Ans. The DISTINCT keyword in the SELECT statement is used to retrieve unique
values
from a column in the result set, eliminating duplicate rows.
82. What is the purpose of the AS keyword in the SELECT statement?
Ans. The AS keyword in the SELECT statement is used to assign an alias to a
column or
a table. It allows you to refer to the column or table by the assigned alias in
subsequent parts of the query.
83. What is the purpose of the ORDER BY clause in the SELECT statement?
Ans. The ORDER BY clause in the SELECT statement is used to sort the result set
based
on one or more columns in ascending or descending order.
84. What is the purpose of the GROUP BY clause in the SELECT statement?
Ans. The GROUP BY clause in the SELECT statement is used to group rows based
on one
or more columns. It is typically used with aggregate functions to perform
calculations on each group.
85. What is the purpose of the HAVING clause in the SELECT statement?
Ans. The HAVING clause in the SELECT statement is used to filter grouped rows
based
on specific conditions. It operates on the results of the GROUP BY clause.
86. What is the purpose of the LIMIT clause in the SELECT statement?
Ans. The LIMIT clause in the SELECT statement is used to restrict the number of
rows
returned by a query. It allows you to specify the maximum number of rows to be
retrieved.
87. What is the purpose of the OFFSET clause in the SELECT statement?
Ans. The OFFSET clause in the SELECT statement is used in conjunction with the
LIMIT
clause to skip a specified number of rows before starting to return the result set.
88. What is the purpose of the JOIN keyword in the SELECT statement?
Ans. The JOIN keyword in the SELECT statement is used to combine rows from
two or
more tables based on related columns. It allows you to retrieve data from multiple
tables in a single query.
89. What is the purpose of the INNER JOIN?
Ans. The INNER JOIN is a join operation that returns only the matching rows from
both
tables based on the specified join condition. It combines rows that have matching
values in the joined columns.
90. What is the purpose of the LEFT JOIN?
Ans. The LEFT JOIN is a join operation that returns all rows from the left table and
the
matching rows from the right table based on the specified join condition. If no
match is found, null values are returned for the right table columns.
91. What is the purpose of the RIGHT JOIN?
Ans. The RIGHT JOIN is a join operation that returns all rows from the right table
and the
matching rows from the left table based on the specified join condition. If no
match is found, null values are returned for the left table columns.
92. What is the purpose of the FULL OUTER JOIN?
Ans. The FULL OUTER JOIN is a join operation that returns all rows from both
tables,
including unmatched rows, and combines them based on the join condition. If no
match is found, null values are returned for the respective columns.
93. What is the purpose of the UNION operator?
Ans. The UNION operator is used to combine the result sets of two or more
SELECT
statements into a single result set. It removes duplicate rows from the final result
set.
94. What is the purpose of the UNION ALL operator?
Ans. The UNION ALL operator is used to combine the result sets of two or more
SELECT
statements into a single result set, including duplicate rows.
95. What is the purpose of the LIKE operator in the WHERE clause?
Ans. The LIKE operator is used in the WHERE clause to search for a specified
pattern in a
column. It allows wildcard characters like % (matches any sequence of
characters) and _ (matches any single character).
96. What is the purpose of the IN operator in the WHERE clause?
Ans. The IN operator is used in the WHERE clause to check if a value matches any
value
in a list or a subquery.
97. What is the purpose of the EXISTS operator in the WHERE clause?
Ans. The EXISTS operator is used in the WHERE clause to check if a subquery
returns any
rows. It returns true if the subquery result set is not empty.
98. What is the purpose of the GROUP BY clause in the SELECT statement?
Ans. The GROUP BY clause in the SELECT statement is used to group rows based
on one
or more columns. It is typically used with aggregate functions to perform
calculations on each group.
99. What is the purpose of the ORDER BY clause in the SELECT statement?
Ans. The ORDER BY clause in the SELECT statement is used to sort the result set
based
on one or more columns in ascending or descending order.
100. What is the purpose of the DISTINCT keyword in the SELECT statement?
Ans. The DISTINCT keyword in the SELECT statement is used to retrieve unique
values
from a column in the result set, eliminating duplicate rows.
SQL Query-Based Questions
1. Select all employees whose salary is greater than 50,000.
Ans. SELECT *
FROM employees
WHERE salary > 50000;
2. Find employees who joined in the last 30 days.
Ans. SELECT *
FROM employees
WHERE joining_date >= CURRENT_DATE - INTERVAL '30 days';
3. Retrieve distinct department names from the employee table.
Ans. SELECT DISTINCT department
FROM employees;
4. Sort employees by salary in descending order.
Ans. SELECT *
FROM employees
ORDER BY salary DESC;
5. Count the number of employees in each department.
Ans. SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
6. Find employees whose name starts with 'A'.
Ans. SELECT *
FROM employees
WHERE name LIKE 'A%';
7. Retrieve top 3 highest-paid employees.
Ans. SELECT *
FROM employees
ORDER BY salary DESC
LIMIT 3;
8. Write a query to find NULL values in a column.
Ans. SELECT *
FROM employees
WHERE manager_id IS NULL;
9. Display employees who don't have a manager assigned.
Ans. SELECT *
FROM employees
WHERE manager_id IS NULL;
10. Difference between WHERE and HAVING? Write queries for both.
Ans. WHERE: filters rows before grouping.
HAVING: filters groups after aggregation.
WHERE example (employees with salary > 50,000)
SELECT *
FROM employees
WHERE salary > 50000;
HAVING example (departments with more than 10 employees)
SELECT department, COUNT(*) AS emp_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
11. Get employees along with their department names
(Assume: employees.department_id → departments.department_id)
SELECT e.employee_id, e.name, d.department_name
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id;
12. Find employees who don't belong to any department
SELECT e.*
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_id IS NULL;
13. List all departments and the number of employees in each
SELECT d.department_name, COUNT(e.employee_id) AS employee_count
FROM departments d
LEFT JOIN employees e
ON d.department_id = e.department_id
GROUP BY d.department_name;
14. Get employees and their manager's name (self join)
(Assume manager_id in employees refers to employee_id of manager)
SELECT e.employee_id, e.name AS employee_name,
m.name AS manager_name
FROM employees e
LEFT JOIN employees m
ON e.manager_id = m.employee_id;
15. Find customers who placed orders in the last 7 days
SELECT DISTINCT c.customer_id, c.customer_name
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '7 days';
16. Fetch employees working in more than one project
(Assume a mapping table employee_projects(employee_id, project_id))
SELECT employee_id
FROM employee_projects
GROUP BY employee_id
HAVING COUNT(DISTINCT project_id) > 1;
17. Difference between INNER JOIN and LEFT JOIN
• INNER JOIN → returns only matching rows.
• LEFT JOIN → returns all rows from left table + matching rows from right
table (NULL if no match).
Example:
-- INNER JOIN: only employees who have a department
SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id;
-- LEFT JOIN: all employees, even if they don’t belong to any department
SELECT e.name, d.department_name
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.department_id;
18. Find products ordered by more than 5 unique customers
SELECT p.product_id, p.product_name
FROM orders o
JOIN products p
ON o.product_id = p.product_id
GROUP BY p.product_id, p.product_name
HAVING COUNT(DISTINCT o.customer_id) > 5;
19. Retrieve employees who never made a sale (anti join)
(Assume sales(employee_id, sale_id))
SELECT e.*
FROM employees e
LEFT JOIN sales s
ON e.employee_id = s.employee_id
WHERE s.sale_id IS NULL;
20. Query using CROSS JOIN
(Creates all combinations — Cartesian product)
SELECT e.name AS employee, p.project_name
FROM employees e
CROSS JOIN projects p;
21. Find the second highest salary
SELECT MAX(salary) AS second_highest_salary
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
(Alternative using OFFSET ... FETCH:)
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
OFFSET 1 ROW
FETCH NEXT 1 ROW ONLY;
22. Get Nth highest salary (N=3)
SELECT DISTINCT salary
FROM employees e1
WHERE 3 - 1 = (
SELECT COUNT(DISTINCT salary)
FROM employees e2
WHERE e2.salary > e1.salary
);
(Window function way:)
SELECT salary
FROM (
SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees
) ranked
WHERE rnk = 3;
23. List departments where average salary > 60,000
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 60000;
24. Find employees earning more than department average
SELECT e.employee_id, e.name, e.salary, e.department_id
FROM employees e
WHERE e.salary > (
SELECT AVG(salary)
FROM employees
WHERE department_id = e.department_id
);
25. Calculate running total of sales
(Assume sales(sale_id, amount, sale_date))
SELECT sale_id, sale_date, amount,
SUM(amount) OVER (ORDER BY sale_date) AS running_total
FROM sales;
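To see what the running total produces, the same query can be run on a few invented rows with Python's sqlite3 module (window functions need SQLite 3.25 or newer). The sample amounts and dates are assumptions made for the sketch.

```python
import sqlite3

# Hypothetical sales rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount INTEGER, sale_date TEXT);
    INSERT INTO sales VALUES
        (1, 100, '2024-01-01'),
        (2, 50,  '2024-01-02'),
        (3, 25,  '2024-01-03');
""")

# Each row's running_total is the sum of all amounts up to and including
# that row in sale_date order.
running = conn.execute("""
    SELECT sale_id, sale_date, amount,
           SUM(amount) OVER (ORDER BY sale_date) AS running_total
    FROM sales
    ORDER BY sale_date
""").fetchall()

running_totals = [row[3] for row in running]
print(running_totals)
```

With the default window frame, rows that share the same sale_date receive the same running total; adding a tie-breaker such as sale_id to the ORDER BY gives a strictly row-by-row total.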
26. Find employees whose salary is above company average
SELECT *
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
27. Get employees with the same salary
SELECT DISTINCT e1.employee_id, e1.name, e1.salary
FROM employees e1
JOIN employees e2
ON e1.salary = e2.salary AND e1.employee_id <> e2.employee_id
ORDER BY e1.salary;
(Or grouped:)
SELECT salary, COUNT(*) AS num_employees
FROM employees
GROUP BY salary
HAVING COUNT(*) > 1;
28. Find employees who earn more than their manager
SELECT e.employee_id, e.name, e.salary, m.name AS manager_name,
m.salary AS manager_salary
FROM employees e
JOIN employees m
ON e.manager_id = m.employee_id
WHERE e.salary > m.salary;
29. Difference between max and min salary in each department
SELECT department_id,
MAX(salary) - MIN(salary) AS salary_difference
FROM employees
GROUP BY department_id;
30. Find employees whose salary is in top 10% overall
(Window function way:)
SELECT employee_id, name, salary
FROM (
SELECT e.*,
PERCENT_RANK() OVER (ORDER BY salary DESC) AS perc_rank
FROM employees e
) ranked
WHERE perc_rank <= 0.10;
31. Rank employees by salary within each department
SELECT employee_id, name, department_id, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS
dept_rank
FROM employees;
32. Find the first and last order date of each customer
SELECT customer_id,
MIN(order_date) AS first_order,
MAX(order_date) AS last_order
FROM orders
GROUP BY customer_id;
33. Calculate 7-day moving average of sales
(Assume sales(sale_date, amount) with exactly one row per day, so that 7 rows span 7 days)
SELECT sale_date, amount,
AVG(amount) OVER (ORDER BY sale_date ROWS BETWEEN 6 PRECEDING AND
CURRENT ROW) AS moving_avg_7d
FROM sales;
34. Find employees whose salary decreased compared to last year
(Assume employee_salaries(employee_id, year, salary))
SELECT e1.employee_id, e1.year, e1.salary AS current_salary,
e2.salary AS last_year_salary
FROM employee_salaries e1
JOIN employee_salaries e2
ON e1.employee_id = e2.employee_id AND e1.year = e2.year + 1
WHERE e1.salary < e2.salary;
35. Identify customers with 3 consecutive failed transactions
(Assume transactions(customer_id, status, txn_date) where status = 'FAILED')
SELECT DISTINCT customer_id
FROM (
SELECT customer_id, status,
LAG(status,1) OVER (PARTITION BY customer_id ORDER BY txn_date) AS
prev1,
LAG(status,2) OVER (PARTITION BY customer_id ORDER BY txn_date) AS
prev2
FROM transactions
) t
WHERE status='FAILED' AND prev1='FAILED' AND prev2='FAILED';
36. Get the previous and next salary for each employee (LAG & LEAD)
SELECT employee_id, name, salary,
LAG(salary) OVER (ORDER BY salary) AS prev_salary,
LEAD(salary) OVER (ORDER BY salary) AS next_salary
FROM employees;
37. Calculate retention: users who logged in on D1, D7, and D30
(Assume logins(user_id, login_date, signup_date), with D1 = signup_date)
SELECT user_id
FROM logins
WHERE login_date = signup_date
INTERSECT
SELECT user_id
FROM logins
WHERE login_date = signup_date + INTERVAL '7 days'
INTERSECT
SELECT user_id
FROM logins
WHERE login_date = signup_date + INTERVAL '30 days';
38. Find running average of employee salaries
SELECT employee_id, name, salary,
AVG(salary) OVER (ORDER BY employee_id ROWS BETWEEN UNBOUNDED
PRECEDING AND CURRENT ROW) AS running_avg
FROM employees;
39. Find employees with highest salary in each department using window
functions (not subqueries)
SELECT employee_id, name, department_id, salary
FROM (
SELECT e.*,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rnk
FROM employees e
) ranked
WHERE rnk = 1;
40. Show RANK vs DENSE_RANK difference with example
RANK() → leaves gaps when there are ties.
DENSE_RANK() → no gaps, ranks are continuous.
Example:
SELECT name, salary,
RANK() OVER (ORDER BY salary DESC) AS rank_example,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank_example
FROM employees;
If salaries are [90k, 85k, 85k, 80k]
• RANK() → 1, 2, 2, 4
• DENSE_RANK() → 1, 2, 2, 3
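The example above can be checked with a short script. This is a sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer); the employee names are invented, and the salaries are the four values from the example.

```python
import sqlite3

# Hypothetical employees carrying the salaries from the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Asha', 90000),
        ('Ben',  85000),
        ('Chen', 85000),
        ('Dana', 80000);
""")

rows = conn.execute("""
    SELECT name, salary,
           RANK()       OVER (ORDER BY salary DESC) AS rank_example,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank_example
    FROM employees
    ORDER BY salary DESC, name
""").fetchall()

ranks = [row[2] for row in rows]        # RANK leaves a gap after the tie
dense_ranks = [row[3] for row in rows]  # DENSE_RANK stays consecutive

print(ranks)
print(dense_ranks)
```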
41. Calculate churn rate of customers
(Churn = customers inactive in a period ÷ total customers at start)
-- Customers active last month but not this month
SELECT
(COUNT(DISTINCT CASE WHEN this_month.user_id IS NULL
THEN last_month.user_id END)::DECIMAL
/ COUNT(DISTINCT last_month.user_id)) * 100 AS churn_rate
FROM (
SELECT DISTINCT user_id FROM logins WHERE login_date BETWEEN
DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
AND DATE_TRUNC('month', CURRENT_DATE) -
INTERVAL '1 day'
) last_month
LEFT JOIN (
SELECT DISTINCT user_id FROM logins WHERE login_date >=
DATE_TRUNC('month', CURRENT_DATE)
) this_month
ON last_month.user_id = this_month.user_id;
42. Build a funnel: impression → click → add-to-cart → purchase
(Assume events(user_id, event_type, event_time))
SELECT
COUNT(DISTINCT CASE WHEN event_type='impression' THEN user_id END) AS
impressions,
COUNT(DISTINCT CASE WHEN event_type='click' THEN user_id END) AS clicks,
COUNT(DISTINCT CASE WHEN event_type='add_to_cart' THEN user_id END) AS
add_to_cart,
COUNT(DISTINCT CASE WHEN event_type='purchase' THEN user_id END) AS
purchases
FROM events;
43. Calculate conversion rate at each funnel stage
WITH funnel AS (
SELECT
COUNT(DISTINCT CASE WHEN event_type='impression' THEN user_id END) AS
impressions,
COUNT(DISTINCT CASE WHEN event_type='click' THEN user_id END) AS clicks,
COUNT(DISTINCT CASE WHEN event_type='add_to_cart' THEN user_id END)
AS add_to_cart,
COUNT(DISTINCT CASE WHEN event_type='purchase' THEN user_id END) AS
purchases
FROM events
)
SELECT *,
(clicks::DECIMAL / impressions) * 100 AS click_rate,
(add_to_cart::DECIMAL / clicks) * 100 AS add_to_cart_rate,
(purchases::DECIMAL / add_to_cart) * 100 AS purchase_rate
FROM funnel;
44. Find repeat purchase rate of customers
SELECT
(COUNT(DISTINCT CASE WHEN purchase_count > 1 THEN customer_id
END)::DECIMAL
/ COUNT(DISTINCT customer_id)) * 100 AS repeat_purchase_rate
FROM (
SELECT customer_id, COUNT(*) AS purchase_count
FROM orders
GROUP BY customer_id
) t;
45. Identify power users (5+ logins per week)
SELECT user_id, DATE_TRUNC('week', login_date) AS week, COUNT(*) AS
login_count
FROM logins
GROUP BY user_id, DATE_TRUNC('week', login_date)
HAVING COUNT(*) >= 5;
46. Find inactive users in the last 60 days
SELECT user_id
FROM users
WHERE user_id NOT IN (
SELECT DISTINCT user_id
FROM logins
WHERE login_date >= CURRENT_DATE - INTERVAL '60 days'
);
47. Calculate Customer Lifetime Value (CLV)
(Revenue per customer × Avg lifetime)
SELECT customer_id, SUM(amount) AS total_revenue
FROM orders
GROUP BY customer_id;
In interviews, you may extend with avg lifespan, churn, margin.
48. Find average time between two purchases per user
SELECT customer_id, AVG(next_order_date - order_date) AS avg_gap
FROM (
SELECT customer_id, order_date,
LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date)
AS next_order_date
FROM orders
) t
WHERE next_order_date IS NOT NULL
GROUP BY customer_id;
49. Get drop-off rate between two funnel steps
Example: Click → Add to Cart
WITH funnel AS (
SELECT
COUNT(DISTINCT CASE WHEN event_type='click' THEN user_id END) AS clicks,
COUNT(DISTINCT CASE WHEN event_type='add_to_cart' THEN user_id END)
AS add_to_cart
FROM events
)
SELECT ( (clicks - add_to_cart)::DECIMAL / clicks ) * 100 AS drop_off_rate
FROM funnel;
50. Identify users who upgraded from free → paid plan
(Assume subscriptions(user_id, plan, start_date))
SELECT DISTINCT user_id
FROM (
SELECT user_id, plan,
LAG(plan) OVER (PARTITION BY user_id ORDER BY start_date) AS prev_plan
FROM subscriptions
) t
WHERE prev_plan='free' AND plan='paid';