Comprehensive guide to important Points and tricky conceptual issues in SQL

Here’s a comprehensive guide to important and tricky conceptual issues in SQL, including NULL behavior, joins, filters, grouping, ordering, and subqueries.

✅ 1. NULLs: The #1 source of confusion

a. `NULL` ≠ `NULL`

NULL = NULL → UNKNOWN (not TRUE)
Must use: IS NULL, IS NOT NULL

b. `NOT IN` with NULL

If subquery has a single NULL, the whole NOT IN fails.
✅ Use LEFT ANTI JOIN instead.

c. Arithmetic with NULL

5 + NULL = NULL
Any arithmetic with NULL gives NULL

✅ 2. JOIN Issues

a. `INNER JOIN` drops unmatched rows.

b. `LEFT JOIN` keeps all rows from the left, even unmatched.

Add WHERE right_col IS NULL to find “non-matching” entries.

c. Beware of duplicate matches

If join keys aren’t unique, you’ll get row explosion.
Use aggregation or row number if necessary.

✅ 3. WHERE vs HAVING

a. `WHERE` filters before aggregation

b. `HAVING` filters after aggregation

SELECT department, COUNT(*) as emp_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;

✅ 4. GROUP BY Pitfalls

a. Only select aggregated columns or grouped columns

SELECT dept, COUNT(*) FROM employees GROUP BY dept; ✅
SELECT name, COUNT(*) FROM employees GROUP BY dept; ❌

✅ 5. ORDER BY and LIMIT

a. `ORDER BY` applies after SELECT and GROUP BY

Often used with LIMIT to find top N

SELECT * FROM sales ORDER BY revenue DESC LIMIT 5;

✅ 6. Subquery Issues

a. `NOT IN (SELECT NULL...)` ⇒ fails

b. Correlated subqueries run per row

SELECT e.*
FROM employees e
WHERE salary > (
  SELECT AVG(salary) FROM employees WHERE department = e.department
);

⚠️ Can be slow if not optimized

✅ 7. DISTINCT vs GROUP BY

a. `DISTINCT` removes duplicate rows

SELECT DISTINCT dept FROM employees;

b. `GROUP BY` groups rows and aggregates

SELECT dept, COUNT(*) FROM employees GROUP BY dept;

✅ 8. BETWEEN is inclusive

WHERE date BETWEEN '2023-01-01' AND '2023-01-31'

Includes both '2023-01-01' and '2023-01-31'

✅ 9. NULL-safe Equality (MySQL specific)

SELECT * FROM t WHERE a <=> b;  -- True even if both NULL

Not supported in most engines; use with caution.

✅ 10. CASE and COALESCE

a. `CASE` for conditional logic

SELECT
  CASE
    WHEN score > 90 THEN 'A'
    WHEN score > 70 THEN 'B'
    ELSE 'C'
  END as grade

b. `COALESCE()` returns first non-NULL

SELECT COALESCE(nickname, name, 'N/A') FROM users;

✅ 11. Window Functions (OVER clause)

Powerful for rankings, moving averages, cumulative sums:

SELECT name, salary,
  RANK() OVER (PARTITION BY dept ORDER BY salary DESC) as dept_rank
FROM employees;

✅ 12. UNION vs UNION ALL

UNION removes duplicates
UNION ALL keeps all rows (faster)

✅ 13. Cartesian Product (Cross Join by mistake)

If you forget ON clause in join, you get:

SELECT * FROM A, B;  -- Big disaster!

✅ 14. Filtering on Aggregates: Don’t put aggregates in WHERE

SELECT dept, AVG(salary)
FROM employees
WHERE AVG(salary) > 50000;  ❌ Invalid

✅ Use HAVING

✅ 15. Boolean logic (3-valued)

TRUE, FALSE, and UNKNOWN
Comparisons with NULL yield UNKNOWN
WHERE filters only TRUE

✅ 16. Common performance tips

Use indexed columns in joins/filters
Avoid SELECT * in production
Filter early (pushdown)
Watch out for implicit type conversion (slows joins)

✅ 17. Date functions differ by dialect

Spark SQL: date_sub(), current_date()
MySQL: DATE_SUB(NOW(), INTERVAL 7 DAY)
Standard SQL: CURRENT_DATE - INTERVAL '7 days'

HintsToday

recent posts

about

Comprehensive guide to important Points and tricky conceptual issues in SQL

✅ 1. NULLs: The #1 source of confusion

a. `NULL` ≠ `NULL`

b. `NOT IN` with NULL

c. Arithmetic with NULL

✅ 2. JOIN Issues

a. `INNER JOIN` drops unmatched rows.

b. `LEFT JOIN` keeps all rows from the left, even unmatched.

c. Beware of duplicate matches

✅ 3. WHERE vs HAVING

a. `WHERE` filters before aggregation

b. `HAVING` filters after aggregation

✅ 4. GROUP BY Pitfalls

a. Only select aggregated columns or grouped columns

✅ 5. ORDER BY and LIMIT

a. `ORDER BY` applies after SELECT and GROUP BY

✅ 6. Subquery Issues

a. `NOT IN (SELECT NULL...)` ⇒ fails

b. Correlated subqueries run per row

✅ 7. DISTINCT vs GROUP BY

a. `DISTINCT` removes duplicate rows

b. `GROUP BY` groups rows and aggregates

✅ 8. BETWEEN is inclusive

✅ 9. NULL-safe Equality (MySQL specific)

✅ 10. CASE and COALESCE

a. `CASE` for conditional logic

b. `COALESCE()` returns first non-NULL

✅ 11. Window Functions (OVER clause)

✅ 12. UNION vs UNION ALL

✅ 13. Cartesian Product (Cross Join by mistake)

✅ 14. Filtering on Aggregates: Don’t put aggregates in WHERE

✅ 15. Boolean logic (3-valued)

✅ 16. Common performance tips

✅ 17. Date functions differ by dialect

Like this:

Discover more from HintsToday

Leave a ReplyCancel reply

recent posts

about

Comprehensive guide to important Points and tricky conceptual issues in SQL

✅ 1. NULLs: The #1 source of confusion

a. NULL ≠ NULL

b. NOT IN with NULL

c. Arithmetic with NULL

✅ 2. JOIN Issues

a. INNER JOIN drops unmatched rows.

b. LEFT JOIN keeps all rows from the left, even unmatched.

c. Beware of duplicate matches

✅ 3. WHERE vs HAVING

a. WHERE filters before aggregation

b. HAVING filters after aggregation

✅ 4. GROUP BY Pitfalls

a. Only select aggregated columns or grouped columns

✅ 5. ORDER BY and LIMIT

a. ORDER BY applies after SELECT and GROUP BY

✅ 6. Subquery Issues

a. NOT IN (SELECT NULL...) ⇒ fails

b. Correlated subqueries run per row

✅ 7. DISTINCT vs GROUP BY

a. DISTINCT removes duplicate rows

b. GROUP BY groups rows and aggregates

✅ 8. BETWEEN is inclusive

✅ 9. NULL-safe Equality (MySQL specific)

✅ 10. CASE and COALESCE

a. CASE for conditional logic

b. COALESCE() returns first non-NULL

✅ 11. Window Functions (OVER clause)

✅ 12. UNION vs UNION ALL

✅ 13. Cartesian Product (Cross Join by mistake)

✅ 14. Filtering on Aggregates: Don’t put aggregates in WHERE

✅ 15. Boolean logic (3-valued)

✅ 16. Common performance tips

✅ 17. Date functions differ by dialect

Like this:

Discover more from HintsToday

Leave a ReplyCancel reply

Discover more from HintsToday

a. `NULL` ≠ `NULL`

b. `NOT IN` with NULL

a. `INNER JOIN` drops unmatched rows.

b. `LEFT JOIN` keeps all rows from the left, even unmatched.

a. `WHERE` filters before aggregation

b. `HAVING` filters after aggregation

a. `ORDER BY` applies after SELECT and GROUP BY

a. `NOT IN (SELECT NULL...)` ⇒ fails

a. `DISTINCT` removes duplicate rows

b. `GROUP BY` groups rows and aggregates

a. `CASE` for conditional logic

b. `COALESCE()` returns first non-NULL