Understanding how an SQL query executes in a database is essential for performance tuning and system design. Here’s a step-by-step breakdown of what happens under the hood when you run an SQL query like:

SELECT name FROM employees WHERE department = 'Sales';

๐Ÿงญ 0. Query Input (Your SQL)

You submit the SQL query via:

  • SQL CLI / GUI (like pgAdmin, MySQL Workbench)
  • Application backend
  • Notebook (e.g., Databricks, Jupyter)

โš™๏ธ Step-by-Step SQL Query Execution

๐Ÿงฉ Step 1: Parsing

  • The SQL Parser checks for syntax errors.
  • Converts SQL into an abstract syntax tree (AST).
  • Example: Confirms SELECT, FROM, WHERE are valid clauses.

โœ… If there’s a syntax error โ†’ You get an error immediately.


๐Ÿ“Š Step 2: Query Rewrite (Optional)

  • The query rewrite engine transforms the SQL to an equivalent but more efficient version.
  • Includes:
    • View expansion (resolving a view into its base query)
    • Subquery flattening
    • Predicate pushdown
    • Common subexpression elimination

Example:

SELECT * FROM view_emp WHERE dept = 'HR';

โ†’ gets rewritten to underlying table query.


๐Ÿง  Step 3: Query Optimization

  • The query optimizer generates multiple query plans and picks the least-cost plan using:
    • Table statistics (row count, distinct values)
    • Index availability
    • Join order and type
    • Available keys, partitions

Example decision:

Should I scan the whole table or use an index on department?

โœ… Outputs an execution plan (physical plan).


๐Ÿ› ๏ธ Step 4: Query Plan Execution (Physical Operators)

The engine executes the physical plan with:

  • Scans (full table or index)
  • Filters (apply WHERE conditions)
  • Joins, Aggregations
  • Sorts, Grouping

All operators work in a pipeline, usually row-by-row or batch-by-batch.


๐Ÿ’พ Step 5: Data Fetch / Buffer / Storage Layer

  • Fetches actual data blocks/pages from:
    • Memory (buffer pool / cache)
    • Disk (if not cached)
  • Applies filtering logic (e.g., department = 'Sales')

๐Ÿ” Uses indexes (if available) to avoid full scan.


๐Ÿ“ค Step 6: Results Return

  • The output rows are passed back:
    • Through client interface (ODBC, JDBC)
    • Shown in your query editor / app
  • May be paginated or streamed depending on client

๐Ÿ—‚๏ธ Summary Flowchart

SQL Query 
   โ†“
1. Parsing โ†’ Syntax Tree
   โ†“
2. Rewrite (Views, Subqueries)
   โ†“
3. Optimization (Plan + Cost Model)
   โ†“
4. Execution Plan (Scan, Filter, Join, Aggregate)
   โ†“
5. Fetch Data from Memory/Disk
   โ†“
6. Return Result to User

๐Ÿ” Example: Optimized Plan for This Query

SELECT name FROM employees WHERE department = 'Sales';
  • Use index on department (if exists)
  • Perform index seek to locate matching rows
  • Retrieve name column using a covering index (if name is in index)
  • Avoid full scan

๐Ÿงช Tools to View Execution Plan

DB EngineCommand
PostgreSQLEXPLAIN ANALYZE SELECT ...
MySQLEXPLAIN SELECT ...
SQL ServerUse “Actual Execution Plan”
Databricks.explain(True) or Visual Plan

๐Ÿง  Bonus: What Affects Execution?

  • Indexes and stats
  • Query complexity (joins, subqueries)
  • Table size and partitioning
  • Caching
  • Database engine (PostgreSQL vs MySQL vs Spark SQL)


Discover more from HintsToday

Subscribe to get the latest posts sent to your email.