We should combine both perspectives—the logical flow (SQL-level) and the system-level architecture (engine internals)—into a comprehensive, step-by-step guide on how SQL queries execute in a database, using a real query example.
🧠 How a SQL Query Executes (Combined Explanation)
✅ Example Query:
SELECT name FROM employees WHERE department = 'Sales';
This query goes through the following four high-level stages, each containing deeper substeps.
🚦 STEP 1: Client → Transport Subsystem
This is the entry point of the query.
✅ What happens:
- The query hits the Transport Layer (part of DB engine)
- Performs:
- ✅ Connection management
- ✅ Authentication (who are you?)
- ✅ Authorization (can you run this?)
- If valid, the query moves to the Query Processor
🧠 STEP 2: Query Processor Subsystem
This stage builds and optimizes a plan for your query.
2.1 🔍 Query Parsing
- SQL query is converted to a Parse Tree (syntax tree)
- Checks for:
- ❌ Syntax errors
- Tokenizes clauses:
SELECT
,FROM
,WHERE
, etc.
2.2 🔬 Semantic Analysis
- Checks for:
- ❌ Invalid tables/columns (e.g., does
employees
exist?) - ❌ Data type mismatches
- ❌ Invalid tables/columns (e.g., does
2.3 ⚙️ Query Rewriting
- Converts views to base queries
- Flattens subqueries
- Pushes filters closer to data (predicate pushdown)
2.4 🧠 Query Optimization
- Multiple execution plans are generated
- Chooses the cheapest plan using:
- Table stats
- Indexes
- Join orders/types
- Partitioning
- Output = Execution Plan (Physical Plan)
Example: Will it use an index on
department
? Will it scan or seek?
⚙️ STEP 3: Execution Engine
Takes the optimized plan and begins executing step-by-step.
✅ Key Responsibilities:
- Orchestrates operations: Scan → Filter → Project
- Coordinates with the Storage Engine
- Executes operations like:
Table Scan
,Index Seek
Join
,Sort
,Group
- Builds the result row by row / batch by batch
💾 STEP 4: Storage Engine
Handles the low-level read/write and memory/disk operations.
Core Components:
Component | Role |
---|---|
🔐 Lock Manager | Ensures transactional consistency (ACID) |
🧠 Buffer Manager | Loads pages from disk to memory |
🔄 Transaction Manager | Manages BEGIN, COMMIT, ROLLBACK |
💥 Recovery Manager | Handles crash recovery / logs |
📦 File & Index Manager | Reads data blocks / index pages |
Example:
- Checks if required pages are already in memory
- If not, fetches them from disk
- Applies filters (
department = 'Sales'
) - Returns only
name
column values
📤 STEP 5: Result Returned to Client
- Results from the execution engine are:
- Formatted
- Paginated / batched
- Sent over network to the client
🔁 Flow Recap (Combined View)
Client → Transport Layer → Query Processor (Parse → Optimize → Plan)
→ Execution Engine (Coordinate Steps)
→ Storage Engine (Fetch Data, Apply Filters)
→ Execution Engine (Aggregate/Join)
→ Return Result to Client
✅ Real World Example: What Happens for This Query?
SELECT name FROM employees WHERE department = 'Sales';
- Transport Layer:
- Authenticates the user
- Query Processor:
- Parses: Identifies SELECT and WHERE clauses
- Optimizer checks: Is there an index on
department
?
- Execution Engine:
- Calls Storage Engine: “Get rows where
department = 'Sales'
”
- Calls Storage Engine: “Get rows where
- Storage Engine:
- Finds index → Uses Index Seek
- Pulls only
name
from selected rows
- Execution Engine:
- Streams
name
values back to client
- Streams
💡 Bonus: How to See This in Practice
PostgreSQL:
EXPLAIN ANALYZE SELECT name FROM employees WHERE department = 'Sales';
MySQL:
EXPLAIN SELECT name FROM employees WHERE department = 'Sales';
Databricks/Spark:
df.filter("department = 'Sales'").select("name").explain(True)
🧠 Summary Table
Phase | Key Tasks | Tools/Concepts |
---|---|---|
Transport | Connection, Auth, Authorization | SSL, IAM, AAD, JDBC |
Query Processing | Parse → Rewrite → Optimize | Parse Tree, Execution Plan, Cost |
Execution Engine | Run Physical Plan, Call Storage Engine | Operators, Pipelines |
Storage Engine | Fetch/Write Pages, Transaction Control | Buffer Pool, Indexes, ACID |
Output | Return results to client | Pagination, Serialization |
Leave a Reply