PracHub
QuestionsCoachesLearningGuidesInterview Prep
|Home/System Design/Retell

Design and Implement a Mini SQL Query Engine

Last updated: Jul 1, 2026

Quick Overview

This question evaluates a candidate's ability to design and build a small SQL query engine, covering tokenization, parsing, and query execution over in-memory tables. It tests system design and compiler-style thinking, including how to scope a large problem into an extensible v1 and reason about extending it with sorting, aggregation, and joins. This is a system design and practical implementation question assessing both architectural judgment and coding ability.

  • hard
  • Retell
  • System Design
  • Software Engineer

Design and Implement a Mini SQL Query Engine

Company: Retell

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

You're given a set of in-memory tables — each table is a collection of rows over a known, typed schema (e.g., a table is a list of row objects, and each column has a type such as integer, string, or null). Design and implement a small **SQL query engine**: it takes a SQL query string, executes it against the in-memory tables, and returns the result set. Start from a basic subset of SQL and be ready to extend it as the interviewer adds features. You cannot implement all of SQL in an interview window, so an explicit part of this exercise is **scoping**: decide which subset to support first, state your boundaries out loud, confirm them with the interviewer, and design so that new clauses slot in cleanly rather than forcing a rewrite. ### Constraints & Assumptions - Tables are fully in memory (e.g., `tables[name] = [ {col: value, ...}, ... ]`) with a known schema (column names + types). Data fits in memory. - Read-only queries (`SELECT`); no transactions, no persistence, no writes in the base version. - **v1 target:** `SELECT <columns> FROM <table> WHERE <predicate>`, where the predicate supports comparison operators (`=`, `!=`, `<`, `<=`, `>`, `>=`) combined with `AND` / `OR`. - **Extensions to be ready for:** `ORDER BY` + `LIMIT`, aggregate functions (`COUNT`, `SUM`, `AVG`, `MIN`, `MAX`) with `GROUP BY`, and `INNER JOIN` of two tables. - Keywords may be treated case-insensitively; assume a simplified, well-formed grammar unless validation is explicitly requested. ### Clarifying Questions to Ask - Which subset is in scope for v1, and how far do we extend — joins, grouping, subqueries? Where should I stop? - What column types exist, and how are `NULL`s handled in comparisons and in aggregates (SQL three-valued logic, or a simplified rule)? - Should I validate the query against the schema (error on unknown table/column), or assume valid input? - Is the input always a single statement? Can I assume a simplified grammar (no nested expressions in the select list to start)? - What is the expected output shape — an ordered list of rows with a defined column order — and how are `ORDER BY` ties broken? ### Part 1 — Architecture: how the engine is structured Before writing code, lay out the engine's stages and the boundaries between them. What are the components, what does each one consume and produce, and where would future features (JOIN, GROUP BY) plug in? ```hint Pipeline Model it as a classic query pipeline: **tokenizer (lexer) → parser → AST / structured query → executor** over the in-memory tables. Each stage has a narrow contract, so adding a clause means extending the grammar and adding one execution step — not rewriting everything. ``` ```hint Make it extensible Represent execution as **composable operators** — `Scan`, `Filter`, `Project`, `Join`, `Aggregate`, `Sort`, `Limit` — assembled into a tree (the "volcano"/iterator model). New SQL features become new operators, which is exactly how you absorb the interviewer's escalating requirements. ``` #### What This Part Should Cover ```premium-lock What This Part Should Cover ``` ### Part 2 — Implement the core: SELECT … FROM … WHERE Implement the base engine: parse and execute `SELECT col1, col2 FROM table WHERE <predicate>` over the in-memory tables, supporting the comparison operators and `AND` / `OR` in the predicate. Show both the parsing and the execution. ```hint Parsing A hand-written **recursive-descent parser** is plenty. Parse the clause keywords (`SELECT` / `FROM` / `WHERE`) into a structured query object, and parse the `WHERE` expression into a small **predicate tree**: comparison leaves, `AND`/`OR` internal nodes. Mind operator precedence — `AND` binds tighter than `OR`. ``` ```hint Execution Scan the table's rows, evaluate the predicate tree against each row to filter, then project the requested columns. Keep predicate evaluation a single recursive function over the expression tree. ``` #### What This Part Should Cover ```premium-lock What This Part Should Cover ``` ### Part 3 — Extend: ORDER BY, aggregation/GROUP BY, and JOIN The interviewer keeps adding features. Extend the engine to support `ORDER BY` + `LIMIT`, aggregate functions with `GROUP BY`, and an `INNER JOIN` of two tables. Explain how each maps onto your operator model, and where you would set boundaries given limited time. ```hint Map clauses to operators `ORDER BY` → a `Sort` operator on the projected rows; `LIMIT` → a `Limit` operator; `GROUP BY` + aggregates → an `Aggregate` operator that buckets rows by the group key and folds each bucket; `INNER JOIN` → a `Join` operator (nested-loop to start; hash-join if asked for performance). ``` ```hint Ordering of operations Respect SQL's **logical evaluation order**: `FROM`/`JOIN` → `WHERE` → `GROUP BY` → aggregate → (`HAVING`) → `SELECT`/project → `ORDER BY` → `LIMIT`. Getting this order right is what makes `GROUP BY` combined with `ORDER BY` behave correctly. ``` #### What This Part Should Cover ```premium-lock What This Part Should Cover ``` ### What a Strong Answer Covers ```premium-lock What a Strong Answer Covers ``` ### Follow-up Questions - How would you support a `JOIN` of three or more tables, and how would join order affect performance? - How would you add `HAVING` and subqueries (e.g., `WHERE col IN (SELECT ...)`) to your AST and executor? - Where would a query optimizer fit, and what is one rewrite (e.g., predicate pushdown) that would help? - How would you test this engine so you stay confident it's correct as you keep adding features?

Quick Answer: This question evaluates a candidate's ability to design and build a small SQL query engine, covering tokenization, parsing, and query execution over in-memory tables. It tests system design and compiler-style thinking, including how to scope a large problem into an extensible v1 and reason about extending it with sorting, aggregation, and joins. This is a system design and practical implementation question assessing both architectural judgment and coding ability.

Related Interview Questions

  • Design a Cloud Call-Center Platform for Programmatic Outbound Calls - Retell (hard)
|Home/System Design/Retell

Design and Implement a Mini SQL Query Engine

Retell logo
Retell
Jan 10, 2026, 12:00 AM
hardSoftware EngineerTechnical ScreenSystem Design
0
0

You're given a set of in-memory tables — each table is a collection of rows over a known, typed schema (e.g., a table is a list of row objects, and each column has a type such as integer, string, or null). Design and implement a small SQL query engine: it takes a SQL query string, executes it against the in-memory tables, and returns the result set.

Start from a basic subset of SQL and be ready to extend it as the interviewer adds features. You cannot implement all of SQL in an interview window, so an explicit part of this exercise is scoping: decide which subset to support first, state your boundaries out loud, confirm them with the interviewer, and design so that new clauses slot in cleanly rather than forcing a rewrite.

Constraints & Assumptions

  • Tables are fully in memory (e.g., tables[name] = [ {col: value, ...}, ... ] ) with a known schema (column names + types). Data fits in memory.
  • Read-only queries ( SELECT ); no transactions, no persistence, no writes in the base version.
  • v1 target: SELECT <columns> FROM <table> WHERE <predicate> , where the predicate supports comparison operators ( = , != , < , <= , > , >= ) combined with AND / OR .
  • Extensions to be ready for: ORDER BY + LIMIT , aggregate functions ( COUNT , SUM , AVG , MIN , MAX ) with GROUP BY , and INNER JOIN of two tables.
  • Keywords may be treated case-insensitively; assume a simplified, well-formed grammar unless validation is explicitly requested.

Clarifying Questions to Ask

  • Which subset is in scope for v1, and how far do we extend — joins, grouping, subqueries? Where should I stop?
  • What column types exist, and how are NULL s handled in comparisons and in aggregates (SQL three-valued logic, or a simplified rule)?
  • Should I validate the query against the schema (error on unknown table/column), or assume valid input?
  • Is the input always a single statement? Can I assume a simplified grammar (no nested expressions in the select list to start)?
  • What is the expected output shape — an ordered list of rows with a defined column order — and how are ORDER BY ties broken?

Part 1 — Architecture: how the engine is structured

Before writing code, lay out the engine's stages and the boundaries between them. What are the components, what does each one consume and produce, and where would future features (JOIN, GROUP BY) plug in?

What This Part Should Cover Premium

Part 2 — Implement the core: SELECT … FROM … WHERE

Implement the base engine: parse and execute SELECT col1, col2 FROM table WHERE <predicate> over the in-memory tables, supporting the comparison operators and AND / OR in the predicate. Show both the parsing and the execution.

What This Part Should Cover Premium

Part 3 — Extend: ORDER BY, aggregation/GROUP BY, and JOIN

The interviewer keeps adding features. Extend the engine to support ORDER BY + LIMIT, aggregate functions with GROUP BY, and an INNER JOIN of two tables. Explain how each maps onto your operator model, and where you would set boundaries given limited time.

What This Part Should Cover Premium

What a Strong Answer Covers Premium

Follow-up Questions

  • How would you support a JOIN of three or more tables, and how would join order affect performance?
  • How would you add HAVING and subqueries (e.g., WHERE col IN (SELECT ...) ) to your AST and executor?
  • Where would a query optimizer fit, and what is one rewrite (e.g., predicate pushdown) that would help?
  • How would you test this engine so you stay confident it's correct as you keep adding features?

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More System Design•More Retell•More Software Engineer•Retell Software Engineer•Retell System Design•Software Engineer System Design

Your design canvas — auto-saved

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.