PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/System Design/Startups.Com

Design NL-to-SQL performance optimization assistant

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to design a secure, reliable, and cost-conscious natural-language-to-SQL performance diagnostic system that combines ML-based intent parsing, database internals, safe query execution, and operational observability.

  • medium
  • Startups.Com
  • System Design
  • Machine Learning Engineer

Design NL-to-SQL performance optimization assistant

Company: Startups.Com

Role: Machine Learning Engineer

Category: System Design

Difficulty: medium

Interview Round: Technical Screen

## Scenario You are building an internal tool for engineers/analysts to ask **natural-language questions about database performance** (e.g., “Why is the `orders` dashboard slow today?” or “Which queries are causing the most CPU?”). The system must: 1. Interpret the natural-language question. 2. Determine which diagnostic data to query (e.g., slow query logs, query stats, table/index metadata). 3. Generate and run the relevant SQL (and/or EXPLAIN plans) safely. 4. Analyze the results to identify root causes. 5. Produce actionable optimization recommendations (e.g., index suggestions, query rewrites, schema changes, config tuning), with evidence. Assume the database is a relational OLTP system (e.g., PostgreSQL/MySQL) with access to: - Query statistics (e.g., `pg_stat_statements`-like), slow logs - System metrics (CPU/IO), table/index metadata - Ability to run read-only SQL and optionally `EXPLAIN (ANALYZE)` in controlled environments ## Requirements ### Functional - NL question → structured intent (what to investigate, which time window, which service/table/query) - Generate safe SQL diagnostics queries - Execute queries with guardrails (timeouts, read-only, row limits) - Analyze results and produce ranked hypotheses - Output recommendations with confidence and supporting data - Ask clarifying questions when the NL query is ambiguous ### Non-functional - Security: least privilege, tenant isolation, PII protection - Reliability: handle partial data sources; graceful degradation - Latency: interactive (e.g., p95 < 5–10s for common questions) - Auditability: log prompts, SQL, results summaries, and decisions - Cost controls if using LLMs (caching, batching, smaller models) ## Deliverables Describe the end-to-end system design: architecture, main components, data flow, safety mechanisms, and how you would evaluate quality.

Quick Answer: This question evaluates a candidate's ability to design a secure, reliable, and cost-conscious natural-language-to-SQL performance diagnostic system that combines ML-based intent parsing, database internals, safe query execution, and operational observability.

Startups.Com logo
Startups.Com
Jan 6, 2026, 12:00 AM
Machine Learning Engineer
Technical Screen
System Design
2
0
Loading...

Scenario

You are building an internal tool for engineers/analysts to ask natural-language questions about database performance (e.g., “Why is the orders dashboard slow today?” or “Which queries are causing the most CPU?”).

The system must:

  1. Interpret the natural-language question.
  2. Determine which diagnostic data to query (e.g., slow query logs, query stats, table/index metadata).
  3. Generate and run the relevant SQL (and/or EXPLAIN plans) safely.
  4. Analyze the results to identify root causes.
  5. Produce actionable optimization recommendations (e.g., index suggestions, query rewrites, schema changes, config tuning), with evidence.

Assume the database is a relational OLTP system (e.g., PostgreSQL/MySQL) with access to:

  • Query statistics (e.g., pg_stat_statements -like), slow logs
  • System metrics (CPU/IO), table/index metadata
  • Ability to run read-only SQL and optionally EXPLAIN (ANALYZE) in controlled environments

Requirements

Functional

  • NL question → structured intent (what to investigate, which time window, which service/table/query)
  • Generate safe SQL diagnostics queries
  • Execute queries with guardrails (timeouts, read-only, row limits)
  • Analyze results and produce ranked hypotheses
  • Output recommendations with confidence and supporting data
  • Ask clarifying questions when the NL query is ambiguous

Non-functional

  • Security: least privilege, tenant isolation, PII protection
  • Reliability: handle partial data sources; graceful degradation
  • Latency: interactive (e.g., p95 < 5–10s for common questions)
  • Auditability: log prompts, SQL, results summaries, and decisions
  • Cost controls if using LLMs (caching, batching, smaller models)

Deliverables

Describe the end-to-end system design: architecture, main components, data flow, safety mechanisms, and how you would evaluate quality.

Solution

Show

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More System Design•More Startups.Com•More Machine Learning Engineer•Startups.Com Machine Learning Engineer•Startups.Com System Design•Machine Learning Engineer System Design
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.