Design NL-to-SQL performance optimization assistant
Company: Startups.Com
Role: Machine Learning Engineer
Category: System Design
Difficulty: medium
Interview Round: Technical Screen
## Scenario
You are building an internal tool for engineers/analysts to ask **natural-language questions about database performance** (e.g., “Why is the `orders` dashboard slow today?” or “Which queries are causing the most CPU?”).
The system must:
1. Interpret the natural-language question.
2. Determine which diagnostic data to query (e.g., slow query logs, query stats, table/index metadata).
3. Generate and run the relevant SQL (and/or EXPLAIN plans) safely.
4. Analyze the results to identify root causes.
5. Produce actionable optimization recommendations (e.g., index suggestions, query rewrites, schema changes, config tuning), with evidence.
Assume the database is a relational OLTP system (e.g., PostgreSQL/MySQL) with access to:
- Query statistics (e.g., `pg_stat_statements`-like), slow logs
- System metrics (CPU/IO), table/index metadata
- Ability to run read-only SQL and optionally `EXPLAIN (ANALYZE)` in controlled environments
## Requirements
### Functional
- NL question → structured intent (what to investigate, which time window, which service/table/query)
- Generate safe SQL diagnostics queries
- Execute queries with guardrails (timeouts, read-only, row limits)
- Analyze results and produce ranked hypotheses
- Output recommendations with confidence and supporting data
- Ask clarifying questions when the NL query is ambiguous
### Non-functional
- Security: least privilege, tenant isolation, PII protection
- Reliability: handle partial data sources; graceful degradation
- Latency: interactive (e.g., p95 < 5–10s for common questions)
- Auditability: log prompts, SQL, results summaries, and decisions
- Cost controls if using LLMs (caching, batching, smaller models)
## Deliverables
Describe the end-to-end system design: architecture, main components, data flow, safety mechanisms, and how you would evaluate quality.
Quick Answer: This question evaluates a candidate's ability to design a secure, reliable, and cost-conscious natural-language-to-SQL performance diagnostic system that combines ML-based intent parsing, database internals, safe query execution, and operational observability.