How do I approach System Design interview questions?

System Design questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master system design interviews.

What difficulty level is this interview question?

This is a hard difficulty System Design question, commonly asked during Technical Screen rounds at OpenAI.

What role is this question designed for?

This question is commonly asked for Machine Learning Engineer candidates at OpenAI during technical interviews.

Design an in-memory database | OpenAI Interview Question

Quick Overview

This question evaluates expertise in system design, low-latency in-memory key–value data structures, durability mechanisms, replication and scaling, memory management, concurrency models, and operational concerns within the System Design domain for a Machine Learning Engineer role.

System Design: In-Memory Key–Value Database for Ultra–Low Latency

Context

You are designing an in-memory, per-node key–value database optimized for ultra–low-latency reads and writes. It must support point lookups, prefix scans, TTLs with automatic expiration, optional transactions with snapshot isolation, and high availability with specified durability targets.

Functional Requirements

Commands:
- SET(key, value[, ttl])
- GET(key)
- DELETE(key)
- MGET(keys)
- SCAN(prefix, limit, cursor)
Optional:
- Transactions with snapshot isolation (MULTI/EXEC)
- Atomic increments (e.g., INCRBY)
- Pub/Sub notifications on key changes

Non-Functional Requirements

Per-node throughput: 50k ops/s
Latency: p99 GET < 2 ms; p99 SET < 5 ms
Availability: 99.99%
Durability targets:
- RPO ≤ 1 s
- RTO ≤ 60 s on crash/restart

Sub-Questions

(a) Choose core data structures (e.g., hash table for point lookups, radix/ART or skip list for prefix scans). Explain complexity, memory overhead, and cache behavior.

(b) Provide durability: write-ahead log (WAL) and periodic snapshots; fsync policy, log compaction, and exact crash-recovery steps.

(c) Scale out: sharding via consistent hashing, leader–follower replication, read replicas, client routing, and failover. State the consistency model and how to achieve it.

(d) Manage memory: allocator strategy, fragmentation control, TTL wheel/timer, and eviction (LRU/LFU) when a memory cap is reached.

(e) Concurrency model: single-threaded event loop vs. multi-threaded; locking, batching, and pipelining trade-offs.

(f) Operations: metrics/slowlog, backups, online config changes, and capacity planning for 100M keys (avg value 200 B) with 64 GB RAM per node.

Quick Overview

Sub-Questions

(a) Choose core data structures (e.g., hash table for point lookups, radix/ART or skip list for prefix scans). Explain complexity, memory overhead, and cache behavior.

(b) Provide durability: write-ahead log (WAL) and periodic snapshots; fsync policy, log compaction, and exact crash-recovery steps.

(c) Scale out: sharding via consistent hashing, leader–follower replication, read replicas, client routing, and failover. State the consistency model and how to achieve it.

(d) Manage memory: allocator strategy, fragmentation control, TTL wheel/timer, and eviction (LRU/LFU) when a memory cap is reached.

(e) Concurrency model: single-threaded event loop vs. multi-threaded; locking, batching, and pipelining trade-offs.

(f) Operations: metrics/slowlog, backups, online config changes, and capacity planning for 100M keys (avg value 200 B) with 64 GB RAM per node.

Design an in-memory database

Quick Overview

System Design: In-Memory Key–Value Database for Ultra–Low Latency

Context

Functional Requirements

Non-Functional Requirements

Sub-Questions

Solution

Comments (0)

Design an in-memory database

Quick Overview

System Design: In-Memory Key–Value Database for Ultra–Low Latency

Context

Functional Requirements

Non-Functional Requirements

Sub-Questions

Solution

Comments (0)