PracHub

Design an in-memory database

Last updated: May 18, 2026

Quick Overview

This question tests the design of a low-latency in-memory key–value store: core data structures, durability mechanisms, replication and scaling, memory management, concurrency models, and operational concerns. It falls in the System Design category for a Machine Learning Engineer role.

Company: OpenAI

Role: Machine Learning Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen



Jul 15, 2025, 12:00 AM

System Design: In-Memory Key–Value Database for Ultra-Low Latency

Context

You are designing an in-memory key–value database optimized for ultra-low-latency reads and writes, with performance targets stated per node. It must support point lookups, prefix scans, TTLs with automatic expiration, and optional transactions with snapshot isolation, and it must meet the availability and durability targets listed below.

Functional Requirements

  • Commands:
    • SET(key, value[, ttl])
    • GET(key)
    • DELETE(key)
    • MGET(keys)
    • SCAN(prefix, limit, cursor)
  • Optional:
    • Transactions with snapshot isolation (MULTI/EXEC)
    • Atomic increments (e.g., INCRBY)
    • Pub/Sub notifications on key changes

Non-Functional Requirements

  • Per-node throughput: 50k ops/s
  • Latency: p99 GET < 2 ms; p99 SET < 5 ms
  • Availability: 99.99%
  • Durability targets:
    • RPO ≤ 1 s
    • RTO ≤ 60 s on crash/restart

Sub-Questions

(a) Choose core data structures (e.g., hash table for point lookups, radix/ART or skip list for prefix scans). Explain complexity, memory overhead, and cache behavior.
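
One possible starting point for (a), sketched in Python under stated assumptions: a built-in dict serves point lookups in O(1) average time, and a sorted key list stands in for the ordered index (radix tree/ART or skip list) that serves prefix scans. The class name `KVIndex` and the "last key returned" cursor format are illustrative choices, not part of the question. The sorted list costs O(n) per new key, which is the main reason a real design would use one of the ordered structures named above.

```python
import bisect


class KVIndex:
    """Dict for point lookups plus a sorted key list for ordered prefix scans."""

    def __init__(self):
        self._data = {}   # key -> value, O(1) average GET/SET
        self._keys = []   # all live keys, kept sorted (stand-in for ART/skip list)

    def set(self, key, value):
        if key not in self._data:
            # O(n) element shift; an ART or skip list avoids this cost.
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        if key not in self._data:
            return False
        del self._data[key]
        del self._keys[bisect.bisect_left(self._keys, key)]
        return True

    def scan(self, prefix, limit, cursor=None):
        """Return (items, next_cursor); next_cursor is None when the scan is done."""
        if cursor is None:
            i = bisect.bisect_left(self._keys, prefix)
        else:
            # Resume strictly after the last key handed out, even if it was deleted.
            i = bisect.bisect_right(self._keys, cursor)
        items = []
        while i < len(self._keys) and len(items) < limit:
            key = self._keys[i]
            if not key.startswith(prefix):
                break
            items.append((key, self._data[key]))
            i += 1
        more = i < len(self._keys) and self._keys[i].startswith(prefix)
        return items, (items[-1][0] if items and more else None)
```

Because keys sharing a prefix are contiguous in sorted order, a scan costs O(log n) to find the start plus O(limit) to read, and a key-based cursor stays valid across concurrent inserts and deletes.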

(b) Provide durability: write-ahead log (WAL) and periodic snapshots; fsync policy, log compaction, and exact crash-recovery steps.
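
For (b), the log-append and replay steps might look like the following minimal Python sketch. The record layout (4-byte length, 4-byte CRC32, JSON payload) and the function names `wal_append`, `wal_sync`, and `wal_replay` are assumptions made for illustration. Snapshots and log compaction are omitted; in a fuller design, recovery would load the latest snapshot first and then replay only the log records written after it.

```python
import json
import os
import zlib


def wal_append(f, op, key, value=None):
    """Append one record: 4-byte length, 4-byte CRC32, then a JSON payload."""
    payload = json.dumps({"op": op, "key": key, "value": value}).encode("utf-8")
    header = len(payload).to_bytes(4, "big") + zlib.crc32(payload).to_bytes(4, "big")
    f.write(header + payload)


def wal_sync(f):
    """Flush user-space buffers, then ask the OS to persist the file."""
    f.flush()
    os.fsync(f.fileno())


def wal_replay(path, store):
    """Re-apply records in order; stop at the first torn or corrupt record."""
    applied = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break  # clean end of log, or a torn header
            length = int.from_bytes(header[:4], "big")
            crc = int.from_bytes(header[4:], "big")
            payload = f.read(length)
            if len(payload) < length or zlib.crc32(payload) != crc:
                break  # torn or corrupt tail: ignore everything after it
            rec = json.loads(payload)
            if rec["op"] == "SET":
                store[rec["key"]] = rec["value"]
            elif rec["op"] == "DELETE":
                store.pop(rec["key"], None)
            applied += 1
    return applied
```

Calling `wal_sync` on a timer of at most one second (group commit) is one fsync policy that fits the RPO ≤ 1 s target while keeping fsync off the per-request path; syncing on every write tightens RPO at the cost of SET latency.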

(c) Scale out: sharding via consistent hashing, leader–follower replication, read replicas, client routing, and failover. State the consistency model and how to achieve it.
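
The sharding part of (c) can be illustrated with a small consistent-hash ring. This is a sketch, assuming SHA-256 as the hash function and 64 virtual nodes per physical node; the class name `HashRing` and the node names are hypothetical. Replication, failover, and client routing tables are out of scope here.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        ring = []
        for node in nodes:
            for i in range(vnodes):
                # Each physical node owns several points on the ring.
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(s):
        # First 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(s.encode("utf-8")).digest()[:8], "big")

    def owner(self, key):
        """Return the node owning the first ring point clockwise from the key."""
        i = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["node-a", "node-b", "node-c"])
shard = ring.owner("user:42")  # one of the three node names
```

The property that matters for scale-out is that adding a node only moves keys onto the new node; keys never shuffle between existing nodes, which bounds the data migrated during resharding.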

(d) Manage memory: allocator strategy, fragmentation control, TTL wheel/timer, and eviction (LRU/LFU) when a memory cap is reached.
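
The TTL wheel mentioned in (d) could be sketched as a single-level timing wheel. The class name `TimingWheel`, the tick granularity, and the restriction that a TTL must fit within one rotation are simplifying assumptions; a production design would typically use hierarchical wheels or per-entry rotation counts for long TTLs, and eviction (LRU/LFU) is not covered here.

```python
class TimingWheel:
    """Single-level wheel: slot (tick % size) holds the keys expiring at that tick."""

    def __init__(self, size=60):
        self.size = size
        self.slots = [set() for _ in range(size)]
        self.expiry = {}  # key -> absolute expiry tick
        self.now = 0      # current tick

    def schedule(self, key, ttl_ticks):
        if not 1 <= ttl_ticks <= self.size:
            raise ValueError("TTL must fit in one wheel rotation in this sketch")
        self.cancel(key)  # a new SET replaces any earlier TTL
        due = self.now + ttl_ticks
        self.expiry[key] = due
        self.slots[due % self.size].add(key)

    def cancel(self, key):
        due = self.expiry.pop(key, None)
        if due is not None:
            self.slots[due % self.size].discard(key)

    def tick(self):
        """Advance one tick and return the keys that expire at the new tick."""
        self.now += 1
        slot = self.slots[self.now % self.size]
        expired = sorted(slot)
        slot.clear()
        for key in expired:
            del self.expiry[key]
        return expired
```

Scheduling and cancelling are O(1), and each tick touches only the keys that are actually due, so expiration work is spread evenly instead of requiring periodic scans of the whole keyspace.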

(e) Concurrency model: single-threaded event loop vs. multi-threaded; locking, batching, and pipelining trade-offs.
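
To make the single-threaded option in (e) concrete, here is a toy command loop in Python. It assumes client pipelines have already been read from their sockets into an in-memory queue; the function name `run_event_loop` and the tuple-based command format are hypothetical. Network I/O, TTLs, and durability are omitted.

```python
from collections import deque


def run_event_loop(store, inbox):
    """Drain queued pipelines; each command runs to completion, so no locks are needed."""
    replies = {}
    while inbox:
        client_id, pipeline = inbox.popleft()
        out = replies.setdefault(client_id, [])
        for cmd, *args in pipeline:
            if cmd == "SET":
                store[args[0]] = args[1]
                out.append("OK")
            elif cmd == "GET":
                out.append(store.get(args[0]))
            elif cmd == "INCRBY":
                # Atomic by construction: nothing else runs between read and write.
                store[args[0]] = int(store.get(args[0], 0)) + args[1]
                out.append(store[args[0]])
            elif cmd == "DELETE":
                out.append(1 if store.pop(args[0], None) is not None else 0)
            else:
                out.append(f"ERR unknown command {cmd}")
    return replies


inbox = deque([
    ("c1", [("SET", "n", 1), ("INCRBY", "n", 2)]),
    ("c2", [("INCRBY", "n", 5), ("GET", "n")]),
])
replies = run_event_loop({}, inbox)
```

The trade-off this illustrates: serial execution makes increments and MULTI/EXEC-style batches atomic without locking, and pipelining amortizes per-request overhead, but one slow command stalls every client and a single core caps throughput. A multi-threaded design lifts that cap at the cost of per-shard or per-key synchronization.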

(f) Operations: metrics/slowlog, backups, online config changes, and capacity planning for 100M keys (avg value 200 B) with 64 GB RAM per node.
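
The capacity-planning arithmetic in (f) can be worked through with a short calculation. Only the key count, average value size, and RAM per node come from the question; the 32 B average key size and the 64 B per-entry overhead (hash-table entry, pointers, allocator slack, TTL metadata) are assumptions chosen for illustration and should be replaced with measured figures.

```python
def estimate_memory_gb(num_keys, avg_value_bytes, avg_key_bytes, overhead_bytes):
    """Estimated resident data size in decimal gigabytes (1 GB = 1e9 bytes)."""
    per_entry = avg_value_bytes + avg_key_bytes + overhead_bytes
    return num_keys * per_entry / 1e9


NUM_KEYS = 100_000_000   # from the question
AVG_VALUE_BYTES = 200    # from the question
AVG_KEY_BYTES = 32       # assumption
OVERHEAD_BYTES = 64      # assumption
NODE_RAM_GB = 64         # from the question

values_only_gb = NUM_KEYS * AVG_VALUE_BYTES / 1e9  # raw values: 20 GB
total_gb = estimate_memory_gb(NUM_KEYS, AVG_VALUE_BYTES, AVG_KEY_BYTES, OVERHEAD_BYTES)
utilization = total_gb / NODE_RAM_GB

print(f"values only: {values_only_gb:.1f} GB")
print(f"with keys and overhead: {total_gb:.1f} GB ({utilization:.0%} of node RAM)")
```

Under these assumptions the dataset is roughly 30 GB, under half of a 64 GB node. The remaining headroom has to cover the ordered index for prefix scans, fragmentation, replication buffers, and any copy-on-write growth during snapshots, so whether a single node suffices depends on those measured costs.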

