Design a cloud database write path and recovery

Last updated: Apr 15, 2026

Quick Overview

This question evaluates expertise in designing transactional write paths and crash recovery for cloud-native relational databases that separate compute from durable distributed storage, focusing on durability guarantees, commit protocols, replication and quorum rules, crash recovery mechanisms, write amplification, scalability, and observability.

Company: Amazon

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

Jan 22, 2026, 12:00 AM

System Design (Engine-level): Write Path + Crash Recovery

Design a core subsystem for a cloud-native relational database (Aurora-like) where compute is separated from durable distributed storage.

Goal

Support transactional writes with:

  • high throughput
  • low commit latency
  • crash recovery
  • strong durability guarantees (clearly specify what guarantees)

Requirements / prompts

  1. Write path: Describe how an UPDATE/INSERT flows from compute to durable storage. Where do you place the log (WAL)?
  2. Commit protocol: When does a transaction commit succeed? What acknowledgements are required?
  3. Replication & consistency: How many replicas, what quorum rules, and how do you handle network partitions?
  4. Crash recovery: If the compute node crashes, how does a new node recover state and resume service? What data structures/checkpoints exist?
  5. Write amplification: Identify sources (WAL, page rewrites, compaction) and propose reductions.
  6. Scalability: How do you scale storage and compute independently? Discuss sharding, rebalancing, and hotspot handling.
  7. Observability: What metrics and logs would you add to detect replication lag, redo backlog, and tail latency?
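
One way to make prompts 2 to 4 concrete is to sketch how a compute node might decide that a commit is durable. The snippet below is a minimal illustration, not a reference answer. It assumes six storage replicas with a 4-of-6 write quorum (the configuration Amazon Aurora has publicly described), log sequence numbers (LSNs) that start at 1 and are consecutive, and a hypothetical `QuorumLog` class whose names are invented here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class QuorumLog:
    """Tracks which storage replicas have acknowledged each log record.

    Assumptions (not from the question): 6 replicas, write quorum of 4,
    and LSNs that start at 1 and increase by 1 per record.
    """
    replicas: int = 6
    write_quorum: int = 4
    acks: Dict[int, Set[int]] = field(default_factory=dict)  # LSN -> replica ids

    def __post_init__(self) -> None:
        # Smallest read quorum that overlaps every possible write quorum.
        self.read_quorum = self.replicas - self.write_quorum + 1

    def ack(self, lsn: int, replica_id: int) -> None:
        """Record that one replica has persisted the log record at `lsn`."""
        self.acks.setdefault(lsn, set()).add(replica_id)

    def is_durable(self, lsn: int) -> bool:
        """A single record is durable once a write quorum has acknowledged it."""
        return len(self.acks.get(lsn, ())) >= self.write_quorum

    def durable_lsn(self) -> int:
        """Highest LSN such that it and every earlier LSN is quorum-durable."""
        lsn = 0
        while self.is_durable(lsn + 1):
            lsn += 1
        return lsn

    def can_commit(self, commit_lsn: int) -> bool:
        """Acknowledge a commit only when its whole log prefix is durable."""
        return commit_lsn <= self.durable_lsn()


log = QuorumLog()
for replica in range(4):
    log.ack(1, replica)      # LSN 1: 4 acks, durable
for replica in range(3):
    log.ack(2, replica)      # LSN 2: 3 acks, not yet durable
for replica in range(5):
    log.ack(3, replica)      # LSN 3: 5 acks, but blocked by the gap at LSN 2

print(log.read_quorum)       # 3
print(log.durable_lsn())     # 1
print(log.can_commit(3))     # False
log.ack(2, 5)                # a fourth replica acknowledges LSN 2
print(log.can_commit(3))     # True
```

In this sketch the contiguous durable LSN does two jobs: it gates the commit acknowledgement, and it marks the point up to which a replacement compute node could trust the log during recovery. The gap between the highest LSN issued and `durable_lsn()` is also a natural candidate for a replication-lag metric under prompt 7.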
