PracHub
QuestionsCoachesLearningGuidesInterview Prep
|Home/System Design/Databricks

Design a Thread-Safe Buffered Writer with a Background Flush Thread

Last updated: Jul 1, 2026

Quick Overview

This question assesses a candidate's ability to design a concurrent, thread-safe component that decouples fast in-memory writes from slower disk I/O using a dedicated background thread. It probes practical understanding of concurrency control, buffering and batching strategies, backpressure, and durability guarantees in a systems design context. It is commonly used to test applied, real-world concurrent-systems reasoning rather than pure algorithmic recall.

  • medium
  • Databricks
  • System Design
  • Software Engineer

Design a Thread-Safe Buffered Writer with a Background Flush Thread

Company: Databricks

Role: Software Engineer

Category: System Design

Difficulty: medium

Interview Round: Onsite

## Design a Thread-Safe Buffered Writer with a Background Flush Thread Design a reusable in-process component — call it `BufferedDiskWriter` — that many application threads can write records to concurrently, while a single dedicated **master (flusher) thread** is responsible for actually persisting the buffered data to disk. The goal is to keep the producing threads fast (they should not each block on disk I/O), batch many small writes into larger sequential disk writes, and still give callers a clear story about when their data is durable. Think of this as the engine behind an application log, a write-ahead log, or a metrics/event sink: hundreds of producer threads call `append(record)` at high frequency, and you do not want every call to issue its own `write()`/`fsync()` syscall. This is a **design discussion**, not a coding exercise. Walk through the API, the concurrency model, the data structures, the flush policy, and the failure behavior. Justify your trade-offs out loud; pseudo-code for the hot paths is welcome but a fully compiling implementation is not expected. ### Constraints & Assumptions - Single process, single machine, multiple threads (tens to low hundreds of producer threads). Exactly **one** master thread owns the file handle and performs disk writes. - Records are small (tens to a few hundred bytes) opaque byte blobs; throughput can reach `10^5`–`10^6` `append` calls/second in bursts. - The backing store is a local append-only file on a single disk/SSD. - Producers strongly prefer low, predictable latency on `append`; a small bound on how long data may sit unflushed (e.g. tens of milliseconds) is acceptable for the default mode. - Memory is bounded — you cannot buffer an unbounded backlog if the disk falls behind. ### Clarifying Questions to Ask - **Durability contract:** Must `append` return only after the record is durable on disk, or is it fire-and-forget with durability happening asynchronously? Is there a separate `flush()`/`sync()` the caller can await? - **Ordering:** Do we need a total order across all producers, or only per-producer (per-thread) ordering of that thread's own records? - **Loss tolerance on crash:** On process crash or power loss, is it acceptable to lose the last few milliseconds of un-fsync'd records, or is zero loss required for acknowledged writes? - **Backpressure policy:** When producers outrun the disk, should `append` block, drop records, or fail fast? Which records are droppable? - **Record framing & corruption:** Do we own the on-disk format (length prefix, checksums, record boundaries), and must a reader be able to recover after a torn final write? ### Part 1 — Core concurrency model and API Define the public API of `BufferedDiskWriter` and the mechanism by which many producer threads hand records to the single master thread without serializing on a global lock for every byte written. Describe the in-memory data structure(s), how producers enqueue, and how the master thread drains and writes in batches. ```hint Where to start Separate the *producer* fast path (enqueue into memory) from the *consumer* slow path (the master thread doing `write`/`fsync`). The only contention is on the handoff structure, not on disk. ``` ```hint Data structure Consider double buffering (an "active" buffer producers fill and a "flushing" buffer the master drains), or a bounded multi-producer/single-consumer queue. A lock held only for the O(1) enqueue/swap — not for the disk write — keeps the critical section tiny. ``` #### What This Part Should Cover ```premium-lock What This Part Should Cover ``` ### Part 2 — Flush policy and backpressure Specify when the master thread flushes, and what happens when producers generate records faster than the disk can absorb them. The component must not consume unbounded memory. ```hint Triggers Flush on any of: a size threshold (buffer reached N bytes), a time threshold (at most T milliseconds of latency), or an explicit `flush()` call. "Group commit" amortizes one `fsync` across many records. ``` ```hint Backpressure A bounded buffer forces a decision when full: block the producer (apply backpressure), drop/sample records (acceptable for some logs/metrics), or reject with an error. State which one and why for the target use case. ``` #### Clarifying Questions for this Part - Is this primarily a metrics/log sink where dropping the least-important records under overload is acceptable, or a write-ahead log where every acknowledged record must survive? #### What This Part Should Cover ```premium-lock What This Part Should Cover ``` ### Part 3 — Durability, failure handling, and shutdown Describe the durability guarantee you actually provide and how you handle failures: a slow or full disk, an I/O error from `write`/`fsync`, a crash midway through a write, and a clean `close()`. ```hint Durability primitive A successful `write()` only reaches the OS page cache; durability requires `fsync`/`fdatasync`. Decide where `fsync` sits relative to acknowledging a `flush()` so you do not over-promise. ``` ```hint Crash safety Frame each record (length prefix + checksum) so a reader can detect and discard a torn trailing record after a crash, and so corruption does not desync the whole file. ``` #### What This Part Should Cover ```premium-lock What This Part Should Cover ``` ### What a Strong Answer Covers ```premium-lock What a Strong Answer Covers ``` ### Follow-up Questions - How would you provide a `flush()` that a producer can **await** until its already-appended records are durable, without blocking other producers? (Hint: sequence numbers + a "durable up to N" watermark the producer can wait on.) - Suppose a single master thread + `fsync` becomes the bottleneck. How would you scale write throughput — multiple files/shards, multiple flusher threads, direct I/O, or group-commit tuning — and what ordering guarantees would you lose? - If you needed total ordering across all producers (not just per-thread), how would you assign order, and what does that cost on the hot path? - How would you make the component observable in production (queue depth, flush latency, drop counts, fsync errors), and which metric would page you first?

Quick Answer: This question assesses a candidate's ability to design a concurrent, thread-safe component that decouples fast in-memory writes from slower disk I/O using a dedicated background thread. It probes practical understanding of concurrency control, buffering and batching strategies, backpressure, and durability guarantees in a systems design context. It is commonly used to test applied, real-world concurrent-systems reasoning rather than pure algorithmic recall.

Related Interview Questions

  • Design a Slack-Like Messaging System - Databricks (medium)
  • Design a Book Price Aggregator - Databricks (medium)
  • Design a Distributed File System - Databricks (medium)
  • Design a stock order manager - Databricks (medium)
  • Design a Hierarchical File System - Databricks (hard)
|Home/System Design/Databricks

Design a Thread-Safe Buffered Writer with a Background Flush Thread

Databricks logo
Databricks
Jun 23, 2026, 12:00 AM
mediumSoftware EngineerOnsiteSystem Design
0
0

Design a Thread-Safe Buffered Writer with a Background Flush Thread

Design a reusable in-process component — call it BufferedDiskWriter — that many application threads can write records to concurrently, while a single dedicated master (flusher) thread is responsible for actually persisting the buffered data to disk. The goal is to keep the producing threads fast (they should not each block on disk I/O), batch many small writes into larger sequential disk writes, and still give callers a clear story about when their data is durable.

Think of this as the engine behind an application log, a write-ahead log, or a metrics/event sink: hundreds of producer threads call append(record) at high frequency, and you do not want every call to issue its own write()/fsync() syscall.

This is a design discussion, not a coding exercise. Walk through the API, the concurrency model, the data structures, the flush policy, and the failure behavior. Justify your trade-offs out loud; pseudo-code for the hot paths is welcome but a fully compiling implementation is not expected.

Constraints & Assumptions

  • Single process, single machine, multiple threads (tens to low hundreds of producer threads). Exactly one master thread owns the file handle and performs disk writes.
  • Records are small (tens to a few hundred bytes) opaque byte blobs; throughput can reach 10^5 – 10^6 append calls/second in bursts.
  • The backing store is a local append-only file on a single disk/SSD.
  • Producers strongly prefer low, predictable latency on append ; a small bound on how long data may sit unflushed (e.g. tens of milliseconds) is acceptable for the default mode.
  • Memory is bounded — you cannot buffer an unbounded backlog if the disk falls behind.

Clarifying Questions to Ask

  • Durability contract: Must append return only after the record is durable on disk, or is it fire-and-forget with durability happening asynchronously? Is there a separate flush() / sync() the caller can await?
  • Ordering: Do we need a total order across all producers, or only per-producer (per-thread) ordering of that thread's own records?
  • Loss tolerance on crash: On process crash or power loss, is it acceptable to lose the last few milliseconds of un-fsync'd records, or is zero loss required for acknowledged writes?
  • Backpressure policy: When producers outrun the disk, should append block, drop records, or fail fast? Which records are droppable?
  • Record framing & corruption: Do we own the on-disk format (length prefix, checksums, record boundaries), and must a reader be able to recover after a torn final write?

Part 1 — Core concurrency model and API

Define the public API of BufferedDiskWriter and the mechanism by which many producer threads hand records to the single master thread without serializing on a global lock for every byte written. Describe the in-memory data structure(s), how producers enqueue, and how the master thread drains and writes in batches.

What This Part Should Cover Premium

Part 2 — Flush policy and backpressure

Specify when the master thread flushes, and what happens when producers generate records faster than the disk can absorb them. The component must not consume unbounded memory.

Clarifying Questions for this Part

  • Is this primarily a metrics/log sink where dropping the least-important records under overload is acceptable, or a write-ahead log where every acknowledged record must survive?

What This Part Should Cover Premium

Part 3 — Durability, failure handling, and shutdown

Describe the durability guarantee you actually provide and how you handle failures: a slow or full disk, an I/O error from write/fsync, a crash midway through a write, and a clean close().

What This Part Should Cover Premium

What a Strong Answer Covers Premium

Follow-up Questions

  • How would you provide a flush() that a producer can await until its already-appended records are durable, without blocking other producers? (Hint: sequence numbers + a "durable up to N" watermark the producer can wait on.)
  • Suppose a single master thread + fsync becomes the bottleneck. How would you scale write throughput — multiple files/shards, multiple flusher threads, direct I/O, or group-commit tuning — and what ordering guarantees would you lose?
  • If you needed total ordering across all producers (not just per-thread), how would you assign order, and what does that cost on the hot path?
  • How would you make the component observable in production (queue depth, flush latency, drop counts, fsync errors), and which metric would page you first?

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More System Design•More Databricks•More Software Engineer•Databricks Software Engineer•Databricks System Design•Software Engineer System Design

Your design canvas — auto-saved

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.