Implement write with internal buffer
Company: Datadog
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Implement a buffered writer over an expensive sink API. You are given a function writeToDevice(byte[] chunk) that may accept at most M bytes per call and has high per-call overhead. Implement a class BufferedWriter with methods: write(byte[] data), flush(), and close(). Requirements: maintain byte order across multiple calls, minimize the number of writeToDevice calls, handle inputs larger than the internal buffer, and ensure no data loss on flush/close. Describe your buffer management, edge cases across consecutive writes, and the time/space complexity. Optionally discuss how you would add thread-safety.
Quick Answer: This question evaluates proficiency with buffered I/O, buffer management, memory handling, and performance optimization, including maintaining byte order, handling partial writes, and edge cases across consecutive write/flush/close operations.
You must simulate a buffered writer over an expensive device API. The device API writeToDevice may accept at most M bytes per call. Implement a function process_operations(ops, M, B) that processes a sequence of operations on a BufferedWriter with internal capacity B and returns the exact sequence of chunks passed to writeToDevice, preserving byte order and obeying the limit M. Operations are objects: {"op": "write", "data": [ints 0..255]}, {"op": "flush"}, or {"op": "close"}. Semantics and policy (to make output deterministic and minimize device calls): 1) write(data): append bytes to the internal buffer; if the buffer becomes full, immediately write min(M, current_buffer_size) bytes from the front of the buffer to the device to free space; repeat as needed while processing the same write call. 2) If the buffer is empty and the current write call has at least B bytes remaining, bypass the buffer and write chunks directly to the device in maximum chunks of size up to M, repeating until fewer than B bytes remain in this write; any remaining (< B) bytes are buffered. 3) flush(): write all remaining buffered bytes to the device in chunks of size up to M, leaving the buffer empty. 4) close(): same as flush(), then no further writes should occur (tests will not issue writes after close). Return the list of device chunks in call order. No device write may exceed M bytes, and no data may be lost.
Constraints
- 1 <= M <= 100000
- 1 <= B <= 100000
- 1 <= len(ops) <= 100000
- Sum of lengths of all write data <= 200000
- Byte values are integers in [0, 255]
- Operations are processed sequentially; writes after close will not be present in tests
Solution
from collections import deque
def process_operations(ops: list, M: int, B: int) -> list:
device_calls = []
def writeToDevice(chunk):
if not chunk:
return
if len(chunk) > M:
raise ValueError("Device call exceeds M bytes")
device_calls.append(list(chunk))
class BufferedWriter:
def __init__(self, M: int, B: int, sink):
self.M = M
self.B = B
self.sink = sink
self.buf = deque() # holds ints
self.closed = False
def _spill_from_buffer(self):
if not self.buf:
return
k = min(self.M, len(self.buf))
out = []
for _ in range(k):
out.append(self.buf.popleft())
self.sink(out)
def write(self, data):
if self.closed:
raise ValueError("write after close")
i = 0
n = len(data)
while i < n:
# Direct pass when buffer is empty and the current write has at least B bytes remaining
if not self.buf and (n - i) >= self.B:
send = min(self.M, n - i)
self.sink(data[i:i + send])
i += send
continue
# Ensure there is space in the buffer; if full, spill to free space
if len(self.buf) == self.B:
self._spill_from_buffer()
continue
# Buffer as much as fits
space = self.B - len(self.buf)
take = min(space, n - i)
if take > 0:
self.buf.extend(data[i:i + take])
i += take
# Next loop iteration will spill if needed
def flush(self):
while self.buf:
self._spill_from_buffer()
def close(self):
if not self.closed:
self.flush()
self.closed = True
writer = BufferedWriter(M, B, writeToDevice)
for op in ops:
typ = op.get('op')
if typ == 'write':
data = op.get('data', [])
writer.write(data)
elif typ == 'flush':
writer.flush()
elif typ == 'close':
writer.close()
else:
raise ValueError(f"Unknown operation: {typ}")
return device_calls
Explanation
Maintain a FIFO buffer (deque) with capacity B. On write(data), if the buffer is empty and at least B bytes remain in this write, bypass the buffer and send chunks directly to the device with size up to M, repeating until fewer than B remain. Otherwise buffer bytes until full; when full, spill min(M, len(buffer)) bytes from the front to free space, which minimizes calls by keeping the small tail in the buffer for future coalescing. flush() drains the buffer fully in chunks of size up to M. close() calls flush() and marks the writer closed. All device writes preserve byte order and never exceed M bytes.
Time complexity: O(total_bytes). Space complexity: O(B).
Hints
- Use a FIFO structure (e.g., deque) to implement the internal buffer efficiently.
- When the buffer is full during write(), spill only min(M, len(buffer)) bytes to free space; do not drain the small tail unless flushing.
- When the buffer is empty and the write batch is large (>= B), bypass the buffer and send maximum chunks (size up to M) directly.
- flush() must drain the buffer completely in chunks of size up to M; close() should call flush().