How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a Medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Datadog.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Datadog during technical interviews.

Design log queries and a buffered writer

Quick Overview

This question evaluates understanding of time-indexed data structures and timestamp granularity handling for range queries, along with buffered I/O mechanics including partial-write semantics, error handling, ordering guarantees, and concurrency control.

Company: Datadog

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

Part A — Log store with time-range queries: Implement a data structure that ingests log entries with ISO-8601 timestamps (e.g., YYYY-MM-DD HH:MM:SS) and unique IDs. Support add(id, timestamp) and query(start, end, granularity) where granularity ∈ {Year, Month, Day, Hour, Minute, Second}; query returns IDs whose timestamps fall within [start, end] at the specified granularity. Describe your data structures, update/query time and space complexities, and how you maintain ordering and memory usage. Part B — Buffered writer: Implement a BufferedWriter over an abstract sink write(bytes) that may accept partial writes. Support write(data), flush(), and close(); preserve ordering, chunk large inputs into fixed-size buffers, minimize system calls, and handle errors/partial writes. Discuss thread-safety options and complexity trade-offs.

Quick Answer: This question evaluates understanding of time-indexed data structures and timestamp granularity handling for range queries, along with buffered I/O mechanics including partial-write semantics, error handling, ordering guarantees, and concurrency control.

Part 1: Log Store with Time-Range Queries

Implement a batch simulator for a log store. Each log has a unique integer ID and a timestamp in fixed ISO-8601 format: YYYY-MM-DD HH:MM:SS. You are given operations of two types: add a log entry, or query logs in a time range at a requested granularity. For a query, compare timestamps only up to the requested granularity. For example, with granularity Month, the timestamp 2020-05-20 10:20:30 is treated as 2020-05. Return matching IDs in chronological timestamp order; if multiple logs have the same timestamp, return them in the order they were added.

Constraints

1 <= len(operations) <= 10000
Each timestamp is a valid zero-padded string in format YYYY-MM-DD HH:MM:SS
Log IDs are unique positive integers represented as strings in add operations
start <= end when compared at full timestamp precision
granularity is one of: Year, Month, Day, Hour, Minute, Second

Examples

Input: ([['add', '1', '2017-01-01 23:59:59'], ['add', '2', '2017-01-02 00:00:00'], ['add', '3', '2018-03-01 12:00:00'], ['query', '2017-01-01 00:00:00', '2017-12-31 23:59:59', 'Year'], ['query', '2017-01-01 23:00:00', '2017-01-01 23:59:00', 'Hour']],)

Expected Output: [[1, 2], [1]]

Explanation: The Year query includes both 2017 logs. The Hour query includes only the log in hour 2017-01-01 23.

Input: ([['add', '10', '2020-05-20 10:20:30'], ['add', '11', '2020-05-01 00:00:00'], ['add', '12', '2020-05-20 10:20:30'], ['query', '2020-05-01 00:00:00', '2020-05-31 23:59:59', 'Month'], ['query', '2020-05-20 00:00:00', '2020-05-20 23:59:59', 'Day']],)

Expected Output: [[11, 10, 12], [10, 12]]

Explanation: Results are chronological. Logs 10 and 12 have the same timestamp, so their add order is preserved.

Input: ([['query', '2019-01-01 00:00:00', '2019-12-31 23:59:59', 'Year']],)

Expected Output: [[]]

Explanation: Edge case: querying an empty log store returns an empty result.

Input: ([['add', '5', '2021-12-31 23:59:59'], ['query', '2021-12-31 23:59:59', '2021-12-31 23:59:59', 'Second'], ['query', '2022-01-01 00:00:00', '2022-01-01 00:00:00', 'Minute']],)

Expected Output: [[5], []]

Explanation: The exact-second query finds log 5. The later minute query does not.

Input: ([['query', '2019-01-01 00:00:00', '2019-12-31 23:59:59', 'Year'], ['add', '7', '2019-06-15 08:30:00'], ['query', '2019-06-01 00:00:00', '2019-06-30 23:59:59', 'Month'], ['add', '8', '2019-06-01 00:00:00'], ['query', '2019-06-01 00:00:00', '2019-06-30 23:59:59', 'Month']],)

Expected Output: [[], [7], [8, 7]]

Explanation: Queries reflect only logs added before that query. The final query returns June logs in chronological order.

Hints

Because the timestamp format is fixed-width and zero-padded, lexicographic ordering is the same as chronological ordering.
For each granularity, convert the query into a full lower-bound timestamp and a full upper-bound timestamp, then binary search in a sorted list.

Part 2: Buffered Writer with Partial Sink Writes

Implement a batch simulator for a buffered writer. The writer has an internal fixed-size buffer of size buffer_size. It writes to an abstract sink whose low-level write call may accept only part of the bytes: each sink call accepts at most max_accept characters from the front of the chunk passed to it. Your writer must preserve ordering, only drain the internal buffer when it becomes full or when flush/close is called, and retry partial writes until the drained chunk is fully accepted. The function returns the sequence of chunks actually accepted by the sink. If write, flush, or close is called after the writer is already closed, append the string 'ERROR' to the returned events and ignore that operation.

Constraints

1 <= buffer_size <= 100000
1 <= max_accept <= 100000
1 <= len(operations) <= 10000
The total length of all write data strings is <= 200000
Write data contains only lowercase English letters, so it cannot conflict with the special 'ERROR' event

Examples

Input: (4, 10, [['write', 'ab'], ['write', 'cd'], ['write', 'e'], ['flush']])

Expected Output: ['abcd', 'e']

Explanation: The first two writes fill the 4-character buffer, causing one sink call. The final partial buffer is written by flush.

Input: (5, 2, [['write', 'abcdef'], ['close']])

Expected Output: ['ab', 'cd', 'e', 'f']

Explanation: The full buffer chunk 'abcde' is drained through partial sink writes of size at most 2. Then close drains the remaining 'f'.

Input: (3, 2, [['write', 'a'], ['write', 'bcdef'], ['flush']])

Expected Output: ['ab', 'c', 'de', 'f']

Explanation: The bytes are chunked into full internal buffers 'abc' and 'def'. Each is partially accepted by the sink.

Input: (4, 2, [['flush'], ['write', ''], ['close']])

Expected Output: []

Explanation: Edge case: flushing or closing an empty buffer causes no sink calls, and an empty write has no effect.

Input: (3, 2, [['write', 'ab'], ['close'], ['write', 'c'], ['flush'], ['close']])

Expected Output: ['ab', 'ERROR', 'ERROR', 'ERROR']

Explanation: close drains the pending bytes and closes the writer. All later operations are invalid.

Hints

Separate two ideas: filling the internal buffer, and draining a buffer chunk through a sink that may accept only a prefix.
When draining, keep calling the sink on the remaining suffix until every byte from that buffer chunk has been accepted.

Quick Overview