Implement DelayQueue with Idempotent Task Execution
Company: Pinterest
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates implementation skills in time-based task scheduling, idempotent execution semantics, and concurrency control, emphasizing data-structure selection and synchronization trade-offs.
Constraints
- 1 <= len(operations) <= 200000
- run_at and now are integers in the range [0, 10^12]
- id and task are non-empty strings of length <= 64
- Within a single poll, tasks execute in ascending run_at; ties broken by schedule order
- Each ID executes at most once across all polls; subsequent tasks with the same ID are skipped when popped
- Use O(n) additional space; aim for O(log n) schedule and pop operations
Solution
import heapq
def process_delay_queue(operations: list) -> list:
heap = [] # (run_at, seq, id, task)
executed = set()
results = []
seq = 0
for op in operations:
t = op.get("op")
if t == "schedule":
run_at = op["run_at"]
tid = op["id"]
task = op["task"]
heapq.heappush(heap, (run_at, seq, tid, task))
seq += 1
elif t == "poll":
now = op["now"]
out = []
while heap and heap[0][0] <= now:
run_at, s, tid, task = heapq.heappop(heap)
if tid in executed:
continue
executed.add(tid)
out.append([tid, task])
results.append(out)
else:
raise ValueError("Unknown operation: " + str(t))
return results
Explanation
Time complexity: Each schedule is O(log n). Each poll pops k tasks in O(k log n). Across all operations, total time is O((S + P) log S), where S is schedules and P is total popped tasks.. Space complexity: O(S) for the heap and executed-ID set, where S is the number of scheduled tasks..
Hints
- Maintain a min-heap keyed by (run_at, sequence_number) to fetch the next due task and break ties by insertion order.
- Use a hash set to record executed IDs; skip any popped task whose ID is already in the set.
- On poll(now), pop while heap top run_at <= now; for each popped task, atomically check-and-add the ID to the executed set before executing.
- Return tasks executed in the order they are processed during the poll.