Compute peak concurrent drivers in 24 hours
Company: Rippling
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates the ability to design algorithms and data structures for streaming interval data, specifically computing peak distinct-driver concurrency over a half-open 24-hour time window with attention to boundary inclusivity and distinct-count semantics.
Constraints
- 1 <= len(operations) <= 20000
- For every add operation, start_time < end_time
- -10^9 <= start_time, end_time, T <= 10^9
- driver_id is an integer
- The query window is always the half-open interval [T - 24, T)
Examples
Input: ([("add", 1, 0, 10), ("add", 2, 5, 15), ("query", 12)],)
Expected Output: [2]
Explanation: In the window [-12, 12), driver 1 is active on [0, 10) and driver 2 on [5, 12). The peak overlap is 2 on [5, 10).
Input: ([("add", 1, 0, 10), ("add", 1, 5, 12), ("add", 2, 6, 8), ("query", 9), ("query", 13)],)
Expected Output: [2, 2]
Explanation: Driver 1's two intervals overlap, so they count as one distinct driver. Driver 2 overlaps driver 1 on [6, 8), making the peak 2 for both queries.
Input: ([("add", 1, 0, 5), ("add", 2, 5, 10), ("query", 10)],)
Expected Output: [1]
Explanation: End times are exclusive. Driver 1 is inactive at time 5, exactly when driver 2 becomes active, so the peak is never 2.
Input: ([("add", 1, 0, 30), ("query", 15), ("add", 2, 10, 20), ("query", 15)],)
Expected Output: [1, 2]
Explanation: Queries are processed in stream order. The first query sees only driver 1. The second query uses the same T but also sees driver 2's later insertion, so the peak becomes 2.
Input: ([("query", 0), ("add", 1, -30, -10), ("add", 2, -5, 5), ("query", 0)],)
Expected Output: [0, 1]
Explanation: The first query has no inserted intervals yet, so the answer is 0. For the second query, the window is [-24, 0): driver 1 is active on [-24, -10) and driver 2 on [-5, 0), so the peak is 1.
Solution
def solution(operations):
from bisect import bisect_left
has_query = False
coords = []
for op in operations:
if op[0] == 'add':
_, driver_id, start, end = op
coords.append(start)
coords.append(end)
else:
has_query = True
_, t = op
coords.append(t - 24)
coords.append(t)
if not has_query:
return []
coords = sorted(set(coords))
index = {x: i for i, x in enumerate(coords)}
nseg = len(coords) - 1
if nseg <= 0:
return [0 for op in operations if op[0] == 'query']
size = 1
while size < nseg:
size <<= 1
tree = [0] * (size * 2)
lazy = [0] * (size * 2)
def apply(node, val):
tree[node] += val
lazy[node] += val
def push(node):
if lazy[node]:
apply(node * 2, lazy[node])
apply(node * 2 + 1, lazy[node])
lazy[node] = 0
def range_add(node, left, right, ql, qr, val):
if ql >= right or qr <= left:
return
if ql <= left and right <= qr:
apply(node, val)
return
push(node)
mid = (left + right) // 2
range_add(node * 2, left, mid, ql, qr, val)
range_add(node * 2 + 1, mid, right, ql, qr, val)
tree[node] = tree[node * 2] if tree[node * 2] >= tree[node * 2 + 1] else tree[node * 2 + 1]
def range_max(node, left, right, ql, qr):
if ql >= right or qr <= left:
return 0
if ql <= left and right <= qr:
return tree[node]
push(node)
mid = (left + right) // 2
a = range_max(node * 2, left, mid, ql, qr)
b = range_max(node * 2 + 1, mid, right, ql, qr)
return a if a >= b else b
def add_interval(intervals, start, end):
i = bisect_left(intervals, (start, -10**30))
left = i
if left > 0 and intervals[left - 1][1] >= start:
left -= 1
uncovered = []
cur = start
n = len(intervals)
j = left
while j < n and intervals[j][0] <= end:
a, b = intervals[j]
if b < start:
j += 1
continue
if a > cur:
gap_end = a if a < end else end
if cur < gap_end:
uncovered.append((cur, gap_end))
if b > cur:
cur = b
if cur >= end:
break
j += 1
if cur < end:
uncovered.append((cur, end))
new_start, new_end = start, end
j = left
while j < n and intervals[j][0] <= new_end:
a, b = intervals[j]
if b < new_start:
j += 1
continue
if a < new_start:
new_start = a
if b > new_end:
new_end = b
j += 1
intervals[left:j] = [(new_start, new_end)]
return uncovered
merged = {}
answers = []
for op in operations:
if op[0] == 'add':
_, driver_id, start, end = op
intervals = merged.setdefault(driver_id, [])
for a, b in add_interval(intervals, start, end):
l = index[a]
r = index[b]
if l < r:
range_add(1, 0, size, l, r, 1)
else:
_, t = op
l = index[t - 24]
r = index[t]
answers.append(0 if l >= r else range_max(1, 0, size, l, r))
return answersTime complexity: Preprocessing coordinate compression takes O(U log U), where U is the number of unique timestamps from interval endpoints and query boundaries. Each add operation takes O(log k + m + p log U), where k is the number of merged intervals for that driver, m is the number of that driver's merged intervals touched/overlapped by the new interval, and p is the number of newly uncovered pieces produced by that insert. Each query takes O(log U).. Space complexity: O(U + total_merged_intervals_across_all_drivers).
Hints
- A single driver must never be counted twice at the same time. Maintain each driver's coverage as a sorted list of merged intervals, and when a new interval arrives, only add the parts not already covered by that driver.
- After coordinate compression, every newly uncovered sub-interval becomes a range +1 update on a segment tree, and every query becomes a range maximum query on [T - 24, T).