Design test-run logging and query functions
Company: Vanta
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates proficiency in time-series logging, event-driven state transitions, interval and sliding-window algorithms, and efficient data-structure design for streaming updates.
Part 1: Minimum Time for a Test to Recover
Constraints
- 0 <= len(logs) <= 200000
- 0 <= len(query_test_ids) <= 200000
- 1 <= test_id <= 10^9
- 1 <= timestamp <= 10^9
- Timestamps in logs are strictly increasing
- status is always 0 or 1
Examples
Input: ([[1, 1, 0], [2, 2, 0], [1, 4, 0], [1, 5, 1], [2, 6, 1], [1, 10, 0], [1, 13, 1]], [1, 2, 3])
Expected Output: [3, 4, -1]
Explanation: Test 1 has recovery times 5-1=4 and 13-10=3, so the minimum is 3. Test 2 has one recovery time 6-2=4. Test 3 never recovers.
Input: ([[1, 1, 1], [1, 2, 1], [1, 3, 0], [1, 7, 1], [1, 10, 0], [1, 12, 1]], [1])
Expected Output: [2]
Explanation: Initial pass logs do not start a failure series. The completed fail-to-pass durations are 7-3=4 and 12-10=2.
Input: ([[1, 1, 0], [1, 3, 0], [2, 5, 1]], [1, 2])
Expected Output: [-1, -1]
Explanation: Test 1 starts failing but never passes. Test 2 never has a fail-to-pass sequence.
Input: ([], [1])
Expected Output: [-1]
Explanation: With no logs, no test can have a completed recovery.
Solution
def solution(logs, query_test_ids):
current_fail_start = {}
best_time = {}
for test_id, timestamp, status in logs:
if status == 0:
if test_id not in current_fail_start:
current_fail_start[test_id] = timestamp
else:
if test_id in current_fail_start:
duration = timestamp - current_fail_start.pop(test_id)
if test_id not in best_time or duration < best_time[test_id]:
best_time[test_id] = duration
return [best_time.get(test_id, -1) for test_id in query_test_ids]
Time complexity: O(n + q), where n is the number of logs and q is the number of queries. Space complexity: O(u), where u is the number of distinct test IDs seen in the logs.
Hints
- Track, for each test, the first timestamp of its current consecutive fail streak.
- Only when a pass closes an active fail streak do you get a candidate recovery time.
Part 2: Longest Window with At Least K Failing Tests
Constraints
- 0 <= len(logs) <= 200000
- 1 <= min_tests <= 200000
- 1 <= test_id <= 10^9
- 1 <= timestamp <= 10^9
- Timestamps in logs are strictly increasing
- status is always 0 or 1
Examples
Input: ([[1, 1, 0], [2, 2, 0], [1, 5, 1], [3, 6, 0], [2, 8, 1], [3, 10, 1]], 2)
Expected Output: [2, 5]
Explanation: At least 2 tests are failing from timestamp 2 until the log at timestamp 5 drops the count below 2. Another valid window is [6, 8], but it is shorter.
Input: ([[1, 2, 0], [2, 4, 0], [1, 6, 0]], 2)
Expected Output: [4, 6]
Explanation: The threshold is first reached at timestamp 4 and remains satisfied through the final log, so the window ends at timestamp 6.
Input: ([[1, 1, 0], [2, 2, 0], [1, 4, 1], [3, 5, 0], [2, 7, 1], [3, 8, 1]], 2)
Expected Output: [2, 4]
Explanation: There are two windows of equal length: [2, 4] and [5, 7]. The earlier one is returned.
Input: ([[1, 1, 0], [1, 2, 1], [2, 3, 0]], 2)
Expected Output: []
Explanation: The number of failing tests never reaches 2.
Input: ([], 1)
Expected Output: []
Explanation: With no logs, no valid window exists.
Solution
def solution(logs, min_tests):
if min_tests <= 0:
return []
failing = set()
active_start = None
best_start = None
best_end = None
last_timestamp = None
for test_id, timestamp, status in logs:
last_timestamp = timestamp
prev_count = len(failing)
if status == 0:
failing.add(test_id)
else:
failing.discard(test_id)
curr_count = len(failing)
if prev_count < min_tests and curr_count >= min_tests:
active_start = timestamp
elif prev_count >= min_tests and curr_count < min_tests:
start = active_start
end = timestamp
if best_start is None or end - start > best_end - best_start or (
end - start == best_end - best_start and start < best_start
):
best_start, best_end = start, end
active_start = None
if active_start is not None and last_timestamp is not None:
start = active_start
end = last_timestamp
if best_start is None or end - start > best_end - best_start or (
end - start == best_end - best_start and start < best_start
):
best_start, best_end = start, end
return [] if best_start is None else [best_start, best_end]
Time complexity: O(n), where n is the number of logs. Space complexity: O(u), where u is the number of distinct test IDs that are currently tracked.
Hints
- Maintain the current set of failing tests while you scan the logs from left to right.
- A candidate window only starts when the failing count crosses the threshold upward, and only ends when it crosses downward.