Compute total active time per Twitter Space
Company: xAI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to perform event-log sessionization and time-interval aggregation across users and spaces, testing skills in algorithms, state management, and event processing within the Coding & Algorithms domain.
Part 1: Compute Total Active Time per Twitter Space
Constraints
- 0 <= len(records) <= 200000
- operation is one of 'create', 'join', or 'leave'
- 1 <= len(space_id), len(user_id) <= 100
- 0 <= timestamp <= 10^18
- records are provided in nondecreasing timestamp order
- A user may have multiple non-overlapping sessions in the same space
Examples
Input: ([['create', 'abc', 'user_1', 1234567000], ['join', 'abc', 'user_2', 1234567100], ['leave', 'abc', 'user_2', 1234567300], ['create', 'def', 'user_2', 1234568000], ['leave', 'def', 'user_2', 1234568500], ['leave', 'abc', 'user_1', 1234569000]],)
Expected Output: {'abc': 2200, 'def': 500}
Explanation: user_1 spends 2000 seconds in abc, user_2 spends 200 seconds in abc, and user_2 spends 500 seconds in def.
Input: ([],)
Expected Output: {}
Explanation: There are no records, so there are no spaces to report.
Input: ([['create', 's1', 'u1', 0], ['leave', 's1', 'u1', 10], ['join', 's1', 'u1', 20], ['leave', 's1', 'u1', 25], ['join', 's1', 'u2', 30], ['leave', 's1', 'u2', 40]],)
Expected Output: {'s1': 25}
Explanation: The completed sessions last 10, 5, and 10 seconds, for a total of 25.
Input: ([['create', 's1', 'u1', 100], ['join', 's1', 'u2', 110], ['leave', 's1', 'u1', 160], ['leave', 's1', 'u2', 180], ['create', 's2', 'u1', 200], ['leave', 's2', 'u1', 200]],)
Expected Output: {'s1': 130, 's2': 0}
Explanation: s1 has overlapping users, so their times are summed: 60 + 70 = 130. s2 has a zero-length session.
Input: ([['leave', 'ghost', 'u1', 5], ['create', 'lonely', 'u2', 10]],)
Expected Output: {'ghost': 0, 'lonely': 0}
Explanation: The unmatched leave is ignored, and the still-open create session has no closing timestamp.
Hints
- Track currently active sessions by the pair (space_id, user_id).
- When a leave event arrives, the elapsed time is leave_timestamp - stored_join_timestamp.
Part 2: Real-Time Top-K Spaces by Active User Count
Constraints
- 0 <= len(events) <= 200000
- 0 <= k <= 1000
- operation is one of 'create', 'join', or 'leave'
- 1 <= len(space_id), len(user_id) <= 100
- A user may be active in multiple different spaces at the same time
- For one user-space pair, duplicate join/create events while already active do not change the active count
Examples
Input: ([['create', 'abc', 'u1'], ['join', 'abc', 'u2'], ['create', 'def', 'u2'], ['leave', 'abc', 'u1'], ['leave', 'abc', 'u2']], 2)
Expected Output: [[['abc', 1]], [['abc', 2]], [['abc', 2], ['def', 1]], [['abc', 1], ['def', 1]], [['def', 1]]]
Explanation: The snapshots show the top 2 active spaces after each event. Ties are ordered by space_id.
Input: ([], 3)
Expected Output: []
Explanation: With no events, there are no snapshots.
Input: ([['join', 'b', 'u1'], ['join', 'a', 'u2'], ['join', 'b', 'u3'], ['leave', 'b', 'u1']], 1)
Expected Output: [[['b', 1]], [['a', 1]], [['b', 2]], [['a', 1]]]
Explanation: When a and b both have count 1, a wins the lexicographic tie-break.
Input: ([['join', 'x', 'u1'], ['join', 'x', 'u1'], ['leave', 'x', 'u2'], ['leave', 'x', 'u1'], ['leave', 'x', 'u1']], 3)
Expected Output: [[['x', 1]], [['x', 1]], [['x', 1]], [], []]
Explanation: The duplicate join and invalid leaves do not change the count.
Input: ([['join', 'a', 'u1'], ['join', 'b', 'u2']], 0)
Expected Output: [[], []]
Explanation: When k is 0, each snapshot is empty, though the stream is still processed.
Hints
- Maintain a set of active (space_id, user_id) pairs and a count per space.
- A heap can support top-k queries, but counts change over time, so use lazy deletion with version numbers to ignore stale heap entries.