Compute party time blocks per neighborhood
Company: Scale AI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to perform interval merging and temporal aggregation across joined datasets, testing algorithmic efficiency, data-processing skills, and handling of sorting and grouping in the Coding & Algorithms domain.
Constraints
- 0 <= len(parties) == len(party_locations) <= 100000
- Each `party_id` appears exactly once in `parties` and exactly once in `party_locations`
- `start_timestamp < end_timestamp` for every party
- Timestamps are integers and may be negative
Examples
Input: ([(1, 1, 5), (2, 3, 7), (3, 7, 10), (4, 12, 15), (5, 2, 4)], [(1, 'Downtown', 'Austin', 'TX'), (2, 'Downtown', 'Austin', 'TX'), (3, 'Downtown', 'Austin', 'TX'), (4, 'Downtown', 'Austin', 'TX'), (5, 'Midtown', 'Austin', 'TX')])
Expected Output: [{'neighborhood': 'Downtown', 'city': 'Austin', 'state': 'TX', 'block_start_timestamp': 1, 'block_end_timestamp': 10}, {'neighborhood': 'Downtown', 'city': 'Austin', 'state': 'TX', 'block_start_timestamp': 12, 'block_end_timestamp': 15}, {'neighborhood': 'Midtown', 'city': 'Austin', 'state': 'TX', 'block_start_timestamp': 2, 'block_end_timestamp': 4}]
Explanation: In Downtown, intervals [1,5], [3,7], and [7,10] overlap or touch, so they merge into [1,10]. The interval [12,15] is separate. Midtown has only one interval [2,4].
Input: ([(10, 0, 2), (11, 2, 3), (12, 5, 6), (13, 1, 4)], [(10, 'Soho', 'New York', 'NY'), (11, 'Soho', 'New York', 'NY'), (12, 'Soho', 'Albany', 'NY'), (13, 'Mission', 'San Francisco', 'CA')])
Expected Output: [{'neighborhood': 'Mission', 'city': 'San Francisco', 'state': 'CA', 'block_start_timestamp': 1, 'block_end_timestamp': 4}, {'neighborhood': 'Soho', 'city': 'Albany', 'state': 'NY', 'block_start_timestamp': 5, 'block_end_timestamp': 6}, {'neighborhood': 'Soho', 'city': 'New York', 'state': 'NY', 'block_start_timestamp': 0, 'block_end_timestamp': 3}]
Explanation: The two Soho parties in New York touch at time 2, so they merge into [0,3]. Soho in Albany is a different location, so it stays separate. Results are sorted by state, city, neighborhood, then block start.
Input: ([], [])
Expected Output: []
Explanation: With no parties, there are no blocks.
Input: ([('a', -5, -1), ('b', -1, 2), ('c', 10, 12), ('d', 11, 13)], [('a', 'Old Town', 'Denver', 'CO'), ('b', 'Old Town', 'Denver', 'CO'), ('c', 'Old Town', 'Denver', 'CO'), ('d', 'Old Town', 'Denver', 'CO')])
Expected Output: [{'neighborhood': 'Old Town', 'city': 'Denver', 'state': 'CO', 'block_start_timestamp': -5, 'block_end_timestamp': 2}, {'neighborhood': 'Old Town', 'city': 'Denver', 'state': 'CO', 'block_start_timestamp': 10, 'block_end_timestamp': 13}]
Explanation: The first two intervals touch at -1, so they merge into [-5,2]. The last two overlap and merge into [10,13].
Input: ([(42, 100, 101)], [(42, 'Center', 'Boston', 'MA')])
Expected Output: [{'neighborhood': 'Center', 'city': 'Boston', 'state': 'MA', 'block_start_timestamp': 100, 'block_end_timestamp': 101}]
Explanation: A single party forms a single block.
Solution
def solution(parties, party_locations):
from collections import defaultdict
location_by_party = {}
for party_id, neighborhood, city, state in party_locations:
location_by_party[party_id] = (state, city, neighborhood)
intervals_by_group = defaultdict(list)
for party_id, start_timestamp, end_timestamp in parties:
state, city, neighborhood = location_by_party[party_id]
intervals_by_group[(state, city, neighborhood)].append((start_timestamp, end_timestamp))
result = []
for state, city, neighborhood in sorted(intervals_by_group.keys()):
intervals = sorted(intervals_by_group[(state, city, neighborhood)], key=lambda x: (x[0], x[1]))
current_start = None
current_end = None
for start_timestamp, end_timestamp in intervals:
if current_start is None:
current_start = start_timestamp
current_end = end_timestamp
elif start_timestamp <= current_end:
if end_timestamp > current_end:
current_end = end_timestamp
else:
result.append({
'neighborhood': neighborhood,
'city': city,
'state': state,
'block_start_timestamp': current_start,
'block_end_timestamp': current_end
})
current_start = start_timestamp
current_end = end_timestamp
if current_start is not None:
result.append({
'neighborhood': neighborhood,
'city': city,
'state': state,
'block_start_timestamp': current_start,
'block_end_timestamp': current_end
})
return resultTime complexity: O(n log n). Space complexity: O(n).
Hints
- First join the two datasets by `party_id`, then group intervals by the exact location `(state, city, neighborhood)`.
- For each location, sort intervals by start time and merge while the next interval starts at or before the current merged end.