Parse strings and find meeting slots
Company: Google
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates string parsing and data-extraction skills alongside set/array-based aggregation and algorithmic reasoning for calendar availability and time-window constraints.
Constraints
- 1 <= len(lines) <= 200000
- Each line contains exactly one user identifier of the form '@[A-Za-z0-9_]+' and exactly one signed integer (e.g., -7, +3, 42)
- 1 <= len(calendars) <= 100000
- Let T be the total number of day entries across all users in calendars; 0 <= T <= 200000
- Day values are integers in [1, 10^9]
- Duplicates in a single user's day list are allowed but count once for that user
- 1 <= p <= len(calendars)
- 1 <= x <= 10^6
- Output 'days' must be sorted ascending without duplicates
- Output 'periods' must be sorted by start ascending; each period is [start, end] inclusive with end = start + x - 1
Solution
import re
def analyze_availability(lines, calendars, p, x):
user_sums = {}
user_re = re.compile(r"@([A-Za-z0-9_]+)")
num_re = re.compile(r"[+-]?\d+")
for line in lines:
um = user_re.search(line)
nm = num_re.search(line)
if not um or not nm:
# Per constraints, each line contains both; safeguard just in case
continue
user = um.group(1)
qty = int(nm.group(0))
user_sums[user] = user_sums.get(user, 0) + qty
# Count days with availability >= p
counts = {}
for days in calendars.values():
seen = set(days)
for d in seen:
counts[d] = counts.get(d, 0) + 1
valid_days = [d for d, c in counts.items() if c >= p]
valid_days.sort()
periods = []
if x >= 1 and valid_days:
start_idx = 0
n = len(valid_days)
for i in range(1, n + 1):
end_of_run = (i == n) or (valid_days[i] != valid_days[i - 1] + 1)
if end_of_run:
run_len = i - start_idx
if run_len >= x:
start_day = valid_days[start_idx]
for offset in range(run_len - x + 1):
s = start_day + offset
periods.append([s, s + x - 1])
if i < n:
start_idx = i
return {"user_sums": user_sums, "days": valid_days, "periods": periods}
Explanation
Time complexity: O(L + T + D log D + W), where L = number of lines, T = total day entries across calendars, D = number of distinct days with at least one available user, and W = number of output windows. Space complexity: O(U + D), where U = number of distinct users seen in lines and D = number of distinct days across calendars.
Hints
- Use a regular expression to extract '@username' and the (signed) integer from each line; strip the leading '@' from the username.
- Aggregate per-user sums in a dictionary for O(1) average updates.
- Convert each user's availability list to a set to remove duplicates before counting days.
- Count availability per day across users, then collect days with count >= p and sort them.
- Scan the sorted valid days to find consecutive runs and emit all length-x windows within each run.