Parse strings and find meeting slots
Company: Google
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
##### Question
Given a text file in which each line contains raw, unstructured text, extract the user identifier and the numeric quantity from each line and store the results efficiently (for example, in a hash map keyed by user).
Given multiple people's calendars (each as a list of available days), return the days when everyone is available for a meeting. Follow-ups:
(a) return the days when at least P people are available;
(b) given an integer X, return time periods of consecutive days of length X when the availability condition holds.
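For the base version of the calendar question (everyone must be available), one possible approach is a plain set intersection. The sketch below is illustrative only; the function name `common_days` and the dictionary-shaped input are assumptions, not part of the original question.

```python
from typing import Dict, List


def common_days(calendars: Dict[str, List[int]]) -> List[int]:
    """Return the sorted days on which every person is available.

    `calendars` maps a person's name to their list of available days
    (hypothetical input shape chosen for this sketch).
    """
    # Converting each list to a set drops duplicate days per person.
    day_sets = [set(days) for days in calendars.values()]
    if not day_sets:
        return []
    # Days present in every person's set.
    return sorted(set.intersection(*day_sets))
```

Follow-up (a) generalizes this: instead of intersecting, count how many people list each day and keep the days whose count reaches P.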
Quick Answer: This question evaluates string parsing and data-extraction skills alongside set/array-based aggregation and algorithmic reasoning for calendar availability and time-window constraints.
You are given two inputs: (1) lines: a list of strings, each containing exactly one user identifier and exactly one signed integer; (2) calendars: a dictionary mapping user names to a list of integers representing the days that user is available.
A user identifier in a line matches the pattern '@' followed by one or more characters from [A-Za-z0-9_]. Extract the user name (without the leading '@') and the integer from each line and compute the sum of integers per user.
For the calendars, compute the set of days on which at least p users are available (duplicates within a user's list count once). Then, given an integer x, return all contiguous periods [start, end] of exact length x such that every day in [start, end] is in that set of days.
Return a dictionary with:
- user_sums: mapping user -> sum
- days: sorted list of days with at least p users
- periods: sorted list of [start, end] segments of length x where the availability condition holds
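The parsing half of the task could be sketched as follows. This is a minimal illustration, not a reference solution: the names `sum_per_user`, `USER_RE`, and `INT_RE` are hypothetical. Because a user name may itself contain digits (for example `@bob_2`), the sketch blanks out the identifier before searching for the integer. It also assumes the sign, if present, sits directly in front of the digits.

```python
import re
from collections import defaultdict
from typing import Dict, List

# '@' followed by one or more of [A-Za-z0-9_]; group 1 is the bare user name.
USER_RE = re.compile(r"@([A-Za-z0-9_]+)")
# Optional sign immediately followed by digits (assumption of this sketch).
INT_RE = re.compile(r"[+-]?\d+")


def sum_per_user(lines: List[str]) -> Dict[str, int]:
    """Sum the integer found on each line, grouped by user name."""
    sums: Dict[str, int] = defaultdict(int)
    for line in lines:
        user = USER_RE.search(line)
        if user is None:
            continue  # malformed line; the stated constraints rule this out
        # Cut the identifier out so digits inside the name are not
        # mistaken for the quantity.
        rest = line[:user.start()] + " " + line[user.end():]
        number = INT_RE.search(rest)
        if number is None:
            continue  # malformed line; the stated constraints rule this out
        sums[user.group(1)] += int(number.group())
    return dict(sums)
```

Each line is scanned a constant number of times and each dictionary update is O(1) on average, so the total cost is linear in the combined length of the input lines.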
Constraints
- 1 <= len(lines) <= 200000
- Each line contains exactly one user identifier of the form '@[A-Za-z0-9_]+' and exactly one signed integer (e.g., -7, +3, 42)
- 1 <= len(calendars) <= 100000
- Let T be the total number of day entries across all users in calendars; 0 <= T <= 200000
- Day values are integers in [1, 10^9]
- Duplicates in a single user's day list are allowed but count once for that user
- 1 <= p <= len(calendars)
- 1 <= x <= 10^6
- Output 'days' must be sorted ascending without duplicates
- Output 'periods' must be sorted by start ascending; each period is [start, end] inclusive with end = start + x - 1
Hints
- Use a regular expression to extract '@username' and the (signed) integer from each line; strip the leading '@' from the username.
- Aggregate per-user sums in a dictionary for O(1) average updates.
- Convert each user's availability list to a set to remove duplicates before counting days.
- Count availability per day across users, then collect days with count >= p and sort them.
- Scan the sorted valid days to find consecutive runs and emit all length-x windows within each run.
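The last three hints (deduplicate, count, scan for runs) can be sketched as below. The function names `days_with_at_least` and `windows_of_length` are hypothetical, and the sketch follows the hint's reading that every length-x window inside a run is emitted, so overlapping windows are included.

```python
from collections import Counter
from typing import Dict, List


def days_with_at_least(calendars: Dict[str, List[int]], p: int) -> List[int]:
    """Sorted days on which at least `p` users are available."""
    counts: Counter = Counter()
    for days in calendars.values():
        # set() makes duplicate days in one user's list count once.
        counts.update(set(days))
    return sorted(day for day, c in counts.items() if c >= p)


def windows_of_length(days: List[int], x: int) -> List[List[int]]:
    """All [start, end] windows of exactly `x` consecutive days.

    `days` must be sorted ascending without duplicates.
    """
    periods: List[List[int]] = []
    run_start = 0  # index where the current run of consecutive days begins
    n = len(days)
    for i in range(n):
        # A run ends at the last element or where the next day is not adjacent.
        if i + 1 == n or days[i + 1] != days[i] + 1:
            first, last = days[run_start], days[i]
            # A run of length L holds L - x + 1 windows (none if L < x).
            for start in range(first, last - x + 2):
                periods.append([start, start + x - 1])
            run_start = i + 1
    return periods
```

A short usage example under the same assumptions:

```python
calendars = {"a": [1, 2, 3, 5, 5], "b": [2, 3, 4, 5], "c": [1, 2, 3, 4]}
days = days_with_at_least(calendars, 2)   # [1, 2, 3, 4, 5]
periods = windows_of_length(days, 3)      # [[1, 3], [2, 4], [3, 5]]
```

Counting touches each day entry once and sorting dominates the rest, so the cost is O(T log T) for T total day entries. Because runs are visited in ascending order, the periods come out already sorted by start.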