Implement Calendar, Tokenizer, and Meeting Optimizer
Company: Microsoft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates competencies in interval scheduling and overlap detection, custom string tokenization and parsing rules, and combinatorial optimization using subset/bitmask dynamic programming, reflecting algorithmic problem solving, edge-case handling, and efficient data-structure use.
Part 1: Calendar Insertion Check
Constraints
- 0 <= len(events) <= 200000
- Each interval satisfies 0 <= start <= end <= 10^9
- Existing events are pairwise non-overlapping
- The events list may be unsorted
- A zero-length event [t, t) is allowed
Examples
Input: ([], [2, 5])
Expected Output: True
Explanation: An empty calendar can always accept the new event.
Input: ([[1, 3], [5, 7]], [3, 5])
Expected Output: True
Explanation: The new event only touches existing events at endpoints, which is allowed for half-open intervals.
Input: ([[1, 4], [6, 8]], [2, 5])
Expected Output: False
Explanation: The interval [2, 5) overlaps with [1, 4).
Input: ([[10, 12], [1, 3], [5, 7]], [7, 10])
Expected Output: True
Explanation: Even though the existing events are unsorted, [7, 10) fits exactly between [5, 7) and [10, 12).
Input: ([[2, 4]], [4, 4])
Expected Output: True
Explanation: A zero-length half-open interval is empty and does not overlap.
Hints
- Two half-open intervals [a, b) and [c, d) overlap exactly when a < d and c < b.
- You do not need to merge or sort the intervals to answer this question; comparing the new event against each existing event is enough.
Part 2: Mixed-Delimiter String Tokenizer
Constraints
- 0 <= len(s) <= 100000
- s contains only lowercase letters, spaces, and '*'
Examples
Input: 'a bc d'
Expected Output: ['a', 'bc', 'd']
Explanation: There is no '*', so split on runs of spaces.
Input: 'a*a aa*bb'
Expected Output: ['a', 'a aa', 'bb']
Explanation: Because '*' exists, split on '*'. Internal spaces inside 'a aa' are preserved.
Input: 'a *bd**c'
Expected Output: ['a', 'bd', 'c']
Explanation: The empty segment created by '**' is discarded, and surrounding spaces are trimmed.
Input: ''
Expected Output: []
Explanation: An empty string produces no tokens.
Input: ' * ** x * '
Expected Output: ['x']
Explanation: After splitting on '*', trimming removes surrounding spaces and all empty segments are ignored.
Hints
- For the no-'*' case, Python's split() with no argument already handles runs of spaces and ignores empties.
- For the '*' case, first split on '*', then trim each part and discard the empty ones.
Part 3: Investor Meeting Optimization
Constraints
- 0 <= len(availability) <= 100000
- 0 <= k <= 20
- The total number of distinct day labels across all investors is at most 20
- Day labels are integers and do not need to be consecutive
- availability[i] may be empty
Examples
Input: ([[1, 2], [2], [3]], 1)
Expected Output: 2
Explanation: Choosing day 2 meets the first two investors.
Input: ([[1], [2], [1, 2], [3]], 2)
Expected Output: 3
Explanation: Choosing days 1 and 2 meets investors 0, 1, and 2.
Input: ([[1, 3], [2], [3], []], 0)
Expected Output: 0
Explanation: With zero chosen days, no investors can be met.
Input: ([], 3)
Expected Output: 0
Explanation: There are no investors to meet.
Input: ([[1, 2], [2, 3], [3, 4], [1, 4], [2], [4]], 2)
Expected Output: 6
Explanation: Choosing days 2 and 4 covers every investor.
Hints
- Map each distinct day to a bit position, and convert each investor's availability into a bitmask.
- For a chosen set of days S, it can be easier to count investors not covered by S: those whose availability is entirely contained in the unchosen days.