1) Given a list of closed-open time intervals [start, end) representing meetings on a single calendar, compute the minimum number of rooms required so that no meetings in the same room overlap. Implement an algorithm, justify its time and space complexity, and discuss how you would handle very large inputs.
2) Given the root of a binary tree, determine whether it is a complete binary tree (all levels filled except possibly the last, which is filled from left to right). Provide an algorithm, analyze complexity, and describe edge cases and test cases.
Quick Answer: This question evaluates understanding of interval overlap and resource allocation for meeting-room scheduling alongside recognition of binary tree structure and completeness properties.
Solution
# Solution
Two independent problems. Each is solved below with a reference implementation, a complexity argument, edge cases, and a scaling discussion.
---
## Part 1 — Minimum Meeting Rooms
### Key insight
You do **not** need to assign meetings to rooms. The minimum number of rooms equals the **maximum number of meetings that are active at the same instant** (peak concurrency). If at some moment $k$ meetings overlap, you need at least $k$ rooms; conversely, $k = \max$ concurrency rooms always suffice, because at any instant no more than $k$ meetings compete for a room.
Because intervals are **half-open** `[start, end)`, a meeting ending exactly at time `t` frees its room for a meeting starting at `t` — they do not conflict. This drives the tie-break: at a shared timestamp, process **ends before starts**.
### Algorithm 1 — Sweep line (event sort)
Turn each interval into two events, sort, and sweep a running counter.
```python
def min_meeting_rooms(intervals):
# Each interval -> (time, delta). End events use delta -1, start events +1.
# Tie-break: process ends (-1) before starts (+1) at the same time, because
# [start, end) means a meeting ending at t does NOT collide with one starting at t.
events = []
for start, end in intervals:
events.append((start, 1))
events.append((end, -1))
# Sort by time, then by delta ascending so -1 (end) comes before +1 (start).
events.sort(key=lambda e: (e[0], e[1]))
rooms = 0
peak = 0
for _, delta in events:
rooms += delta
if rooms > peak:
peak = rooms
return peak
```
### Algorithm 2 — Two pointers over sorted starts/ends
An alternative with slightly less allocation. **Precondition: every interval must satisfy `start < end` (strictly positive length).** Zero-length intervals `[t, t)` violate this precondition and cause an `IndexError` in the loop below (explained in the edge-case section). If zero-length inputs are possible, either filter them out beforehand or use Algorithm 1 instead.
```python
def min_meeting_rooms_two_pointer(intervals):
if not intervals:
return 0
# Filter or reject zero-length intervals before calling this function.
# They contribute nothing to concurrency, but a [t, t) entry makes starts[i] == ends[j]
# trigger the else branch (j += 1) and can advance j past the end of the array.
starts = sorted(s for s, _ in intervals)
ends = sorted(e for _, e in intervals)
n = len(intervals)
rooms = peak = 0
i = j = 0
while i < n:
if starts[i] < ends[j]: # strict '<' encodes the half-open [start, end) rule:
rooms += 1 # a start at t does NOT need a room freed by an end at t
peak = max(peak, rooms)
i += 1
else:
rooms -= 1 # an end at or before this start frees a room
j += 1
return peak
```
> Tie-break check: if a meeting `[10, 20)` ends at 20 and another `[20, 30)` starts at 20, with half-open semantics they share a room. In Algorithm 2, `starts[i] = 20` is **not** `< ends[j] = 20`, so we take the `else` branch and free the room first — correct. If the intervals were *closed* `[start, end]`, you would change the comparison to `<=` (and the sweep tie-break to ends-after-starts) so equal endpoints conflict.
>
> **Algorithm 1 vs. Algorithm 2:** Algorithm 1 (sweep) handles zero-length intervals correctly (the `+1` and `-1` at the same time net to zero after the ends-before-starts tie-break). Algorithm 2 does **not** handle them safely and requires the `start < end` precondition. For input that might contain zero-length meetings, prefer Algorithm 1 or filter zero-length intervals before calling Algorithm 2.
### Complexity
- **Time:** $O(n \log n)$, dominated by the sort. The sweep / two-pointer pass is $O(n)$.
- **Space:** $O(n)$ for the event array (or two arrays of starts/ends). The running counter itself is $O(1)$.
### Edge cases
- Empty input → `0` rooms.
- A single meeting → `1`.
- Identical intervals (all `[0, 10)`) → equals the count (all overlap).
- Back-to-back meetings `[0,5), [5,10), [10,15)` → `1` room (half-open, no conflict at shared endpoints).
- Zero-length meeting `[5, 5)`: under half-open semantics it occupies no time; the `+1` and `-1` at `t=5` net to zero with the ends-before-starts tie-break, so it adds no room. Confirm this is the desired behavior with the interviewer.
### Handling very large inputs
- **Tens of millions in memory:** the $O(n \log n)$ sort still works; use primitive arrays / NumPy to cut constant factors and memory. Sorting two `int64` arrays of starts and ends is cache-friendly.
- **Bounded integer time domain** (e.g. minutes in a day, or timestamps within a known range $[0, M)$): skip comparison sorting entirely. Build a **difference array** `diff` of size $M+1$, do `diff[start] += 1; diff[end] -= 1` for each interval, then take a prefix sum and return its maximum. This is $O(n + M)$ time and $O(M)$ space — linear when $M$ is comparable to or smaller than $n$.
- **Does not fit in memory / streaming:** generate the `2n` events, **external merge sort** them to disk (chunk → sort → k-way merge), then run a single streaming sweep that keeps only the running counter and the max — $O(1)$ additional memory after sorting. Because the sweep is associative, it also parallelizes: partition the sorted timeline into ranges, compute per-range `(net_delta, internal_peak, running_base)`, and combine — a standard MapReduce-style scan.
- **Distributed:** the difference-array idea shards naturally by time bucket; each worker sums deltas for its bucket range and you carry a prefix across buckets to recover global concurrency.
---
## Part 2 — Is It a Complete Binary Tree?
### Definition (and what it is *not*)
A binary tree is **complete** when every level except possibly the last is completely filled, and all nodes in the last level are as far left as possible (no gaps). Distinguish from:
- **Full** (a.k.a. proper/strict): every node has 0 or 2 children — says nothing about levels.
- **Perfect:** every level, including the last, is completely filled — a stricter condition than complete.
So *perfect ⟹ complete*, but *complete ⇏ full*.
### Algorithm — BFS with a null sentinel
Do a level-order traversal enqueuing **null** children too. Once you dequeue the first `null`, you have hit the structural "end"; from then on, every dequeued node must also be `null`. If you ever dequeue a real node after seeing a `null`, there is a gap → not complete.
```python
from collections import deque
def is_complete_tree(root):
if root is None:
return True # empty tree is trivially complete
q = deque([root])
seen_null = False
while q:
node = q.popleft()
if node is None:
seen_null = True # found the boundary; nothing real may follow
else:
if seen_null:
return False # a real node after a gap -> not complete
q.append(node.left) # enqueue children including Nones
q.append(node.right)
return True
```
### Why it is correct
In a complete tree, the BFS order visits all real nodes contiguously, followed only by `null`s (the missing slots of the last level and the implicit children of the deepest nodes). The first `null` marks the end of the filled prefix; any real node afterward means the last level (or an earlier one) has a hole that is not at the right end — violating "filled left to right."
### Alternative — heap-index method
Number nodes by the heap index their position implies: root = `1`, and for a node at index `i`, left child = `2i`, right child = `2i+1`. Track the count of nodes `n` and the **maximum index** seen. The tree is complete **iff** `max_index == n` (no index gaps).
```python
def is_complete_tree_index(root):
if root is None:
return True
stack = [(root, 1)]
n = 0
max_index = 0
while stack:
node, idx = stack.pop()
n += 1
if idx > max_index:
max_index = idx
if node.left:
stack.append((node.left, 2 * idx))
if node.right:
stack.append((node.right, 2 * idx + 1))
return max_index == n
```
> Caveat: for a deep, sparse tree the index `2i+1` grows exponentially with depth and can overflow fixed-width integers (in Python it just grows, but in C++/Java guard against `2*idx+1` exceeding `long`). The BFS method has no such risk, so it is the preferred default; mention the index method as the elegant alternative.
### Complexity
- **Time:** $O(n)$ — each node (and each `null` child slot, of which there are $O(n)$) is enqueued/dequeued once.
- **Space:** $O(w)$ where $w$ is the maximum width of the tree (the largest queue size, which is the widest level). For a balanced tree $w = O(n)$ at the bottom level; for a skewed (path-like) tree $w = O(1)$. Worst case $O(n)$.
### Edge cases & test cases
| Case | Tree | Expected |
|------|------|----------|
| Empty | `root = None` | `True` |
| Single node | `1` | `True` |
| Perfect tree | `1; 2,3; 4,5,6,7` | `True` |
| Last level filled left-to-right | `1; 2,3; 4,5` (4,5 are 2's children) | `True` |
| Gap in last level | `1; 2,3; 4,_,6` (2 has only left, 3 has a right) | `False` |
| Missing right child only at the end | `1; 2,3; 4,5,6` | `True` |
| Right child present, left missing | `1; 2,3; _,5` (2 has only a right child) | `False` |
| Left-skewed chain | `1 -> left 2 -> left 3` | `False` (root has a left child but no right child, yet the left subtree descends further — a gap on the right at level 1 with deeper nodes below violates completeness) |
| Right-skewed chain | `1 -> right 2` | `False` (left of root missing while right exists) |
Key discriminators to test: a missing **right** child where everything to its left is filled (still complete) versus a present right child with a missing **left** sibling (not complete), and a node appearing after the first gap (not complete).
---
## Follow-up answers
- **Part 1 — actual room assignment:** sort meetings by start; keep a **min-heap of room end-times**. For each meeting, if the earliest-freeing room's end `<= start`, pop it (reuse that room id); otherwise allocate a new room. Push the meeting's end (and chosen room id). This yields a concrete meeting→room mapping in $O(n \log n)$ time and $O(k)$ heap space where $k$ is the room count — same asymptotic cost as counting, with a small constant-factor increase.
- **Part 1 — many "overlap at time $t$" queries:** precompute the sorted event array with a prefix-sum of deltas; each query is `count = prefix_sum_up_to(t)`, answered in $O(\log n)$ via binary search after $O(n \log n)$ preprocessing.
- **Part 2 — sublinear completeness check:** in a complete tree, the leftmost path gives the height $h$, and the last level is filled left-to-right. Binary-search the index of the last present node on level $h$ by descending from the root (each step is an $O(h)$ existence check), giving $O(\log^2 n)$ work to compute the node count; then verify the count matches the perfect-prefix-plus-last-level constraint. This is the same trick used to count nodes of a complete tree in $O(\log^2 n)$.
- **Part 2 — serialized array form:** if the tree is given as a level-order array with explicit `null` markers, completeness reduces to checking that all non-null entries form a **contiguous prefix** with no `null` appearing before a non-null — a single $O(n)$ scan, no traversal needed.