Implement Disease and Friend Snapshot Models
Company: OpenAI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Onsite
You are asked to solve two independent coding problems.
### Problem 1: Simulate disease spread on a contact graph
There are `n` people labeled `0` through `n - 1`. An undirected edge between two people means the disease can spread between them. Time advances in whole days, and all state transitions for a day are based on the state at the start of that day unless otherwise stated.
Implement the following follow-ups:
1. `daysToInfectAll(n, edges, initiallyInfected)`: Initially, every person is susceptible except the initially infected people. Each day, every infected person infects all susceptible neighbors. Return the minimum number of days until everyone is infected. Return `-1` if some people can never be infected.
2. `daysToFinishWithImmune(n, edges, initiallyInfected, initiallyImmune)`: Some people are initially immune. Immune people cannot be infected and do not transmit the disease. Return how many days it takes until no new infection can occur. Also be prepared to report which susceptible people remain uninfected.
3. `daysToStableAfterRecovery(n, edges, initiallyInfected, initiallyImmune, recoveryDays)`: An infected person remains contagious for `recoveryDays` days after becoming infected, then becomes immune at the end of that day. Return the first day when the system is stable, meaning no infected people remain and no future infections are possible.
4. Extend the model with a `Dead` state. The input provides either a per-person terminal outcome or a deterministic predicate such as `willDie(person)`, plus the number of days before the terminal transition. Dead people cannot be infected and cannot transmit. Return the day the system becomes stable and the final count of susceptible, infected, immune, and dead people.
Clarify edge cases such as an empty initial infection set, disconnected components, simultaneous infection and recovery, and whether recovery or death happens before or after that day’s transmission.
### Problem 2: Implement a friend-circle snapshot data structure
Implement a class `FriendCircle` supporting directed follow relationships between users.
Required operations:
1. `follow(userId, targetId)`: `userId` follows `targetId`.
2. `unfollow(userId, targetId)`: `userId` stops following `targetId` if the relationship exists.
3. `snapshot()`: Capture an immutable snapshot of the current follow graph and return a snapshot id.
4. `getFriends(snapshotId, userId)`: Given a snapshot id, return all users followed by `userId` at that snapshot.
5. `recommendFriends(snapshotId, userId, k)`: Given a snapshot id, recommend up to `k` users that `userId` does not already follow. A reasonable ranking rule is to recommend two-hop candidates: if `userId` follows `a` and `a` follows `x`, then `x` is a candidate. Rank candidates by the number of mutual intermediate users, then by user id or another deterministic tie-breaker.
6. `diffFriends(snapshotIdA, snapshotIdB, userId)`: Compare two snapshots and return the users added to and removed from `userId`’s follow list.
Be prepared to discuss API design, whether snapshot-specific logic belongs in `FriendCircle` or a separate `Snapshot` abstraction, time and space complexity, and how to avoid copying the entire graph on every snapshot.
Quick Answer: This question evaluates skills in graph-based modeling and temporal state simulation as well as the design and implementation of snapshot-capable directed graph data structures, covering competencies in propagation and state transitions (including immunity, recovery, and death), efficient snapshotting, querying, ranking, and diffing of follow relationships. It is commonly asked to assess algorithmic reasoning, correctness under edge cases, and complexity-aware implementation; the problem belongs to the Coding & Algorithms domain and tests both conceptual understanding of dynamic graph behavior and practical application-level implementation of data structures and immutable snapshots.
Part 1: Disease spread - daysToInfectAll
There are `n` people labeled `0` to `n - 1` and an undirected contact graph given by `edges`. Initially, only the people in `initiallyInfected` are infected; everyone else is susceptible. Infected people never recover in this version. Day 0 is the initial state. On each day, every person who is infected at the start of that day infects all susceptible neighbors. People infected during a day can start infecting others on the next day. Return the minimum number of days needed until everyone is infected, or `-1` if some people can never be infected. If `n == 0`, return `0`.
Constraints
- `0 <= n <= 2 * 10^5`
- `0 <= len(edges) <= 2 * 10^5`
- `0 <= u, v < n` for every edge
- `0 <= person < n` for every initially infected person
- The graph may be disconnected, and duplicate edges may appear
Examples
Input: (5, [(0, 1), (1, 2), (2, 3), (3, 4)], [0])
Expected Output: 4
Explanation: The infection moves one hop per day along the chain.
Input: (6, [(0, 1), (1, 2), (3, 4)], [0])
Expected Output: -1
Explanation: People 3, 4, and 5 are not reachable from the infected component.
Input: (4, [(0, 1), (1, 2), (2, 3)], [0, 3])
Expected Output: 1
Explanation: People 1 and 2 are infected on day 1, so everyone is infected after 1 day.
Input: (0, [], [])
Expected Output: 0
Explanation: No people means the system is already fully infected vacuously.
Input: (3, [(0, 1), (1, 2)], [])
Expected Output: -1
Explanation: With no initial infection and existing susceptible people, the disease never starts.
Hints
- Model the initial infected people as multiple BFS sources starting at distance 0.
- If any node is unreachable from all initially infected nodes, the answer is `-1`.
Part 2: Disease spread with immune people - daysToFinishWithImmune
This version adds immune people. There are `n` people labeled `0` to `n - 1` and an undirected contact graph given by `edges`. Initially, people in `initiallyInfected` are infected, people in `initiallyImmune` are immune, and everyone else is susceptible. Immune people cannot be infected and do not transmit the disease. Infected people never recover in this version. Day 0 is the initial state. On each day, every person who is infected at the start of that day infects all susceptible neighbors. Return both: `(1)` how many days it takes until no new infection can occur, and `(2)` the sorted list of people who started susceptible and remain uninfected forever. If a person appears in both initial lists, treat them as immune.
Constraints
- `0 <= n <= 2 * 10^5`
- `0 <= len(edges) <= 2 * 10^5`
- `0 <= u, v < n` for every edge
- `0 <= person < n` for every listed person
- The graph may be disconnected
Examples
Input: (5, [(0, 1), (1, 2), (2, 3), (3, 4)], [0], [3])
Expected Output: (2, [4])
Explanation: The infection reaches 1 on day 1 and 2 on day 2. Person 3 is immune and blocks access to 4.
Input: (4, [(0, 1), (1, 2), (2, 3)], [0], [])
Expected Output: (3, [])
Explanation: Everyone is eventually infected along the chain.
Input: (4, [(0, 1)], [], [2])
Expected Output: (0, [0, 1, 3])
Explanation: No one is infected initially, so the process is already finished on day 0.
Input: (3, [(0, 1), (1, 2)], [0], [0])
Expected Output: (0, [1, 2])
Explanation: Immunity wins for person 0, leaving no active infection source.
Hints
- Ignore immune nodes when you traverse the graph; they block spread completely.
- The day count is the maximum BFS level among people who ever become infected.
Part 3: Disease spread with recovery to immunity - daysToStableAfterRecovery
There are `n` people labeled `0` to `n - 1` and an undirected contact graph given by `edges`. Initially, people in `initiallyInfected` are infected, people in `initiallyImmune` are immune, and everyone else is susceptible. Immune people cannot be infected and do not transmit. Initially infected people are infected at the start of day 0. During each day, every person infected at the start of that day infects all susceptible neighbors; newly infected people can start infecting others on the next day. Each infected person stays infected and contagious for exactly `recoveryDays` days counted from the first day they are able to transmit, then becomes immune at the end of that last contagious day. Return the first day when the system is stable: no infected people remain and no future infections are possible. If a person appears in both initial lists, treat them as immune.
Constraints
- `0 <= n <= 2 * 10^5`
- `0 <= len(edges) <= 2 * 10^5`
- `1 <= recoveryDays <= 10^9`
- `0 <= u, v < n` for every edge
- `0 <= person < n` for every listed person
Examples
Input: (3, [(0, 1), (1, 2)], [0], [], 2)
Expected Output: 4
Explanation: Infection days are 0, 1, and 2. The last infected person recovers at the end of day 3, so stability begins on day 4.
Input: (4, [(0, 1), (1, 2), (2, 3)], [0], [2], 3)
Expected Output: 4
Explanation: Only people 0 and 1 are ever infected; person 1 is infected on day 1 and remains infected through day 3.
Input: (5, [(0, 1)], [], [], 2)
Expected Output: 0
Explanation: With no active infection source, the system is already stable.
Input: (2, [], [1], [], 1)
Expected Output: 1
Explanation: Person 1 is infected on day 0, recovers at the end of that day, and the system is stable on day 1.
Hints
- First find the day each reachable person becomes infected using multi-source BFS.
- Once you know the latest infection day, each infected person simply stays active for `recoveryDays` more days.
Part 4: Disease spread with Dead state and terminal outcomes
Extend the model with a terminal outcome for each person. There are `n` people labeled `0` to `n - 1` and an undirected contact graph given by `edges`. Initially, people in `initiallyInfected` are infected, people in `initiallyImmune` are immune, and everyone else is susceptible. Immune and dead people cannot be infected and do not transmit. Initially infected people are infected at the start of day 0. During each day, every person infected at the start of that day infects all susceptible neighbors; newly infected people can start infecting others on the next day. If person `i` is ever infected, then after exactly `terminalDays` contagious days they transition at the end of that day to the terminal outcome given by `terminalOutcomes[i]`, which is either `'immune'` or `'dead'`. Return a list in this exact order: `[stable_day, susceptible_count, infected_count, immune_count, dead_count]`, where `stable_day` is the first day when no infected people remain and no future infections are possible. If a person appears in both initial lists, treat them as immune.
Constraints
- `0 <= n <= 2 * 10^5`
- `0 <= len(edges) <= 2 * 10^5`
- `1 <= terminalDays <= 10^9`
- `len(terminalOutcomes) == n`
- Each `terminalOutcomes[i]` is either `'immune'` or `'dead'`
Examples
Input: (4, [(0, 1), (1, 2), (2, 3)], [0], [], ['immune', 'dead', 'immune', 'dead'], 2)
Expected Output: [5, 0, 0, 2, 2]
Explanation: All four people are infected eventually. People 0 and 2 end immune; 1 and 3 end dead.
Input: (5, [(0, 1), (1, 2), (2, 3), (3, 4)], [0], [2], ['immune', 'immune', 'immune', 'immune', 'immune'], 1)
Expected Output: [2, 2, 0, 3, 0]
Explanation: People 0 and 1 are infected, person 2 starts immune, and people 3 and 4 remain susceptible.
Input: (3, [(0, 1)], [], [], ['dead', 'immune', 'dead'], 3)
Expected Output: [0, 3, 0, 0, 0]
Explanation: No initial infection means no state changes at all.
Input: (4, [(0, 1), (2, 3)], [2], [0], ['immune', 'immune', 'dead', 'dead'], 1)
Expected Output: [2, 1, 0, 1, 2]
Explanation: People 2 and 3 end dead, person 0 starts immune, and person 1 stays susceptible.
Hints
- For reachability, the important question is still when each person first becomes infected; use BFS from all active initial sources.
- Once you know who gets infected and the latest infection day, terminal outcomes determine the final counts, and `terminalDays` determines when activity ends.
Part 5: FriendCircle snapshot data structure
Implement the FriendCircle snapshot system as an operation processor. Your function receives a list of operations to run against one mutable directed follow graph. Snapshots are immutable views of the graph. Snapshot ids start at `0` and increase by `1` each time a snapshot is taken. Use these operations:
- `('follow', userId, targetId)`: add a directed edge `userId -> targetId`
- `('unfollow', userId, targetId)`: remove that edge if it exists
- `('snapshot',)`: capture the current graph and return the new snapshot id
- `('get', snapshotId, userId)`: return the sorted users followed by `userId` at that snapshot
- `('recommend', snapshotId, userId, k)`: recommend up to `k` users not already followed by `userId`, using two-hop candidates; if `userId` follows `a` and `a` follows `x`, then `x` is a candidate. Rank by the number of distinct intermediate users, then by smaller user id
- `('diff', snapshotIdA, snapshotIdB, userId)`: compare `userId`'s follow list between two snapshots and return `{'added': [...], 'removed': [...]}` with sorted lists
`follow` is idempotent, `unfollow` of a missing edge is a no-op, and self-follow should be ignored. Return a list containing the outputs of only the operations that produce a value: `snapshot`, `get`, `recommend`, and `diff`, in that order.
Constraints
- `1 <= len(operations) <= 2 * 10^5`
- User ids are non-negative integers
- Snapshot ids used in queries are valid
- `k` in `recommend` may be `0`
- You should avoid copying the entire graph on every snapshot
Examples
Input: ([('follow', 1, 2), ('follow', 1, 3), ('snapshot',), ('unfollow', 1, 2), ('follow', 1, 4), ('snapshot',), ('get', 0, 1), ('get', 1, 1), ('diff', 0, 1, 1)],)
Expected Output: [0, 1, [2, 3], [3, 4], {'added': [4], 'removed': [2]}]
Explanation: Snapshot 0 stores follows {2, 3}; snapshot 1 stores {3, 4}.
Input: ([('follow', 1, 2), ('follow', 1, 3), ('follow', 2, 4), ('follow', 3, 4), ('follow', 3, 5), ('snapshot',), ('recommend', 0, 1, 3), ('recommend', 0, 2, 2)],)
Expected Output: [0, [4, 5], []]
Explanation: For user 1, candidate 4 has two supporting intermediates and 5 has one.
Input: ([('follow', 1, 1), ('unfollow', 1, 2), ('snapshot',), ('get', 0, 1), ('diff', 0, 0, 1)],)
Expected Output: [0, [], {'added': [], 'removed': []}]
Explanation: Self-follow is ignored, and unfollowing a missing edge does nothing.
Input: ([('follow', 0, 1), ('snapshot',), ('follow', 1, 2), ('snapshot',), ('follow', 0, 2), ('unfollow', 1, 2), ('snapshot',), ('get', 0, 0), ('get', 1, 1), ('get', 2, 1), ('recommend', 2, 0, 5)],)
Expected Output: [0, 1, 2, [1], [2], [], []]
Explanation: Earlier snapshots remain unchanged even after later follows and unfollows.
Hints
- Instead of cloning the whole graph on each snapshot, store per-user history only when that user's follow set changed since the previous snapshot.
- To answer recommendations, fetch the direct follow set at the snapshot, then count two-hop candidates from those users.