Implement follow graph with snapshots and recommendations
Company: OpenAI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Design and implement an in-memory “social network” component that supports following/unfollowing, point-in-time (snapshot) queries, and a simple recommendation API.
You are given users identified by integers `1..N` (assume `N` can be large). Implement the following operations:
1. `follow(u, v)`: user `u` starts following user `v`.
2. `unfollow(u, v)`: user `u` stops following user `v`.
3. `snap() -> sid`: creates a snapshot of the current follow graph and returns a monotonically increasing snapshot id `sid`.
4. `isFollowing(u, v, sid) -> bool`: returns whether `u` was following `v` at the time snapshot `sid` was taken.
5. `recommend(u, k) -> list[v]`: recommend up to `k` users that `u` does **not** currently follow (and excluding `u`). Rank candidates by the number of users that `u` currently follows who also follow the candidate (i.e., “friends-of-friends” count). Break ties by smaller user id.
Clarifications/assumptions you should state and handle:
- Repeated `follow(u, v)` when already following should be a no-op; similarly `unfollow` when not following.
- Snapshots are immutable; `isFollowing` must reflect state at that snapshot.
- Analyze time and space complexity of your approach, especially for large numbers of operations and snapshots.
Quick Answer: This question evaluates design and implementation skills for mutable graph data structures with point-in-time snapshotting and recommendation ranking, testing knowledge of data structures, versioning, graph traversal, and algorithmic time/space complexity analysis.
Part 1: Basic Follow/Unfollow Graph
Implement a simple **directed follow graph** for a small social network and answer membership queries against it.
There are `n` users labeled `0` to `n - 1`. You are given a list of `operations`, each a tuple `(kind, a, b)` where `a` and `b` are user ids. Process the operations **in order**, maintaining for every user the set of users they currently follow (an edge `a → b` means "`a` follows `b`").
## Function
```python
def solution(n, operations):
...
```
- **`n`** — the number of users.
- **`operations`** — a list of tuples; each is one of the three kinds below.
- **Return** — a list of booleans, one for each `check` operation, in the order those `check` operations appear.
## Operations
- **`('follow', a, b)`** — user `a` starts following user `b` (add the edge `a → b`).
- **`('unfollow', a, b)`** — user `a` stops following user `b` (remove the edge `a → b`).
- **`('check', a, b)`** — append `True` if user `a` is **currently** following user `b`, otherwise `False`.
## Rules
- **No self-follow.** A user cannot follow themself. A `follow` with `a == b` is **ignored** (no edge is created). Consequently, a `check` with `a == b` always returns `False`.
- **Idempotent follow.** Repeating `follow` on a relation that already exists does nothing.
- **No-op unfollow.** `unfollow` on a relation that does not exist does nothing.
## Output
Return the answers to all `check` operations, in the order they were processed. If there are no `check` operations, return an empty list.
Constraints
- 0 <= n <= 10^5
- 0 <= len(operations) <= 2 * 10^5
- All user ids in operations are integers in the range [0, n - 1]
- Average O(1) set operations are acceptable
Examples
Input: (4, [('follow', 0, 1), ('check', 0, 1), ('unfollow', 0, 1), ('check', 0, 1), ('follow', 0, 0), ('check', 0, 0)])
Expected Output: [True, False, False]
Explanation: User 0 first follows 1, then unfollows 1. Self-follow is ignored, so checking (0, 0) is False.
Input: (3, [('follow', 1, 2), ('follow', 1, 2), ('check', 1, 2), ('unfollow', 1, 2), ('unfollow', 1, 2), ('check', 1, 2)])
Expected Output: [True, False]
Explanation: Repeated follow and unfollow operations should not break the state.
Input: (2, [])
Expected Output: []
Explanation: Edge case: no operations means no check results.
Input: (5, [('follow', 0, 1), ('follow', 0, 2), ('follow', 2, 1), ('check', 0, 2), ('check', 2, 0), ('unfollow', 0, 2), ('check', 0, 2)])
Expected Output: [True, False, False]
Explanation: Checks should always reflect the current graph after each update.
Hints
- A hash set for each user is enough to represent who they currently follow.
- Make `follow` and `unfollow` idempotent so repeated operations do not cause errors.
Part 2: Follow Graph with Snapshots and Historical Queries
Maintain a **directed follow graph** that supports **immutable snapshots** and **historical queries** against those snapshots.
There are `n` users labeled from `0` to `n - 1`. You are given a list of `operations` to process **in order**. Each operation is a tuple (or list) whose first element is a string naming the operation:
## Operations
- `('follow', a, b)` — user `a` starts following user `b`. The relationship is **directed**: `a` following `b` does **not** mean `b` follows `a`.
- `('unfollow', a, b)` — user `a` stops following user `b`.
- `('snapshot',)` — save the current state of the follow graph and **return its snapshot id**.
- `('check', snap_id, a, b)` — ask whether user `a` was following user `b` **in the saved state identified by** `snap_id`. Returns a boolean.
## Rules
- **Snapshot ids** start at `0` and increase by `1` each time `snapshot` is called (the first snapshot gets id `0`, the second `1`, and so on).
- A `snapshot` captures every `follow`/`unfollow` applied **before** it is taken. Snapshots are **immutable**: any `follow`/`unfollow` performed *after* a snapshot leaves that older snapshot's state unchanged.
- **No self-follows.** A `follow` where `a == b` is **ignored** (no state change). Likewise, a `check` where `a == b` always returns `False`.
- **Idempotency.** Repeated `follow` operations on a pair that is already following, or repeated `unfollow` operations on a pair that is not following, have no effect. Following → unfollowing → following again leaves the pair in the *following* state.
- Every `check` is **guaranteed** to use a `snap_id` that was previously returned by a `snapshot` call.
## Return value
Return a list containing the result of every `snapshot` and every `check`, **in the order the operations appear**:
- For each `snapshot`, append its **snapshot id** (an integer).
- For each `check`, append a **boolean** (`True` if `a` was following `b` in that snapshot, otherwise `False`).
`follow` and `unfollow` operations produce no output.
## Function signature
```python
def solution(n, operations):
...
```
## Examples
- Operations `[('follow', 0, 1), ('snapshot',), ('unfollow', 0, 1), ('snapshot',), ('check', 0, 0, 1), ('check', 1, 0, 1)]` with `n = 3` return `[0, 1, True, False]`. Snapshot `0` is taken while `0` follows `1` (→ `True`); snapshot `1` is taken after the unfollow (→ `False`).
- Operations `[('follow', 0, 1), ('unfollow', 0, 1), ('follow', 0, 1), ('snapshot',), ('check', 0, 0, 1)]` with `n = 2` return `[0, True]`, since the final state before the snapshot has `0` following `1`.
- Operations `[('follow', 0, 0), ('snapshot',), ('check', 0, 0, 0)]` with `n = 2` return `[0, False]` — the self-follow is ignored and the self `check` is `False`.
## Constraints
- `0 <= n <= 10^5`
- `0 <= len(operations) <= 2 * 10^5`
- All user ids are in `[0, n - 1]`.
- Each `check` uses a valid snapshot id that has already been created.
Constraints
- 0 <= n <= 10^5
- 0 <= len(operations) <= 2 * 10^5
- All user ids are in [0, n - 1]
- Each `check` uses a valid snapshot id that has already been created
Examples
Input: (3, [('follow', 0, 1), ('snapshot',), ('unfollow', 0, 1), ('snapshot',), ('check', 0, 0, 1), ('check', 1, 0, 1)])
Expected Output: [0, 1, True, False]
Input: (2, [('follow', 0, 1), ('unfollow', 0, 1), ('follow', 0, 1), ('snapshot',), ('check', 0, 0, 1)])
Expected Output: [0, True]
Input: (1, [])
Expected Output: []
Input: (2, [('follow', 0, 0), ('snapshot',), ('check', 0, 0, 0)])
Expected Output: [0, False]
Input: (3, [('follow', 1, 2), ('snapshot',), ('snapshot',), ('check', 1, 1, 2)])
Expected Output: [0, 1, True]
Hints
- You do not need to copy the whole graph at every snapshot. Store only when a specific follow relation changes.
- For each pair `(a, b)`, keep a sorted history of `(snapshot_id, state)` and binary-search it during historical queries.
Part 3: Recommend Top-K Users by Two-Hop Follow Count
Build a **friend-recommendation engine** on a directed follow graph, processing a stream of operations and returning a recommendation list for each query.
There are `n` users labeled `0` to `n - 1`. The follow graph is **directed**: "user `a` follows user `b`" does not imply `b` follows `a`.
## Function
Implement `solution(n, operations)`:
- `n` — the number of users.
- `operations` — a list of tuples, each one of three kinds, applied **in order**.
Return a **list of lists**: one recommendation list for every `('recommend', ...)` operation, in the order those operations appear. `follow` and `unfollow` operations produce no output.
## Operations
- **`('follow', a, b)`** — user `a` starts following user `b`. Following someone you already follow has no effect.
- **`('unfollow', a, b)`** — user `a` stops following user `b`. Unfollowing someone you don't follow has no effect.
- **`('recommend', u, k)`** — produce up to `k` recommended users for user `u` (see below).
## Recommendation rule
For a `recommend(u, k)` query, score every candidate user `c` by counting **two-hop follow paths** `u → x → c`:
- A candidate `c` earns **1 point** for each user `x` such that:
- `u` currently follows `x`, **and**
- `x` currently follows `c`.
- Equivalently, `c`'s **score** is the number of users `u` already follows who in turn follow `c`.
Apply these filters to the candidates:
- **Exclude `u`** — never recommend a user to themselves.
- **Exclude users `u` already follows** — only recommend new connections.
- **Keep only positive scores** — a candidate with score `0` (no qualifying two-hop path) is not recommended.
## Ranking
Sort the surviving candidates and return at most `k` of them:
- **Higher score first.**
- **Ties broken by smaller user id first.**
Return the first `k` ids in this order. If fewer than `k` candidates qualify, return all of them.
## Edge cases
- **Self-follow is ignored.** A `('follow', a, a)` (and likewise `('unfollow', a, a)`) is a no-op and contributes no edge to the graph.
- **`k <= 0`** yields an empty recommendation list `[]` for that query.
- If no candidate has a positive score, the recommendation list is `[]`.
## Constraints
- `0 <= n <= 10^5`
- `0 <= len(operations) <= 2 * 10^5`
- All user ids are in `[0, n - 1]`
Constraints
- 0 <= n <= 10^5
- 0 <= len(operations) <= 2 * 10^5
- All user ids are in [0, n - 1]
- Self-follow should be ignored
- If fewer than k candidates have positive score, return all of them
Examples
Input: (5, [('follow', 0, 1), ('follow', 0, 2), ('follow', 1, 3), ('follow', 2, 3), ('follow', 2, 4), ('recommend', 0, 2)])
Expected Output: [[3, 4]]
Explanation: User 3 gets score 2 because both followed users 1 and 2 follow 3. User 4 gets score 1.
Input: (6, [('follow', 0, 1), ('follow', 0, 2), ('follow', 1, 4), ('follow', 2, 3), ('recommend', 0, 2)])
Expected Output: [[3, 4]]
Explanation: Users 3 and 4 both have score 1, so the smaller id 3 comes first.
Input: (5, [('follow', 0, 1), ('follow', 1, 2), ('follow', 1, 3), ('recommend', 0, 3), ('unfollow', 0, 1), ('recommend', 0, 3)])
Expected Output: [[2, 3], []]
Explanation: After unfollowing 1, user 0 has no two-hop recommendation source left.
Input: (3, [('recommend', 0, 2), ('follow', 0, 1), ('recommend', 0, 0)])
Expected Output: [[], []]
Explanation: Edge cases: no followees means no recommendations, and k = 0 should return an empty list.
Input: (3, [('follow', 1, 1), ('follow', 0, 1), ('recommend', 0, 2)])
Expected Output: [[]]
Explanation: Self-follow is ignored, so user 1 contributes no second-hop candidates.
Hints
- For a recommendation query on user `u`, iterate through everyone `u` follows, then through each of their follow sets, and count candidate frequencies.
- After counting, sort candidates by `(-score, user_id)` and take the first `k`.