How do I approach Software Engineering Fundamentals interview questions?

Software Engineering Fundamentals questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master software engineering fundamentals interviews.

What difficulty level is this interview question?

This is a medium difficulty Software Engineering Fundamentals question, commonly asked during Technical Screen rounds at OpenAI.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at OpenAI during technical interviews.

Design a social network with snapshots | OpenAI Interview Question

Design a social network with snapshots

Company: OpenAI

Role: Software Engineer

Category: Software Engineering Fundamentals

Difficulty: medium

Interview Round: Technical Screen

You are asked to design and implement an in-memory `SocialNetwork` class that supports users following each other and creating **snapshots** of the follow graph. A snapshot is an **immutable view** of all follow relationships at the moment it was created; later mutations to the live network must not change any previously taken snapshot. Implement an API similar to: ```text sn = SocialNetwork() sn.add_user("A") sn.add_user("B") sn.follow("A", "B") snap = sn.create_snapshot() assert snap.is_follow("A", "B") sn.follow("A", "C") # mutates the live graph only assert not snap.is_follow("A", "C") # the old snapshot is unaffected ``` The follow relation is **directed**: `follow(u, v)` means *u follows v* and does not imply *v follows u*. Your task is to design the underlying data structures, choose a snapshot strategy, and implement the three operations below. ### Constraints & Assumptions - In-memory, single-process; no persistence required. - User identifiers are hashable (treat them as strings/ints). - `is_follow`, listing followers/followees, and recommendations all operate **on a given snapshot**, not the live graph. - Snapshots may be taken frequently relative to the number of edges; reads on a snapshot should be cheap. - State the asymptotic time/space cost of each operation and of `create_snapshot` itself. ### Clarifying Questions to Ask - If `u` or `v` does not exist when calling `follow` / `is_follow` / a listing, should the method throw, auto-create the user, or return a benign default (false / empty)? - Is self-follow (`follow(u, u)`) permitted, or rejected? - Are repeated `follow(u, v)` calls idempotent (no duplicate edge, no inflated counts)? - Roughly how many snapshots are expected versus how many users/edges — i.e. is the bottleneck snapshot creation, snapshot memory, or query latency? - Is there an `unfollow` operation, and must snapshots taken before/after it reflect the right state? ### Part 1 — Snapshot follow query Support `snap.is_follow(u, v)` returning whether user `u` follows user `v` **as of that snapshot**. ```hint Data structure The query is membership in *u*'s set of followees. A `Map<User, Set<User>>` keyed by the follower gives $O(1)$ average lookup; decide what to return when `u` is absent from the snapshot. ``` #### What This Part Should Cover - The core adjacency representation chosen and why it makes `is_follow` average $O(1)$. - Clear, consistent behavior when `u` (or `v`) is not present in the snapshot. ### Part 2 — List followers and followees For a given snapshot and user `x`, return: - the **followers** of `x` (everyone who follows `x`), and - the **followees** of `x` (everyone `x` follows). ```hint Avoid scanning the whole graph Returning followers from an out-edge map alone forces an $O(V+E)$ scan. Maintaining a **second** adjacency map of *incoming* edges (`in[v] = followers of v`) makes both listings $O(\text{degree})$, at the cost of updating two maps on every `follow`. ``` #### What This Part Should Cover - Recognizing that efficient follower listing requires either a reverse index or accepting a full-graph scan, and the time/space trade-off between them. - Returning copies or read-only views so callers cannot mutate the snapshot. ### Part 3 — Recommend users to follow Given a snapshot, a user `u`, and an integer `k`, recommend up to `k` users for `u` to follow using a **friends-of-friends** ranking: 1. Take `F`, the set of users `u` already follows. 2. For each `f ∈ F`, look at whom `f` follows. 3. Count, across all of `u`'s followees, how many times each candidate `c` appears. 4. Return the top-`k` candidates by that count, excluding `u` itself and anyone already in `F`. ```hint Where to start This is a 2-hop traversal: iterate `u`'s followees, then *their* followees, accumulating a frequency count in a hash map. Decide the exclusion set (`u`, plus everyone already in `F`) before counting. ``` ```hint Top-k selection You don't need to fully sort all `m` candidates. A size-`k` min-heap keyed by count gives the top-`k` in $O(m \log k)$ vs. $O(m \log m)$ for a full sort. Define a deterministic tie-break (e.g. by user id) so results are reproducible. ``` #### What This Part Should Cover - Correct 2-hop traversal with the right exclusions (self, already-followed) applied before/while counting. - A top-`k` selection strategy with stated complexity and an explicit, deterministic tie-break rule. - Counting cost expressed in terms of followee count and average out-degree, e.g. $O\!\left(\sum_{f \in F} \deg^{+}(f)\right)$. ### What a Strong Answer Covers These dimensions span all three parts. - A coherent **snapshot strategy** (full copy vs. copy-on-write / persistent structures vs. versioned edges) with the trade-offs that decide which fits the stated constraints, and a guarantee that snapshots are truly isolated from later mutations. - Consistent handling of the clarified edge cases (missing users, self-follow, idempotent follows) across every method. - Stated asymptotic time and space for `follow`, `create_snapshot`, and each snapshot query. ### Follow-up Questions - If snapshots are taken very frequently on a large graph, full-copy snapshots become $O(V+E)$ each. How would persistent/copy-on-write data structures or versioned (append-only) edges reduce snapshot cost, and what do they cost you on the read path? - How would you add `unfollow(u, v)` while keeping every previously taken snapshot correct? - How would you make the live network safe under concurrent `follow`/`create_snapshot` from multiple threads? - The recommendation only uses follow-count. How would you incorporate signals like mutual follows, recency, or excluding blocked users without changing the asymptotic cost much?

Quick Answer: This question evaluates understanding of data structures, graph representations, immutability and snapshot/versioning strategies for in-memory systems, along with algorithmic reasoning for query and recommendation operations.

You are asked to design and implement an in-memory SocialNetwork class that supports users following each other and creating snapshots of the follow graph. A snapshot is an immutable view of all follow relationships at the moment it was created; later mutations to the live network must not change any previously taken snapshot.

Implement an API similar to:

sn = SocialNetwork()
sn.add_user("A")
sn.add_user("B")
sn.follow("A", "B")

snap = sn.create_snapshot()
assert snap.is_follow("A", "B")

sn.follow("A", "C")          # mutates the live graph only
assert not snap.is_follow("A", "C")   # the old snapshot is unaffected

The follow relation is directed: follow(u, v) means u follows v and does not imply v follows u. Your task is to design the underlying data structures, choose a snapshot strategy, and implement the three operations below.

Constraints & Assumptions

In-memory, single-process; no persistence required.
User identifiers are hashable (treat them as strings/ints).
is_follow , listing followers/followees, and recommendations all operate on a given snapshot , not the live graph.
Snapshots may be taken frequently relative to the number of edges; reads on a snapshot should be cheap.
State the asymptotic time/space cost of each operation and of create_snapshot itself.

Clarifying Questions to Ask

If u or v does not exist when calling follow / is_follow / a listing, should the method throw, auto-create the user, or return a benign default (false / empty)?
Is self-follow ( follow(u, u) ) permitted, or rejected?
Are repeated follow(u, v) calls idempotent (no duplicate edge, no inflated counts)?
Roughly how many snapshots are expected versus how many users/edges — i.e. is the bottleneck snapshot creation, snapshot memory, or query latency?
Is there an unfollow operation, and must snapshots taken before/after it reflect the right state?

Part 1 — Snapshot follow query

Support snap.is_follow(u, v) returning whether user u follows user v as of that snapshot.

What This Part Should Cover

The core adjacency representation chosen and why it makes is_follow average $O(1)$ .
Clear, consistent behavior when u (or v ) is not present in the snapshot.

Part 2 — List followers and followees

For a given snapshot and user x, return:

the followers of x (everyone who follows x ), and
the followees of x (everyone x follows).

What This Part Should Cover

Recognizing that efficient follower listing requires either a reverse index or accepting a full-graph scan, and the time/space trade-off between them.
Returning copies or read-only views so callers cannot mutate the snapshot.

Given a snapshot, a user u, and an integer k, recommend up to k users for u to follow using a friends-of-friends ranking:

Take F , the set of users u already follows.
For each f ∈ F , look at whom f follows.
Count, across all of u 's followees, how many times each candidate c appears.
Return the top- k candidates by that count, excluding u itself and anyone already in F .

What This Part Should Cover

Correct 2-hop traversal with the right exclusions (self, already-followed) applied before/while counting.
A top- k selection strategy with stated complexity and an explicit, deterministic tie-break rule.
Counting cost expressed in terms of followee count and average out-degree, e.g. $O\!\left(\sum_{f \in F} \deg^{+}(f)\right)$ .

What a Strong Answer Covers

These dimensions span all three parts.

A coherent snapshot strategy (full copy vs. copy-on-write / persistent structures vs. versioned edges) with the trade-offs that decide which fits the stated constraints, and a guarantee that snapshots are truly isolated from later mutations.
Consistent handling of the clarified edge cases (missing users, self-follow, idempotent follows) across every method.
Stated asymptotic time and space for follow , create_snapshot , and each snapshot query.

Follow-up Questions

If snapshots are taken very frequently on a large graph, full-copy snapshots become $O(V+E)$ each. How would persistent/copy-on-write data structures or versioned (append-only) edges reduce snapshot cost, and what do they cost you on the read path?
How would you add unfollow(u, v) while keeping every previously taken snapshot correct?
How would you make the live network safe under concurrent follow / create_snapshot from multiple threads?
The recommendation only uses follow-count. How would you incorporate signals like mutual follows, recency, or excluding blocked users without changing the asymptotic cost much?

Design a social network with snapshots

Company: OpenAI

Role: Software Engineer

Category: Software Engineering Fundamentals

Difficulty: medium

Interview Round: Technical Screen

sn = SocialNetwork() sn.add_user("A") sn.add_user("B") sn.follow("A", "B") snap = sn.create_snapshot() assert snap.is_follow("A", "B") sn.follow("A", "C") # mutates the live graph only assert not snap.is_follow("A", "C") # the old snapshot is unaffected

Design a social network with snapshots

Quick Overview

Design a social network with snapshots

Constraints & Assumptions

Clarifying Questions to Ask

Part 1 — Snapshot follow query

What This Part Should Cover

Part 2 — List followers and followees

What This Part Should Cover

What This Part Should Cover

What a Strong Answer Covers

Follow-up Questions

Write your answer

Design a social network with snapshots

Quick Overview

Design a social network with snapshots

Constraints & Assumptions

Clarifying Questions to Ask

Part 1 — Snapshot follow query

What This Part Should Cover

Part 2 — List followers and followees

What This Part Should Cover

What This Part Should Cover

What a Strong Answer Covers

Follow-up Questions

Write your answer