PracHub
QuestionsCoachesLearningGuidesInterview Prep
|Home/Software Engineering Fundamentals/OpenAI

Design a social network with snapshots

Last updated: Jun 21, 2026

Quick Overview

This question evaluates understanding of data structures, graph representations, immutability and snapshot/versioning strategies for in-memory systems, along with algorithmic reasoning for query and recommendation operations.

  • medium
  • OpenAI
  • Software Engineering Fundamentals
  • Software Engineer

Design a social network with snapshots

Company: OpenAI

Role: Software Engineer

Category: Software Engineering Fundamentals

Difficulty: medium

Interview Round: Technical Screen

You are asked to design and implement an in-memory `SocialNetwork` class that supports users following each other and creating **snapshots** of the follow graph. A snapshot is an **immutable view** of all follow relationships at the moment it was created; later mutations to the live network must not change any previously taken snapshot. Implement an API similar to: ```text sn = SocialNetwork() sn.add_user("A") sn.add_user("B") sn.follow("A", "B") snap = sn.create_snapshot() assert snap.is_follow("A", "B") sn.follow("A", "C") # mutates the live graph only assert not snap.is_follow("A", "C") # the old snapshot is unaffected ``` The follow relation is **directed**: `follow(u, v)` means *u follows v* and does not imply *v follows u*. Your task is to design the underlying data structures, choose a snapshot strategy, and implement the three operations below. ### Constraints & Assumptions - In-memory, single-process; no persistence required. - User identifiers are hashable (treat them as strings/ints). - `is_follow`, listing followers/followees, and recommendations all operate **on a given snapshot**, not the live graph. - Snapshots may be taken frequently relative to the number of edges; reads on a snapshot should be cheap. - State the asymptotic time/space cost of each operation and of `create_snapshot` itself. ### Clarifying Questions to Ask - If `u` or `v` does not exist when calling `follow` / `is_follow` / a listing, should the method throw, auto-create the user, or return a benign default (false / empty)? - Is self-follow (`follow(u, u)`) permitted, or rejected? - Are repeated `follow(u, v)` calls idempotent (no duplicate edge, no inflated counts)? - Roughly how many snapshots are expected versus how many users/edges — i.e. is the bottleneck snapshot creation, snapshot memory, or query latency? - Is there an `unfollow` operation, and must snapshots taken before/after it reflect the right state? ### Part 1 — Snapshot follow query Support `snap.is_follow(u, v)` returning whether user `u` follows user `v` **as of that snapshot**. ```hint Data structure The query is membership in *u*'s set of followees. A `Map<User, Set<User>>` keyed by the follower gives $O(1)$ average lookup; decide what to return when `u` is absent from the snapshot. ``` #### What This Part Should Cover - The core adjacency representation chosen and why it makes `is_follow` average $O(1)$. - Clear, consistent behavior when `u` (or `v`) is not present in the snapshot. ### Part 2 — List followers and followees For a given snapshot and user `x`, return: - the **followers** of `x` (everyone who follows `x`), and - the **followees** of `x` (everyone `x` follows). ```hint Avoid scanning the whole graph Returning followers from an out-edge map alone forces an $O(V+E)$ scan. Maintaining a **second** adjacency map of *incoming* edges (`in[v] = followers of v`) makes both listings $O(\text{degree})$, at the cost of updating two maps on every `follow`. ``` #### What This Part Should Cover - Recognizing that efficient follower listing requires either a reverse index or accepting a full-graph scan, and the time/space trade-off between them. - Returning copies or read-only views so callers cannot mutate the snapshot. ### Part 3 — Recommend users to follow Given a snapshot, a user `u`, and an integer `k`, recommend up to `k` users for `u` to follow using a **friends-of-friends** ranking: 1. Take `F`, the set of users `u` already follows. 2. For each `f ∈ F`, look at whom `f` follows. 3. Count, across all of `u`'s followees, how many times each candidate `c` appears. 4. Return the top-`k` candidates by that count, excluding `u` itself and anyone already in `F`. ```hint Where to start This is a 2-hop traversal: iterate `u`'s followees, then *their* followees, accumulating a frequency count in a hash map. Decide the exclusion set (`u`, plus everyone already in `F`) before counting. ``` ```hint Top-k selection You don't need to fully sort all `m` candidates. A size-`k` min-heap keyed by count gives the top-`k` in $O(m \log k)$ vs. $O(m \log m)$ for a full sort. Define a deterministic tie-break (e.g. by user id) so results are reproducible. ``` #### What This Part Should Cover - Correct 2-hop traversal with the right exclusions (self, already-followed) applied before/while counting. - A top-`k` selection strategy with stated complexity and an explicit, deterministic tie-break rule. - Counting cost expressed in terms of followee count and average out-degree, e.g. $O\!\left(\sum_{f \in F} \deg^{+}(f)\right)$. ### What a Strong Answer Covers These dimensions span all three parts. - A coherent **snapshot strategy** (full copy vs. copy-on-write / persistent structures vs. versioned edges) with the trade-offs that decide which fits the stated constraints, and a guarantee that snapshots are truly isolated from later mutations. - Consistent handling of the clarified edge cases (missing users, self-follow, idempotent follows) across every method. - Stated asymptotic time and space for `follow`, `create_snapshot`, and each snapshot query. ### Follow-up Questions - If snapshots are taken very frequently on a large graph, full-copy snapshots become $O(V+E)$ each. How would persistent/copy-on-write data structures or versioned (append-only) edges reduce snapshot cost, and what do they cost you on the read path? - How would you add `unfollow(u, v)` while keeping every previously taken snapshot correct? - How would you make the live network safe under concurrent `follow`/`create_snapshot` from multiple threads? - The recommendation only uses follow-count. How would you incorporate signals like mutual follows, recency, or excluding blocked users without changing the asymptotic cost much?

Quick Answer: This question evaluates understanding of data structures, graph representations, immutability and snapshot/versioning strategies for in-memory systems, along with algorithmic reasoning for query and recommendation operations.

Related Interview Questions

  • Implement a Simple Memory Allocator - OpenAI (medium)
  • Implement an Extensible Chatbot App - OpenAI (medium)
  • Build a Reliable Streaming Chat UI - OpenAI (hard)
  • Design an Extensible Simulation Engine - OpenAI (hard)
  • Debug a Concurrent Job Scheduler - OpenAI (medium)
|Home/Software Engineering Fundamentals/OpenAI

Design a social network with snapshots

OpenAI logo
OpenAI
Nov 16, 2025, 12:00 AM
mediumSoftware EngineerTechnical ScreenSoftware Engineering Fundamentals
34
0
Loading...

You are asked to design and implement an in-memory SocialNetwork class that supports users following each other and creating snapshots of the follow graph. A snapshot is an immutable view of all follow relationships at the moment it was created; later mutations to the live network must not change any previously taken snapshot.

Implement an API similar to:

sn = SocialNetwork()
sn.add_user("A")
sn.add_user("B")
sn.follow("A", "B")

snap = sn.create_snapshot()
assert snap.is_follow("A", "B")

sn.follow("A", "C")          # mutates the live graph only
assert not snap.is_follow("A", "C")   # the old snapshot is unaffected

The follow relation is directed: follow(u, v) means u follows v and does not imply v follows u. Your task is to design the underlying data structures, choose a snapshot strategy, and implement the three operations below.

Constraints & Assumptions

  • In-memory, single-process; no persistence required.
  • User identifiers are hashable (treat them as strings/ints).
  • is_follow , listing followers/followees, and recommendations all operate on a given snapshot , not the live graph.
  • Snapshots may be taken frequently relative to the number of edges; reads on a snapshot should be cheap.
  • State the asymptotic time/space cost of each operation and of create_snapshot itself.

Clarifying Questions to Ask

  • If u or v does not exist when calling follow / is_follow / a listing, should the method throw, auto-create the user, or return a benign default (false / empty)?
  • Is self-follow ( follow(u, u) ) permitted, or rejected?
  • Are repeated follow(u, v) calls idempotent (no duplicate edge, no inflated counts)?
  • Roughly how many snapshots are expected versus how many users/edges — i.e. is the bottleneck snapshot creation, snapshot memory, or query latency?
  • Is there an unfollow operation, and must snapshots taken before/after it reflect the right state?

Part 1 — Snapshot follow query

Support snap.is_follow(u, v) returning whether user u follows user v as of that snapshot.

What This Part Should Cover

  • The core adjacency representation chosen and why it makes is_follow average O(1)O(1)O(1) .
  • Clear, consistent behavior when u (or v ) is not present in the snapshot.

Part 2 — List followers and followees

For a given snapshot and user x, return:

  • the followers of x (everyone who follows x ), and
  • the followees of x (everyone x follows).

What This Part Should Cover

  • Recognizing that efficient follower listing requires either a reverse index or accepting a full-graph scan, and the time/space trade-off between them.
  • Returning copies or read-only views so callers cannot mutate the snapshot.

Part 3 — Recommend users to follow

Given a snapshot, a user u, and an integer k, recommend up to k users for u to follow using a friends-of-friends ranking:

  1. Take F , the set of users u already follows.
  2. For each f ∈ F , look at whom f follows.
  3. Count, across all of u 's followees, how many times each candidate c appears.
  4. Return the top- k candidates by that count, excluding u itself and anyone already in F .

What This Part Should Cover

  • Correct 2-hop traversal with the right exclusions (self, already-followed) applied before/while counting.
  • A top- k selection strategy with stated complexity and an explicit, deterministic tie-break rule.
  • Counting cost expressed in terms of followee count and average out-degree, e.g. O ⁣(∑f∈Fdeg⁡+(f))O\!\left(\sum_{f \in F} \deg^{+}(f)\right)O(∑f∈F​deg+(f)) .

What a Strong Answer Covers

These dimensions span all three parts.

  • A coherent snapshot strategy (full copy vs. copy-on-write / persistent structures vs. versioned edges) with the trade-offs that decide which fits the stated constraints, and a guarantee that snapshots are truly isolated from later mutations.
  • Consistent handling of the clarified edge cases (missing users, self-follow, idempotent follows) across every method.
  • Stated asymptotic time and space for follow , create_snapshot , and each snapshot query.

Follow-up Questions

  • If snapshots are taken very frequently on a large graph, full-copy snapshots become O(V+E)O(V+E)O(V+E) each. How would persistent/copy-on-write data structures or versioned (append-only) edges reduce snapshot cost, and what do they cost you on the read path?
  • How would you add unfollow(u, v) while keeping every previously taken snapshot correct?
  • How would you make the live network safe under concurrent follow / create_snapshot from multiple threads?
  • The recommendation only uses follow-count. How would you incorporate signals like mutual follows, recency, or excluding blocked users without changing the asymptotic cost much?
Loading comments...

Browse More Questions

More Software Engineering Fundamentals•More OpenAI•More Software Engineer•OpenAI Software Engineer•OpenAI Software Engineering Fundamentals•Software Engineer Software Engineering Fundamentals

Write your answer

Your first approved answer each day earns 20 XP.

Sign in to write your answer.
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.