Design follow/follower classes
Company: Uber
Role: Software Engineer
Category: Software Engineering Fundamentals
Difficulty: medium
Interview Round: Technical Screen
## Object-Oriented Design: Follow / Follower Relationship
Design a small set of classes for a social product where users can **follow** and **unfollow** each other. The follow relationship is **directed** — `a` following `b` does not imply `b` follows `a` — and the system must answer relationship queries efficiently in *both* directions (who a user follows, and who follows a user).
Your design must support this API:
- `follow(a, b)` — `a` starts following `b`
- `unfollow(a, b)` — `a` stops following `b`
- `getFollowers(user)` → all users who follow `user`
- `getFollowing(user)` → all users whom `user` follows
- `isFollowing(a, b)` → whether `a` follows `b`
- `isMutual(a, b)` → whether `a` follows `b` **and** `b` follows `a`
Define the classes and their responsibilities, choose the data structures, and state the time/space complexity of each operation. Make `follow`/`unfollow` **idempotent**, decide and justify the behavior of **self-follow** (`follow(a, a)`), and explain how you keep the two query directions consistent with each other.
```hint Frame the problem
Strip away the API and ask what abstract structure this is. Users, and a directed "follows" relation between them — what classic data model is that? Naming it gives you a vocabulary (nodes, directed edges, in-edges vs out-edges) that the rest of the design falls out of.
```
```hint The bidirectional tension
Sketch the cheapest structure that answers `getFollowing` in constant time. Now ask: with *that* structure, what does `getFollowers` cost? Whichever direction you optimize for, the opposite query is the one that hurts — and the prompt says both must be fast. Sit with that tension before reaching for a fix; it's the crux the interviewer is probing.
```
```hint Redundancy has a price
Suppose you decide to optimize both directions at once. Whatever you add is, by definition, a second view of the very same set of edges — so the two views can disagree. Where in your code does a single `follow` or `unfollow` keep them from drifting apart, and what would you write down to make "they agree" a checkable property? Separately: what should each mutator *return* so a caller can tell a real change from a repeat?
```
### Constraints & Assumptions
- All users are already created and identified by a unique, comparable `userId` (e.g. a 64-bit integer).
- The relationship graph fits in memory for the core design; you should additionally sketch how it maps to durable storage.
- Realistic social-graph skew: most users follow $10$–$10^3$ accounts, but a few "celebrity" accounts have $10^6$+ followers (a hot key you should acknowledge).
- Workload is **read-heavy**: `isFollowing` and feed assembly vastly outnumber `follow`/`unfollow` writes.
- Thread-safety is a design dimension to discuss; you may present the core logic single-threaded first.
### Clarifying Questions to Ask
- Is the relationship directed (Twitter/X-style) or symmetric (Facebook-friends-style)? This changes how many query directions you must serve.
- Should `follow`/`unfollow` be strictly idempotent no-ops, or signal whether state actually changed?
- What happens on operations referencing an unknown `userId` — throw, return `false`, or lazily create?
- Do `getFollowers`/`getFollowing` need pagination and a stable ordering (e.g. most-recent-first), or is set membership enough?
- Is this single-process or distributed/sharded, and must it survive restarts (persistence)?
- What read/write ratio and scale should I target, and is eventual consistency between the query directions acceptable?
### What a Strong Answer Covers
- Recognizes the underlying data model and proposes a clean class decomposition with a justified boundary between the user identity and whatever owns the relationships.
- Chooses data structures that make **both** query directions efficient, and can articulate the cost paid (in storage and write work) for that choice and why it's the right trade under a read-heavy workload.
- Correct time/space complexity for every operation, with a justified collection type that gives idempotent membership.
- A stated consistency property between the query directions and an argument for how each mutation preserves it atomically.
- Deliberate, reasoned handling of self-follow, repeated calls, unknown users, and empty-set cleanup.
- Encapsulation: not leaking internal mutable collections (immutable views or defensive copies).
- A concurrency strategy (locking granularity, atomicity of the multi-step update) suited to the read-heavy workload.
- A credible persistence mapping (table, uniqueness, indexes serving each direction) and awareness of the celebrity hot key.
### Follow-up Questions
- How would you persist this in a relational database? Give the schema, the uniqueness constraint, and the indexes needed to serve both `getFollowers` and `getFollowing` efficiently.
- A celebrity with 50M followers makes `getFollowers` huge — how do you paginate, and how does that interact with your in-memory representation?
- Under concurrent `follow`/`unfollow`, how do you keep the two query directions consistent without deadlocking? Compare a coarse lock, per-user locks with lock ordering, and concurrent maps.
- Shard the follow graph across machines once it no longer fits on one host: how do you shard the edges, and what does it do to the `isMutual` check and to read-your-own-write consistency right after a `follow`?
Quick Answer: This question evaluates object-oriented design and data-structure modeling skills for directed graphs, testing competency in representing follow relationships, supporting efficient bidirectional queries, ensuring consistency and idempotence of mutators, and reasoning about scalability and hot-key follower skew.