Feed Systems and Fanout Architecture

What's being tested

Interviewers are probing whether you can design a high-scale feed system where users publish content and other users consume a personalized timeline with low latency. For a Software Engineer, the core skill is choosing between fanout-on-write, fanout-on-read, and hybrid fanout under realistic constraints: follower graph skew, celebrity accounts, freshness, ordering, storage cost, and failure handling. Google cares because the same patterns appear in YouTube, Google Photos, Google Play, notifications, activity streams, and collaborative products: many producers, many consumers, strict latency expectations, and uneven traffic distribution.

Core knowledge

Fanout-on-write precomputes timelines when a post is created: for each follower, insert the post ID into that follower’s feed store. Reads are fast because each one is a single lookup of a bounded, pre-sorted timeline (roughly O(1), or O(log n) to seek to a cursor), but writes can explode for users with millions of followers.
Fanout-on-read computes the feed at request time: fetch the user’s followees, retrieve recent posts for each, merge, rank, and paginate. Writes are cheap, but reads become expensive for users following thousands of accounts unless aggressively cached or bounded.
Hybrid fanout is the usual production answer. Normal users use fanout-on-write; high-follower “celebrity” users are handled by fanout-on-read or lazy merge. A common threshold might be accounts with >100k or >1M followers, tuned by write amplification and read latency.
Write amplification is the key sizing quantity: if user $u$ has $F_u$ followers and creates $P_u$ posts/day, fanout writes per day are approximately $\sum_u F_u P_u$. One celebrity post to 50M followers generates 50M timeline writes, which is as many as 250,000 posts from authors with, for example, 200 followers each.
Timeline storage usually stores references, not full content: (user_id, post_id, author_id, created_at, score/version). The post body, media URL, permissions, and metadata live in a separate content store such as Spanner, Bigtable, DynamoDB, or object storage.
Follower graph storage must support “followers of author” for write fanout and “followees of reader” for read fanout. This can be served from a graph service, sharded key-value store, or wide-column store; the important interview point is access pattern, not a specific database brand.
Message queues such as Pub/Sub, Kafka, or SQS decouple post creation from fanout workers. The post service should not synchronously write millions of feed entries; it should enqueue fanout jobs, return quickly, and let workers process with retries and backpressure.
Idempotency is mandatory because fanout workers retry. Use a deterministic key like (recipient_user_id, post_id) and make timeline insertion idempotent via upsert, conditional write, or deduplication. Otherwise duplicate feed items appear after worker crashes or queue redelivery.
Ordering and pagination at scale should be cursor-based rather than offset-based. Use (rank_score, created_at, post_id) or (created_at, post_id) as a stable cursor. Offset pagination gets slower as the offset grows and can skip or repeat items as new posts arrive.
Caching usually has multiple layers: hot user timelines in Redis/Memcached, post objects in cache, and precomputed first pages. Track both p50 and p99 latency: opening the feed is latency-sensitive, so target something like <200ms for cached home feed reads.
Consistency tradeoffs should be explicit. Feeds are usually eventually consistent: a follower may see a post seconds later, and deletion/privacy updates need separate invalidation paths. For deletes or blocks, prefer correctness over freshness by filtering at read time even if stale entries remain in timeline storage.
Skew and hotspots are central edge cases. Celebrity authors, viral posts, users with huge follow graphs, and cache stampedes can overload a shard. Mitigations include sharding by user_id, rate-limited fanout, batching, celebrity bypass, request coalescing, and prewarming hot feed pages.
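
The write amplification sum above can be checked with a quick back-of-the-envelope script. This is a minimal sketch with made-up numbers: the Author type, the three sample authors, and the 1M-follower celebrity threshold are illustrative assumptions, not figures from a real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Author:
    name: str
    followers: int      # F_u
    posts_per_day: int  # P_u


def fanout_writes_per_day(authors, celebrity_threshold=None):
    """Sum F_u * P_u over the authors that are pushed (fanout-on-write).

    Authors at or above celebrity_threshold are skipped, on the assumption
    that a hybrid design pulls their posts at read time instead.
    """
    total = 0
    for author in authors:
        if celebrity_threshold is not None and author.followers >= celebrity_threshold:
            continue
        total += author.followers * author.posts_per_day
    return total


# Hypothetical population: two normal authors and one celebrity.
authors = [
    Author("normal_1", followers=200, posts_per_day=2),
    Author("normal_2", followers=1_000, posts_per_day=1),
    Author("celebrity", followers=50_000_000, posts_per_day=3),
]

pure_push = fanout_writes_per_day(authors)
hybrid = fanout_writes_per_day(authors, celebrity_threshold=1_000_000)
print(pure_push, hybrid)
```

With these sample numbers the single celebrity accounts for almost all pushed writes, which is the argument for moving that account to the read path.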

Worked example

For Design Twitter News Feed, a strong candidate starts by clarifying scale and semantics: number of users, posts per day, average and max follower count, freshness requirement, whether the feed is strictly reverse-chronological, and whether likes/replies are in scope. They would declare assumptions such as “home timeline should load in under 200ms for cached reads, posts can appear with a few seconds of delay, and I’ll store post IDs in timelines rather than full post bodies.”

The answer can be organized around four pillars: data model, write path, read path, and failure/scale handling. In the write path, PostService persists the post, publishes a fanout event to Pub/Sub or Kafka, and workers fetch the author’s followers and insert (recipient_id, post_id, created_at) into timeline shards. In the read path, FeedService reads the viewer’s precomputed timeline from cache or a timeline store, hydrates post objects from the content service, filters deleted/blocked/private content, and returns cursor-paginated results.
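
The write and read paths just described can be sketched in a few lines. This is an in-memory illustration only: TimelineStore, fan_out, and the sample IDs are hypothetical names, a dict stands in for the sharded timeline store, and the queue is replaced by direct function calls. It shows two properties from the text: inserts keyed on (recipient_id, post_id) stay idempotent under redelivery, and reads use a (created_at, post_id) cursor instead of an offset.

```python
from collections import defaultdict


class TimelineStore:
    """In-memory stand-in for a sharded timeline store (hypothetical)."""

    def __init__(self):
        # recipient_id -> {post_id: created_at}; the dict key makes inserts idempotent.
        self._entries = defaultdict(dict)

    def insert(self, recipient_id, post_id, created_at):
        """Upsert keyed on (recipient_id, post_id): a retry overwrites, never duplicates."""
        self._entries[recipient_id][post_id] = created_at

    def read_page(self, recipient_id, limit, cursor=None):
        """Return (items, next_cursor), newest first.

        The cursor is the (created_at, post_id) of the last item returned;
        the next page holds only entries strictly older than it.
        """
        items = sorted(
            ((created_at, post_id)
             for post_id, created_at in self._entries[recipient_id].items()),
            reverse=True,
        )
        if cursor is not None:
            items = [item for item in items if item < cursor]
        page = items[:limit]
        next_cursor = page[-1] if len(items) > limit else None
        return page, next_cursor


def fan_out(store, followers_of, author_id, post_id, created_at):
    """Worker step: write one timeline reference per follower of the author."""
    for recipient_id in followers_of.get(author_id, ()):
        store.insert(recipient_id, post_id, created_at)


followers_of = {"alice": ["bob", "carol"]}
store = TimelineStore()
for post_id, created_at in [(101, 1000), (102, 1005), (103, 1010)]:
    fan_out(store, followers_of, "alice", post_id, created_at)
fan_out(store, followers_of, "alice", 103, 1010)  # simulated queue redelivery

first_page, cursor = store.read_page("bob", limit=2)
second_page, end_cursor = store.read_page("bob", limit=2, cursor=cursor)
print(first_page, second_page)
```

A real store would keep entries sorted on disk and seek to the cursor rather than sorting per request; the sort here only keeps the sketch short.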

The key tradeoff to flag is fanout strategy: pure fanout-on-write gives excellent read latency but fails for celebrity users, so use hybrid fanout. Normal authors are pushed into follower timelines; celebrity posts are pulled at read time by merging recent posts from followed celebrity accounts into the precomputed feed. A good close is: “If I had more time, I’d go deeper on cache invalidation for deletes/blocks, ranking hooks, and operational metrics like fanout lag, queue depth, duplicate insert rate, and feed read p99.”
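
The read-time merge in a hybrid design might look like the sketch below. It assumes every source is already sorted newest first as (created_at, post_id) tuples; merge_home_feed and the sample data are hypothetical, and ranking is reduced to reverse-chronological order for simplicity.

```python
import heapq
from itertools import islice


def merge_home_feed(precomputed, celebrity_posts_by_author, followed_celebrities, limit):
    """Merge pushed timeline entries with celebrity posts pulled at read time.

    Every input list holds (created_at, post_id) tuples sorted newest first,
    so a k-way merge yields the combined feed without re-sorting everything.
    """
    pulled = [celebrity_posts_by_author.get(author, []) for author in followed_celebrities]
    merged = heapq.merge(precomputed, *pulled, reverse=True)
    return list(islice(merged, limit))


# Entries pushed by fanout-on-write for normal authors (hypothetical data).
precomputed = [(1030, 7), (1010, 5), (1000, 4)]
# Recent posts per celebrity, fetched at read time (hypothetical data).
celebrity_posts_by_author = {
    "star": [(1020, 9), (990, 8)],
    "idol": [(1040, 12)],
}

feed = merge_home_feed(precomputed, celebrity_posts_by_author, ["star", "idol"], limit=4)
print(feed)
```

Because the merge is lazy, only the first `limit` items are materialized, which keeps the cost bounded even when a viewer follows several celebrity accounts.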

A second angle

For Design Instagram Home Feed, the same fanout architecture applies, but media hydration and object size become more prominent. The timeline should still store lightweight post references, while images/videos are served through a media service and CDN; feed read latency depends heavily on metadata fetches and prefetching. The write/read fanout split may also differ because posting frequency is lower than in short-text systems, but each feed item is heavier and more expensive to hydrate. A strong answer would preserve the same hybrid fanout core, then discuss media metadata caching, pagination stability, and filtering unavailable or deleted media at read time.

Common pitfalls

Pitfall: Choosing pure fanout-on-write without discussing celebrity users.

The tempting answer is “when someone posts, write to all followers’ feeds,” which sounds simple and gives fast reads. What lands better is: “That works for normal users, but a user with 50M followers creates massive write amplification, so I’ll use hybrid fanout and pull celebrity posts at read time.”

Pitfall: Treating the feed as one database query.

A weak design says “join users, follows, and posts ordered by timestamp,” which ignores scale, hotspots, and latency. A stronger answer separates the follower graph, post store, timeline store, cache, and asynchronous fanout workers, then explains how data flows between them.

Pitfall: Ignoring correctness on deletes, blocks, and privacy changes.

Many candidates only optimize post creation and feed reads. Interviewers often probe edge cases: if Alice blocks Bob, Bob must not keep seeing Alice’s stale posts just because they were already fanned out, so the read path should enforce permissions even when timeline entries are stale.
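
Read-time enforcement can be sketched as a filter applied after the timeline lookup. The function name, the data shapes, and the choice to hide posts when either side has blocked the other are assumptions made for illustration; a real system would consult its own permission service.

```python
def filter_visible(viewer_id, entries, posts, blocked_pairs):
    """Drop stale timeline references the viewer must not see.

    entries:       (post_id, author_id) references from the timeline store; may be stale.
    posts:         post_id -> {"deleted": bool}, a stand-in for the content store.
    blocked_pairs: set of (blocker_id, blocked_id) tuples.
    """
    visible = []
    for post_id, author_id in entries:
        post = posts.get(post_id)
        if post is None or post["deleted"]:
            continue  # post was deleted after it was fanned out
        if (author_id, viewer_id) in blocked_pairs or (viewer_id, author_id) in blocked_pairs:
            continue  # assumption: a block hides posts in both directions
        visible.append(post_id)
    return visible


# Bob's timeline still holds Alice's post 1 from before she blocked him.
entries = [(1, "alice"), (2, "carol"), (3, "carol")]
posts = {1: {"deleted": False}, 2: {"deleted": True}, 3: {"deleted": False}}
blocked_pairs = {("alice", "bob")}

visible = filter_visible("bob", entries, posts, blocked_pairs)
print(visible)
```

The stale entries can be cleaned out of timeline storage asynchronously; the filter is what keeps the response correct in the meantime.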

Connections

Interviewers may pivot from feed fanout into distributed queues, cache invalidation, database sharding, rate limiting, or ranking service integration. The same design patterns also show up in notification systems, activity logs, pub/sub delivery, collaborative document updates, and recommendation candidate generation.