Design a topic-based news subscription system
Company: Optiver
Role: Software Engineer
Category: System Design
Difficulty: hard
Interview Round: Take-home Project
Design and implement a topic-based news subscription service for a news aggregation platform. Provide a class NewsProvider supporting the following operations and constraints:
Operations
- AddSubscription(subscriberId, minInterest, maxNewsPerSecond, topics) -> bool
• Create or update a subscriber’s configuration.
• Return true on success; return false on invalid input or violated constraints.
- RemoveSubscription(subscriberId) -> bool
• Remove a subscriber; return false if the subscriberId does not exist.
- NewsReceived(newsId, timestamp, interest, topics) -> bool
• Ingest a news item with its timestamp, interest score, and topics.
• Return false if newsId already exists; otherwise true.
- Publish(nowTimestamp, maxAge) -> dict[int, list[int]]
• For each subscriber, return the list of newsIds to deliver at nowTimestamp, subject to the rules below.
Publish rules
- Eligibility: news interest ≥ subscriber.minInterest.
- Topic filter: news topics intersect subscriber.topics.
- Freshness: nowTimestamp - maxAge ≤ news.timestamp ≤ nowTimestamp.
- Rate limiting: per-subscriber sliding window allowing at most maxNewsPerSecond items in any 1-second rolling window.
- De-duplication: the same newsId must not be delivered to the same subscriber more than once.
- Ordering: higher interest first; if ties, older timestamp first; if still tied, smaller newsId first.
Deliverables
- Specify data structures for: subscriptions, news storage, delivered-tracking per subscriber, and per-subscriber sliding-window counters.
- Describe algorithms for filtering, rate limiting (sliding window), de-duplication, and multi-key sorting under high throughput.
- Analyze time and space complexity of each operation and Publish.
- Discuss concurrency, idempotency, and persistence considerations.
- Explain how you would scale this to millions of users and a large, continuous news stream (partitioning, sharding, backpressure, batching, and fault tolerance).
Quick Answer: This question evaluates system design skills for building a topic-based news subscription service, focusing on streaming ingestion, relevance filtering, sliding-window rate limiting, de-duplication, ordered delivery, and data structure choices for in-memory operation.