Design a news feed aggregator
Company: Rippling
Role: Software Engineer
Category: System Design
Difficulty: Hard
Interview Round: Take-home Project
Design a large-scale news aggregator and personalized feed system.
Requirements:
- Ingest articles from thousands of publishers via RSS/webhooks/APIs; handle rate limits, retries, failures, and deduplication.
- Support near-real-time updates (<5 seconds end-to-end) and backfill for missed content.
- Provide a personalized, ranked feed per user based on signals (followed sources, topics, recency, engagement), with support for experimentation (A/B tests, holdouts).
- Implement search and topic/tag pages; support geo and language filters.
- Ensure idempotent ingestion, content normalization, media handling, and spam/NSFW detection.
- Design storage (hot/cold), indexing, and caching layers; include data models and schemas.
- Describe feed generation strategy (pull vs. push; fan-out-on-write/read) and trade-offs.
- Plan for multi-region availability, eventual consistency, and disaster recovery.
- Estimate capacity (QPS, throughput, data volume), scaling strategy, and cost controls.
- Define external/internal APIs, rate limiting, authentication/authorization, and user privacy/compliance (GDPR/CCPA).
- Include monitoring, alerting, logging, and SLOs; describe rollback and incident mitigation.
- Provide a phased rollout and end-to-end testing plan.
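As one illustration of the idempotent ingestion and deduplication requirements above, here is a minimal in-memory sketch. It is not a prescribed solution: the names `ArticleStore`, `canonical_url`, `content_fingerprint`, and the `TRACKING_PARAMS` list are hypothetical, and a real system would likely back the two lookup tables with a durable key-value store and use near-duplicate detection rather than an exact hash.

```python
import hashlib
import re
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters to strip; not exhaustive.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign",
                   "utm_term", "utm_content", "fbclid", "gclid"}


def canonical_url(url: str) -> str:
    """Normalize a URL so trivially different links map to the same key."""
    parts = urlsplit(url.strip())
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in TRACKING_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def content_fingerprint(title: str, body: str) -> str:
    """Exact-match hash over whitespace-normalized, lowercased text."""
    text = re.sub(r"\s+", " ", f"{title} {body}").strip().lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class ArticleStore:
    """In-memory stand-in for the article table and its dedup indexes."""

    def __init__(self) -> None:
        self._by_url: Dict[str, str] = {}
        self._by_fingerprint: Dict[str, str] = {}

    def ingest(self, publisher_id: str, url: str,
               title: str, body: str) -> Tuple[str, bool]:
        """Return (article_id, created). Replaying the same item is a no-op."""
        url_key = f"{publisher_id}:{canonical_url(url)}"
        if url_key in self._by_url:
            # Same publisher and URL seen before (retry or duplicate delivery).
            return self._by_url[url_key], False

        fingerprint = content_fingerprint(title, body)
        if fingerprint in self._by_fingerprint:
            # Same content under a different URL (for example, syndication).
            article_id = self._by_fingerprint[fingerprint]
            self._by_url[url_key] = article_id
            return article_id, False

        # Deterministic ID, so a retried write produces the same key.
        article_id = hashlib.sha256(url_key.encode("utf-8")).hexdigest()[:16]
        self._by_url[url_key] = article_id
        self._by_fingerprint[fingerprint] = article_id
        return article_id, True
```

Deriving the article ID from the publisher and canonical URL means a retried webhook or a re-polled RSS item resolves to the same record instead of creating a second one, which is the property the idempotency requirement asks for.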
Quick Answer: This question evaluates a candidate's competence in designing large-scale distributed systems, including real-time ingestion, personalized ranking, data modeling and indexing, search, scalability, reliability, security, and operational practices.
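For the feed generation requirement (fan-out-on-write versus fan-out-on-read), one commonly discussed option is a hybrid: push articles into follower inboxes for sources with few followers, and pull from the source's own timeline at read time for sources with many followers. The sketch below assumes this hybrid and is only an illustration; `FeedService`, `push_threshold`, and the in-memory dictionaries are hypothetical stand-ins, and ordering here is by recency alone rather than by a ranking model.

```python
import heapq
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple

Entry = Tuple[int, str]  # (publish timestamp, article_id)


class FeedService:
    """Hybrid fan-out sketch: push for small sources, pull for large ones."""

    def __init__(self, push_threshold: int = 10_000) -> None:
        # Assumed cutoff; the right value depends on measured write cost.
        self.push_threshold = push_threshold
        self.followers: DefaultDict[str, Set[str]] = defaultdict(set)
        self.following: DefaultDict[str, Set[str]] = defaultdict(set)
        self.inbox: DefaultDict[str, List[Entry]] = defaultdict(list)
        self.outbox: DefaultDict[str, List[Entry]] = defaultdict(list)

    def follow(self, user_id: str, source_id: str) -> None:
        self.followers[source_id].add(user_id)
        self.following[user_id].add(source_id)

    def publish(self, source_id: str, article_id: str, ts: int) -> None:
        # Always record in the source's own timeline (used by the pull path).
        self.outbox[source_id].append((ts, article_id))
        # Fan-out-on-write only when the follower count is small enough.
        if len(self.followers[source_id]) <= self.push_threshold:
            for user_id in self.followers[source_id]:
                self.inbox[user_id].append((ts, article_id))

    def feed(self, user_id: str, limit: int = 20) -> List[str]:
        # Start from entries that were pushed at write time.
        candidates = set(self.inbox[user_id])
        # Fan-out-on-read for followed sources above the threshold.
        for source_id in self.following[user_id]:
            if len(self.followers[source_id]) > self.push_threshold:
                candidates.update(self.outbox[source_id])
        newest = heapq.nlargest(limit, candidates)
        return [article_id for _, article_id in newest]


# Usage: with a threshold of 1, "big" (2 followers) is served by pull.
svc = FeedService(push_threshold=1)
svc.follow("u1", "small")
svc.follow("u1", "big")
svc.follow("u2", "big")
svc.publish("small", "a1", ts=1)
svc.publish("big", "a2", ts=2)
svc.publish("small", "a3", ts=3)
print(svc.feed("u1"))
```

The trade-off this illustrates is that pushing keeps reads cheap but makes writes scale with follower count, while pulling keeps writes cheap at the cost of merging timelines on every read. The sketch deliberately ignores cases such as backfilling a new follower's inbox for a small source.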