Design personalized discovery recommendations
Company: Perplexity
Role: Software Engineer
Category: ML System Design
Difficulty: medium
Interview Round: Onsite
You are designing a personalized "Discovery" page for an AI-powered search/Q&A platform (similar to Perplexity). The Discovery page should show each user a feed of interesting questions, answers, and topics to explore.
**Requirements and goals**:
- The feed must be **personalized per user** based on their historical activity (queries, clicks, likes, time spent, etc.).
- Content candidates include: popular queries, high-quality answers, curated topics, and long-form threads.
- The system must support **millions of users** and **high QPS** (e.g., thousands of requests per second) with a latency budget of ~100–200 ms for generating the feed.
- The system should continuously improve over time using user feedback signals.
- It should handle **cold-start** users (new accounts with little or no history) and **cold-start** items (new content with little or no engagement data).
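As a rough sizing aid for the QPS and latency requirement above, Little's law (average in-flight requests = arrival rate × average latency) gives a feel for the concurrency the serving tier has to sustain. The function name and the example figures (5,000 requests per second, 150 ms) are illustrative assumptions chosen from within the stated ranges, not numbers given in the prompt.

```python
def in_flight_requests(qps: float, latency_ms: float) -> float:
    """Average number of concurrent requests, by Little's law (L = lambda * W)."""
    return qps * latency_ms / 1000.0


# Assumed example load: 5,000 requests/s at 150 ms average latency.
concurrent = in_flight_requests(5000, 150)
print(concurrent)
```

This is only an average. Peak traffic and tail latency would push the real provisioning target higher.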
**Design task**:
Describe how you would design an end-to-end **recommender system** for this Discovery page, including:
1. What data you would collect and how (events, logs, user and item data).
2. The high-level architecture (storage, offline pipelines, online serving components).
3. How you would generate candidates (candidate retrieval) and then rank them (ML models, features).
4. How you would personalize the feed for each user in real time within the latency constraints.
5. How you would measure success and set up feedback loops for model improvement.
6. How you would handle cold-start users/items and basic abuse/spam control.
You do not need to provide implementation code, but you should describe the main components, their responsibilities, and the data flow and ML choices (features, models, training signals) at a reasonable level of detail.
Quick Answer: This question evaluates understanding of recommender systems and personalization, covering skills in data collection and event logging, candidate retrieval, ML ranking, real-time serving, scalability, feedback loops, cold-start handling, and abuse mitigation.
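The candidate retrieval and ranking stages named above are often arranged as a two-stage "retrieve, then rank" pipeline. The following is a minimal sketch of that shape, not a reference solution. Every name in it (`Item`, `User`, `retrieve`, `rank`, `discovery_feed`), the use of cosine similarity over precomputed embeddings, the popularity fallback for cold-start users, and the blend weights are hypothetical choices made for illustration. A production system would likely replace the brute-force scan with an approximate nearest-neighbor index and the linear blend with a learned ranking model.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    item_id: str
    embedding: List[float]
    popularity: float  # assumed prior in [0, 1], e.g. normalized engagement


@dataclass
class User:
    user_id: str
    embedding: Optional[List[float]] = None  # None marks a cold-start user


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(user: User, items: List[Item], k: int) -> List[Item]:
    """Stage 1: cheap candidate retrieval.

    Warm users get the k items nearest to their embedding; cold-start
    users fall back to the k most popular items.
    """
    if user.embedding is None:
        key = lambda it: it.popularity
    else:
        key = lambda it: cosine(user.embedding, it.embedding)
    return sorted(items, key=key, reverse=True)[:k]


def rank(user: User, candidates: List[Item], n: int,
         w_sim: float = 0.8, w_pop: float = 0.2) -> List[Item]:
    """Stage 2: score the shortlist; a linear blend stands in for an ML ranker."""
    def score(it: Item) -> float:
        sim = cosine(user.embedding, it.embedding) if user.embedding is not None else 0.0
        return w_sim * sim + w_pop * it.popularity
    return sorted(candidates, key=score, reverse=True)[:n]


def discovery_feed(user: User, items: List[Item], k: int = 100, n: int = 10) -> List[str]:
    """Return the IDs of the top-n items for this user."""
    return [it.item_id for it in rank(user, retrieve(user, items, k), n)]


items = [
    Item("a", [1.0, 0.0], 0.1),
    Item("b", [0.0, 1.0], 0.9),
    Item("c", [0.7, 0.7], 0.5),
]
warm_user = User("u1", [1.0, 0.0])
cold_user = User("u2")  # no history, so no embedding

print(discovery_feed(warm_user, items, k=3, n=2))
print(discovery_feed(cold_user, items, k=3, n=2))
```

Splitting the work this way keeps the expensive scoring step limited to a small shortlist, which is one common way to stay inside a 100–200 ms budget.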