PracHub

Design a Homepage Store Recommender

Last updated: Mar 29, 2026

Quick Overview

This question evaluates system-level machine learning and recommender competencies, including candidate retrieval, filtering and ranking, feature-store design, geospatial caching, low-latency serving, and experimentation infrastructure.

  • hard
  • DoorDash
  • Machine Learning
  • Data Scientist

Company: DoorDash

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite

Related Interview Questions

  • Design a Low-Latency Store Recommender - DoorDash (hard)
  • How would you target promotions to grow consumers? - DoorDash (medium)
  • Design and evaluate an uplift model - DoorDash (hard)
  • Build ETA prediction and simulate impact - DoorDash (hard)
  • Build a late-delivery risk model - DoorDash (hard)
Mar 15, 2026, 12:00 AM

You are designing the homepage store recommendation system for a food-delivery app similar to DoorDash. When a user opens the app, the online request contains very little context: primarily user_id and the user's current latitude/longitude.

The system must return a ranked list of stores for the homepage feed under the following hard constraints:

  • Every recommended store must be within the user's delivery range.
  • Every recommended store must be currently open.
  • The system serves high traffic, so online latency and reliability are critical.
  • Assume each retrieval source has an aggressive timeout budget of about 15 ms.
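
As an illustration only, the two eligibility constraints above (within delivery range, currently open) can be read as a hard filter applied to any candidate set before ranking. The sketch below is a minimal Python version under stated assumptions: the `Store` fields, the per-store `delivery_radius_km`, and the use of straight-line haversine distance are all hypothetical simplifications. A production system would more likely use delivery polygons or drive-time estimates and a live store-status feed.

```python
from dataclasses import dataclass
from datetime import time
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Store:
    # Hypothetical schema, not taken from the question.
    store_id: str
    lat: float
    lon: float
    opens: time
    closes: time
    delivery_radius_km: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def is_open(store: Store, now: time) -> bool:
    """True if `now` falls inside the store's hours, including overnight hours."""
    if store.opens <= store.closes:
        return store.opens <= now < store.closes
    # Overnight window such as 22:00 to 02:00.
    return now >= store.opens or now < store.closes


def eligible_stores(stores, user_lat: float, user_lon: float, now: time):
    """Keep only stores that satisfy both hard constraints."""
    return [
        s
        for s in stores
        if is_open(s, now)
        and haversine_km(user_lat, user_lon, s.lat, s.lon) <= s.delivery_radius_km
    ]
```

Applying the filter after every retrieval channel, rather than trusting each channel to enforce it, is one way to keep a stale cache or index from leaking an ineligible store into the feed.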

Design the end-to-end ML system, and address the following:

  1. Overall architecture
    • How would you structure candidate retrieval, filtering, ranking, and serving?
    • What are the main online and offline components?
  2. Candidate retrieval
    • How would you generate candidates given only user_id and location?
    • What retrieval channels would you include (for example: nearby popular stores, user affinity, cuisine/category similarity, embedding-based retrieval, cold-start fallbacks)?
    • How would you enforce the hard constraints on delivery eligibility and store open status?
  3. Geospatial caching
    • How would you use a geospatial index such as Geohash, H3, or a grid system for caching or precomputing location-based candidate sets?
    • What would the cache key look like?
    • How would you handle cache invalidation when stores open/close or delivery eligibility changes?
  4. Extreme latency constraints
    • If each retrieval path must finish within about 15 ms, how would you optimize fan-out and parallel fetching?
    • How would you degrade gracefully when one or more retrieval sources time out?
  5. Ranking and feature platform
    • How would you build the ranking layer?
    • What objective would you optimize: click-through rate, order conversion, GMV, long-term retention, delivery quality, or some weighted combination?
    • How would you avoid feedback loops, popularity bias, and over-optimization for short-term clicks?
  6. Feature store design
    • Different feature types exist: dense embeddings, numeric features, and categorical features. How would you store them differently at the database layer?
    • How would you key features using user_id, store_id, and possibly user_id + store_id?
    • How would you support hourly offline refreshes while preserving high-concurrency, low-latency online reads?
  7. Model iteration and experimentation
    • Suppose model version V2.0 adds several new features relative to V1.1. How should the infrastructure support multiple model versions at once?
    • How would different A/B test treatments fetch different feature sets or feature configurations safely?
    • What offline and online metrics would you use to evaluate the change?
  8. Real-time versus batch features
    • What are the tradeoffs between real-time features and offline batch-computed features in this system?
    • What failure modes appear when you add real-time features under strict latency requirements, such as timeouts, missing values, training-serving skew, and stability issues?
    • How would you decide which features must be real-time versus batch?
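
To make the cache-key question in item 3 concrete, here is one possible shape, sketched in Python. It is not a prescribed answer. The geohash encoder follows the standard public algorithm (interleaved longitude and latitude bits, base-32 alphabet), while the key layout, the `cand` prefix, the schema tag, and the 15-minute time bucket are all assumptions made for illustration.

```python
from datetime import datetime

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # geohash interleaves bits, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def candidate_cache_key(lat: float, lon: float, now: datetime,
                        precision: int = 6, bucket_minutes: int = 15,
                        schema: str = "v1") -> str:
    """Hypothetical key layout: prefix, schema tag, cell, date, time bucket."""
    cell = geohash_encode(lat, lon, precision)
    bucket = (now.hour * 60 + now.minute) // bucket_minutes
    return f"cand:{schema}:gh{precision}:{cell}:{now:%Y%m%d}:{bucket:02d}"
```

In this sketch, putting a time bucket in the key bounds how long a cached candidate set can outlive a store opening or closing without any explicit invalidation. Event-driven invalidation keyed by cell is an alternative and would not need the bucket. Users in the same cell and bucket share a key, so the cell precision trades cache hit rate against how well the cached set matches each user's actual delivery range.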

Your answer should include system architecture, storage choices, ML tradeoffs, experimentation strategy, and operational safeguards.
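
For the fan-out and graceful-degradation questions in item 4, the following is a minimal Python sketch, assuming each retrieval channel is exposed as an async callable that returns a list of store IDs. The function names, the `sources` mapping, and the `fallback` list are hypothetical. Only the budget of about 15 ms per source comes from the question.

```python
import asyncio

RETRIEVAL_TIMEOUT_S = 0.015  # about 15 ms per retrieval source, per the question


async def fetch_with_timeout(name, fetch, timeout_s):
    """Run one retrieval source; return (name, None) if it exceeds its budget."""
    try:
        return name, await asyncio.wait_for(fetch(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return name, None


async def retrieve_candidates(sources, fallback, timeout_s=RETRIEVAL_TIMEOUT_S):
    """Query all sources in parallel and merge whatever arrives in time.

    `sources` maps a channel name to a zero-argument async callable.
    `fallback` is a precomputed list (for example, nearby popular stores)
    used only when every source times out.
    """
    results = await asyncio.gather(
        *(fetch_with_timeout(name, fetch, timeout_s) for name, fetch in sources.items())
    )
    merged, seen, timed_out = [], set(), []
    for name, store_ids in results:
        if store_ids is None:
            timed_out.append(name)  # record the miss for monitoring
            continue
        for store_id in store_ids:
            if store_id not in seen:  # de-duplicate across channels
                seen.add(store_id)
                merged.append(store_id)
    if not merged:
        merged = list(fallback)
    return merged, timed_out
```

Because the sources run concurrently, the retrieval stage as a whole takes roughly as long as the slowest source that finishes within its budget, not the sum of all sources. Returning the list of timed-out channels lets the caller log degradation and tag the request, which matters later when comparing experiment arms.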
