PracHub

Design a Homepage Store Recommender

Last updated: Mar 29, 2026

Quick Overview

This question evaluates system-level machine learning and recommender competencies, including candidate retrieval, filtering and ranking, feature-store design, geospatial caching, low-latency serving, and experimentation infrastructure.


Company: DoorDash

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite



Mar 15, 2026, 12:00 AM

You are designing the homepage store recommendation system for a food-delivery app similar to DoorDash. When a user opens the app, the online request contains very little context: primarily `user_id` and the user's current latitude/longitude.

The system must return a ranked list of stores for the homepage feed under the following hard constraints:

  • Every recommended store must be within the user's delivery range.
  • Every recommended store must be currently open.
  • The system serves high traffic, so online latency and reliability are critical.
  • Assume each retrieval source has an aggressive timeout budget of about 15 ms.
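
The first two constraints act as a hard filter that runs before any ranking. The sketch below is one minimal way to express it, not a reference solution. It assumes a hypothetical `Store` record in which delivery range is a simple radius around the store and open status is a precomputed flag; a production system would more likely use delivery polygons or drive-time estimates.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    # All fields are hypothetical and exist only for this illustration.
    store_id: str
    lat: float
    lon: float
    delivery_radius_km: float  # assumption: delivery range modeled as a radius
    is_open: bool              # assumption: open status is already resolved


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def eligible_stores(stores: list[Store], user_lat: float, user_lon: float) -> list[Store]:
    """Keep only stores that are open and whose delivery radius covers the user."""
    return [
        s
        for s in stores
        if s.is_open
        and haversine_km(user_lat, user_lon, s.lat, s.lon) <= s.delivery_radius_km
    ]
```

Applying this filter after candidates are merged, rather than inside each retrieval channel, keeps the guarantee in one place even when a channel serves slightly stale cached results.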

Design the end-to-end ML system, and address the following:

  1. Overall architecture
    • How would you structure candidate retrieval, filtering, ranking, and serving?
    • What are the main online and offline components?
  2. Candidate retrieval
    • How would you generate candidates given only `user_id` and location?
    • What retrieval channels would you include (for example: nearby popular stores, user affinity, cuisine/category similarity, embedding-based retrieval, cold-start fallbacks)?
    • How would you enforce the hard constraints on delivery eligibility and store open status?
  3. Geospatial caching
    • How would you use a geospatial index such as Geohash, H3, or a grid system for caching or precomputing location-based candidate sets?
    • What would the cache key look like?
    • How would you handle cache invalidation when stores open/close or delivery eligibility changes?
  4. Extreme latency constraints
    • If each retrieval path must finish within about 15 ms, how would you optimize fan-out and parallel fetching?
    • How would you degrade gracefully when one or more retrieval sources time out?
  5. Ranking and feature platform
    • How would you build the ranking layer?
    • What objective would you optimize: click-through rate, order conversion, GMV, long-term retention, delivery quality, or some weighted combination?
    • How would you avoid feedback loops, popularity bias, and over-optimization for short-term clicks?
  6. Feature store design
    • Different feature types exist: dense embeddings, numeric features, and categorical features. How would you store them differently at the database layer?
    • How would you key features using `user_id`, `store_id`, and possibly `user_id + store_id`?
    • How would you support hourly offline refreshes while preserving high-concurrency, low-latency online reads?
  7. Model iteration and experimentation
    • Suppose model version V2.0 adds several new features relative to V1.1. How should the infrastructure support multiple model versions at once?
    • How would different A/B test treatments fetch different feature sets or feature configurations safely?
    • What offline and online metrics would you use to evaluate the change?
  8. Real-time versus batch features
    • What are the tradeoffs between real-time features and offline batch-computed features in this system?
    • What failure modes appear when you add real-time features under strict latency requirements, such as timeouts, missing values, training-serving skew, and stability issues?
    • How would you decide which features must be real-time versus batch?
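
For the geospatial caching topic, one possible shape for a cache key is sketched below. It uses a plain latitude/longitude grid rather than Geohash or H3 so that it needs no external library. The key layout, the `cell_deg` cell size, and the `version` prefix are all assumptions made for illustration.

```python
import math


def grid_cell(lat: float, lon: float, cell_deg: float = 0.01) -> tuple[int, int]:
    """Map a point to an integer (row, col) cell on a fixed-size lat/lon grid."""
    return math.floor(lat / cell_deg), math.floor(lon / cell_deg)


def candidate_cache_key(
    lat: float,
    lon: float,
    channel: str,
    cell_deg: float = 0.01,
    version: str = "v1",
) -> str:
    """Build a key for a precomputed candidate set (hypothetical key layout)."""
    row, col = grid_cell(lat, lon, cell_deg)
    return f"cand:{version}:{channel}:{row}:{col}"
```

Nearby users map to the same cell and therefore share one cached candidate set. A design choice worth discussing in an answer is whether open status lives inside the cached value, which then needs invalidation or a short TTL, or is left out and enforced at read time by a separate filter.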

Your answer should include system architecture, storage choices, ML tradeoffs, experimentation strategy, and operational safeguards.
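
As an illustration of the latency topic, the sketch below fans out to several retrieval sources in parallel under one shared time budget and keeps whatever finished in time. It uses Python's standard `asyncio` library; the source names and the 15 ms default mirror the prompt, while the function names and the merge policy are assumptions.

```python
import asyncio
from typing import Awaitable, Callable

Source = Callable[[], Awaitable[list[str]]]


async def fetch_with_budget(
    sources: dict[str, Source], budget_s: float = 0.015
) -> tuple[dict[str, list[str]], list[str]]:
    """Run all sources concurrently; return finished results and timed-out names."""
    tasks = {asyncio.create_task(fn()): name for name, fn in sources.items()}
    done, pending = await asyncio.wait(list(tasks), timeout=budget_s)
    for task in pending:
        task.cancel()  # stop work nobody will read
    results = {
        tasks[task]: task.result() for task in done if task.exception() is None
    }
    timed_out = sorted(tasks[task] for task in pending)
    return results, timed_out


def merge_candidates(results: dict[str, list[str]], fallback: list[str]) -> list[str]:
    """Deduplicate store ids across channels; use the fallback if nothing returned."""
    merged: list[str] = []
    seen: set[str] = set()
    for store_ids in results.values():
        for store_id in store_ids:
            if store_id not in seen:
                seen.add(store_id)
                merged.append(store_id)
    return merged or list(fallback)
```

A slow channel costs recall here but does not add latency, because the request never waits past the budget. The fallback list stands in for something like a precomputed set of nearby popular stores, which is an assumption rather than part of the prompt.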
