PracHub

Design a Low-Latency Store Recommender

Last updated: Apr 2, 2026

Quick Overview

This question evaluates system design and machine learning competencies for real-time, low-latency store recommendation systems, including retrieval, pre-ranking and ranking, geospatial caching, feature serving, model versioning, and experimentation.

Company: DoorDash

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite


Jan 13, 2026, 12:00 AM

You are designing the home-page store recommendation system for a food delivery app such as DoorDash.

A request contains very little context: primarily user_id and the user's current latitude/longitude. The system must return a ranked list of stores for the app home page.

Hard constraints

  • Recommended stores must be within the deliverable area for the user.
  • Recommended stores must be open at request time.
  • The system is latency-sensitive and powers the home page.
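
The two eligibility constraints above can be applied as a hard filter before any scoring. The following is a minimal sketch, not a production design: the `Store` fields, the circular `delivery_radius_km`, and the minutes-after-midnight opening hours are all assumptions made for illustration (real delivery areas are often polygons, and real hours depend on time zones and holidays).

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Store:
    # All fields are hypothetical; the question does not define a store schema.
    store_id: str
    lat: float
    lon: float
    delivery_radius_km: float  # assumption: circular delivery area
    open_minute: int           # minutes after local midnight
    close_minute: int          # smaller than open_minute for overnight hours


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_open(store, minute_of_day):
    """True if the store is open, including windows that wrap past midnight."""
    if store.open_minute <= store.close_minute:
        return store.open_minute <= minute_of_day < store.close_minute
    return minute_of_day >= store.open_minute or minute_of_day < store.close_minute


def eligible(stores, user_lat, user_lon, minute_of_day):
    """Keep only stores that are open and whose delivery radius covers the user."""
    return [
        s for s in stores
        if is_open(s, minute_of_day)
        and haversine_km(user_lat, user_lon, s.lat, s.lon) <= s.delivery_radius_km
    ]
```

Running this filter after retrieval, rather than trusting cached candidate lists, is one way to keep the constraints hard even when cached data is slightly stale.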

Product goal

Design a recommendation system that maximizes long-term business value, such as orders or contribution profit, while balancing user engagement, relevance, freshness, and system latency. Discuss what primary metric you would optimize for and what guardrail metrics you would monitor.

What to cover

  1. End-to-end architecture
    • Describe the online request flow from request intake to final ranked list.
    • Explain how you would structure retrieval, pre-ranking, ranking, and post-processing.
    • Discuss how you would handle cold-start users, sparse geographies, and new stores.
  2. Retrieval design
    • Propose multiple candidate-generation strategies, given that the online inputs are limited.
    • Explain how you would ensure all candidates satisfy delivery-range and open-now constraints.
    • Discuss how you would merge, deduplicate, and budget candidates across retrieval channels.
  3. Location-aware caching
    • How would you use a geospatial indexing scheme such as Geohash or H3 to support caching?
    • Would you precompute popular stores per grid cell offline?
    • What cache key, TTL, and invalidation strategy would you use, especially when store availability and open status change frequently?
  4. Extreme latency constraint
    • Suppose each retrieval route has a very strict timeout budget, for example 15 ms.
    • How would you optimize parallel fan-out, partial results, fallback behavior, and service-level reliability under such a tight budget?
  5. Ranking and feature platform
    • Design the feature-serving infrastructure for different feature types: embeddings, numeric features, and categorical features.
    • Explain how you would store and serve features keyed by store_id, user_id, and possibly user-store pairs.
    • Assume offline feature pipelines refresh hourly. How would you support high-concurrency online inference while keeping features reasonably fresh and point-in-time correct?
  6. Model versioning and experimentation
    • Model version V2.0 adds new features relative to V1.1.
    • How would your infrastructure support multiple model versions without breaking online serving?
    • How would you configure different treatment groups in an A/B test to fetch different feature sets or model artifacts?
    • What experiment design choices would you make, including randomization unit, success metrics, guardrails, and failure detection?
  7. Real-time versus batch features
    • Discuss the trade-offs between adding real-time features and relying on offline batch features.
    • Under strict latency requirements, what can go wrong if you overuse real-time features?
    • How would you design graceful degradation for feature timeouts, missing values, or upstream instability?
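
As one illustration of the parallel fan-out and partial-result behavior raised in item 4, together with the merge and deduplication step in item 2, the sketch below runs every retrieval route concurrently, keeps whatever has returned within the timeout, and cancels the rest. It is a sketch under stated assumptions: the route names, the `fan_out` helper, and the idea of merging by maximum score are hypothetical, and scores from different channels are not directly comparable in practice without calibration.

```python
import asyncio


async def fan_out(routes, timeout_s=0.015, budget=200):
    """Run all retrieval routes in parallel and merge what finishes in time.

    routes: dict mapping a route name to a zero-argument coroutine function
            that returns a list of (store_id, score) pairs.
    Returns (ranked_store_ids, names_of_routes_that_timed_out).
    """
    if not routes:
        return [], []
    tasks = {asyncio.create_task(fn()): name for name, fn in routes.items()}
    done, pending = await asyncio.wait(tasks, timeout=timeout_s)

    # Routes that missed the budget are cancelled; the response is built
    # from partial results instead of waiting for them.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    best = {}
    for task in done:
        if task.exception() is not None:
            continue  # a failed route contributes nothing
        for store_id, score in task.result():
            # Deduplicate across channels by keeping the highest score.
            if store_id not in best or score > best[store_id]:
                best[store_id] = score

    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))[:budget]
    timed_out = sorted(tasks[t] for t in pending)
    return [store_id for store_id, _ in ranked], timed_out


# Hypothetical retrieval routes used only to exercise the helper.
async def popular_nearby():
    return [("s1", 0.9), ("s2", 0.5)]


async def reorder_history():
    return [("s2", 0.8), ("s3", 0.4)]


async def slow_embedding_search():
    await asyncio.sleep(1.0)  # simulates a route that blows the budget
    return [("s4", 1.0)]
```

A call such as `asyncio.run(fan_out({"popular": popular_nearby, "slow": slow_embedding_search}))` returns the candidates from the fast route and reports the slow route as timed out, which is also a natural signal to log for service-level monitoring.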

Your answer should explicitly address modeling trade-offs, latency and reliability constraints, experimentation, and common production pitfalls such as training-serving skew, missing features, and marketplace side effects.
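
One possible way to let several model versions coexist, and to degrade gracefully when a feature is missing or times out, is to declare each version's feature list with explicit defaults and resolve the version from the experiment arm. The sketch below assumes this approach; the feature names, default values, and arm names are hypothetical. The defaults would need to match the imputation used at training time, otherwise the fallback itself introduces training-serving skew.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSpec:
    name: str
    default: float  # should mirror the training-time imputation value


# Hypothetical feature sets: V2.0 adds one feature relative to V1.1.
MODEL_FEATURES = {
    "v1.1": (
        FeatureSpec("store_rating", 4.0),
        FeatureSpec("user_order_count_30d", 0.0),
    ),
    "v2.0": (
        FeatureSpec("store_rating", 4.0),
        FeatureSpec("user_order_count_30d", 0.0),
        FeatureSpec("store_prep_time_min", 15.0),
    ),
}

# Hypothetical experiment configuration: each arm pins a model version.
EXPERIMENT_ARMS = {"control": "v1.1", "treatment": "v2.0"}


def build_vector(arm, fetched):
    """Assemble the feature vector for the model version assigned to `arm`.

    fetched: dict of feature name -> value; a missing key or a None value
             stands for a feature that timed out or was never written.
    Returns (model_version, vector, names_of_features_that_fell_back).
    """
    version = EXPERIMENT_ARMS[arm]
    vector, missing = [], []
    for spec in MODEL_FEATURES[version]:
        value = fetched.get(spec.name)
        if value is None:
            value = spec.default
            missing.append(spec.name)
        vector.append(float(value))
    return version, vector, missing
```

Because each version only reads the features it declares, V1.1 keeps serving unchanged while V2.0 is ramped, and the returned list of fallback features can feed a missing-rate guardrail per experiment arm.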

