PracHub

Design an e-commerce recommendation system

Last updated: Mar 29, 2026

Quick Overview

This question tests large-scale machine learning system design: recommendation algorithms, candidate generation and ranking, feature engineering and feature stores, low-latency serving, and operational reliability. It falls in the ML system design category and calls for both conceptual understanding and practical application. Interviewers commonly use it to assess how well a candidate balances personalization against business metrics (CTR, CVR, revenue), meets scalability and latency targets, and handles data quality, cold start, exploration–exploitation, A/B testing, bias/fairness, and monitoring trade-offs in production.


Company: Amazon

Role: Software Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Technical Screen

Design an Amazon-scale e-commerce product recommendation system. Specify the primary use cases (home feed, product detail, cart, email) and success metrics (CTR, conversion, revenue, diversity, freshness). Propose an architecture with candidate generation and ranking services, including key features (user behavior, item content, graph/co-view signals), model choices (MF, deep retrieval, sequence models, gradient-boosted trees), and a feature store. Detail offline training pipelines, near-real-time updates, online serving, caching, latency/SLA targets, and scale estimates. Address cold start for users/items, long-tail coverage, exploration–exploitation (e.g., bandits), and A/B testing design and guardrails. Discuss data quality, feedback loops, bias/fairness, privacy, abuse prevention, monitoring, and rollback plans.
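The prompt names bandits as one way to handle exploration–exploitation. As a rough illustration only, the sketch below uses Beta-Bernoulli Thompson sampling, which is one common bandit choice and not something the question prescribes. The class name, the `simulate` helper, and the click-through rates in the usage note are all hypothetical.

```python
import random


class BetaBernoulliBandit:
    """Thompson sampling over a fixed set of arms (e.g. candidate items for one slot)."""

    def __init__(self, arm_ids, seed=None):
        # One Beta(1, 1) prior per arm, stored as [alpha, beta].
        self.params = {arm: [1.0, 1.0] for arm in arm_ids}
        self.rng = random.Random(seed)

    def select(self):
        # Draw a plausible click rate for every arm and show the highest draw.
        draws = {arm: self.rng.betavariate(a, b) for arm, (a, b) in self.params.items()}
        return max(draws, key=draws.get)

    def update(self, arm, clicked):
        # A click increments alpha; a non-click increments beta.
        if clicked:
            self.params[arm][0] += 1.0
        else:
            self.params[arm][1] += 1.0

    def mean(self, arm):
        # Posterior mean click rate for the arm.
        a, b = self.params[arm]
        return a / (a + b)


def simulate(true_ctr, rounds=5000, seed=0):
    """Simulate clicks from assumed true CTRs; return the bandit and impressions per arm."""
    bandit = BetaBernoulliBandit(true_ctr.keys(), seed=seed)
    env = random.Random(seed + 1)  # separate RNG for the simulated users
    shown = {arm: 0 for arm in true_ctr}
    for _ in range(rounds):
        arm = bandit.select()
        shown[arm] += 1
        bandit.update(arm, env.random() < true_ctr[arm])
    return bandit, shown
```

Calling `simulate({"a": 0.02, "b": 0.05, "c": 0.20})` should concentrate most impressions on arm `c` while still occasionally showing the others. In a real system the reward would more likely be a delayed, attributed conversion than an immediate simulated click.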

Sep 6, 2025, 12:00 AM

Design an Amazon-Scale E‑Commerce Product Recommendation System

Context

You are designing a large-scale recommendation system that powers multiple user touchpoints in an e‑commerce platform. The system must handle high traffic and a very large catalog, deliver low-latency personalized recommendations, and be robust to data issues and model drift.

Requirements

  1. Primary Use Cases (surfaces)
    • Home feed (personalized recommendations)
    • Product detail page (PDP: similar/related items, cross‑sell, up‑sell)
    • Cart/checkout (complements, bundles, substitutions)
    • Email/notifications (batch recommendations)
  2. Success Metrics
    • Core: CTR, add-to-cart rate, conversion rate (CVR), revenue per mille (RPM), expected GMV, margin-adjusted revenue
    • Experience: diversity, novelty, freshness/recency, coverage (long tail), personalization lift
    • Reliability: latency p50/p95/p99, error rate
    • Guardrails: bounce rate, returns/cancellations, customer satisfaction proxies (NPS/CSAT), seller/category fairness
  3. Architecture
    • Candidate generation and ranking services
    • Key features: user behavior, item content/metadata, graph/co‑view/co‑buy signals, context
    • Model choices: MF/BPR, deep two‑tower retrieval, sequence models, gradient‑boosted trees, DLRM/MLP re‑rankers
    • Feature store (offline + online, point‑in‑time correctness)
  4. Pipelines and Serving
    • Offline training pipelines and batch feature generation
    • Near‑real‑time updates to features and models where appropriate
    • Online serving, caching strategy, latency/SLA targets, and scale estimates
  5. Special Topics
    • Cold start for users and items
    • Long‑tail coverage and catalog exploration
    • Exploration–exploitation (e.g., bandits)
    • A/B testing design and guardrails
  6. Risk, Compliance, and Ops
    • Data quality, feedback loops, debiasing
    • Bias/fairness and privacy
    • Abuse/fraud prevention
    • Monitoring, alerting, and rollback plans
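
To make the candidate generation and ranking split in item 3 concrete, here is a minimal two-stage sketch. It assumes precomputed user and item embeddings and a hand-weighted logistic ranker; brute-force dot products stand in for an approximate nearest-neighbor index, and every item ID, feature name, and weight is hypothetical.

```python
import heapq
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def retrieve(user_vec, item_vecs, k):
    """Stage 1: top-k items by embedding dot product (brute force, not a real ANN index)."""
    return heapq.nlargest(k, item_vecs, key=lambda item_id: dot(user_vec, item_vecs[item_id]))


def rank(candidates, features, weights, bias=0.0):
    """Stage 2: score each retrieved candidate with a logistic model over its features."""
    scored = []
    for item_id in candidates:
        z = bias + sum(weights[name] * value for name, value in features[item_id].items())
        scored.append((1.0 / (1.0 + math.exp(-z)), item_id))
    scored.sort(reverse=True)  # highest predicted probability first
    return [(item_id, score) for score, item_id in scored]


def recommend(user_vec, item_vecs, features, weights, k_retrieve=3, k_final=2):
    """Retrieve a small candidate set cheaply, then rank only those candidates."""
    candidates = retrieve(user_vec, item_vecs, k_retrieve)
    return rank(candidates, features, weights)[:k_final]


# Hypothetical toy data: 2-d embeddings and two ranking features per item.
ITEM_VECS = {"shoe": [1.0, 0.0], "sock": [0.9, 0.1], "lamp": [0.0, 1.0], "desk": [0.1, 0.9]}
FEATURES = {
    "shoe": {"in_stock": 0.0, "recent_ctr": 0.5},
    "sock": {"in_stock": 1.0, "recent_ctr": 0.3},
    "lamp": {"in_stock": 1.0, "recent_ctr": 0.9},
    "desk": {"in_stock": 1.0, "recent_ctr": 0.1},
}
WEIGHTS = {"in_stock": 2.0, "recent_ctr": 1.0}
```

With `recommend([1.0, 0.2], ITEM_VECS, FEATURES, WEIGHTS)`, retrieval keeps `shoe`, `sock`, and `desk`, and the ranker then places `sock` and `desk` ahead of the out-of-stock `shoe`. `lamp` would have scored highest in ranking but never reaches it, which shows why retrieval recall bounds what the ranker can surface.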
