Design an email spam detection system

Last updated: Apr 20, 2026

Quick Overview

This interview question tests how you handle ML product requirements, data and labeling, modeling, serving architecture, evaluation, monitoring, and trade-offs. A strong answer states its assumptions, handles edge cases, explains trade-offs, and shows how to validate the result.

Company: Amazon

Role: Software Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Technical Screen

Design an end-to-end email spam detection system. Cover: problem definition and labeling; data sources and collection (inbox, user reports, honeypots); feature engineering (content, headers, sender reputation, network signals); model choices and training (baseline rules vs. ML, online learning); serving architecture and latency/throughput constraints; thresholding and calibration; evaluation metrics (precision/recall, ROC and PR curves, cost-weighted metrics); abuse/adversarial defenses and feedback loops; cold start, concept drift, and model retraining cadence; online experimentation (A/B, ramp, guardrails); monitoring, logging, and rollback strategy; privacy and compliance considerations.

Aug 10, 2025, 12:00 AM

System Design: End-to-End Email Spam Detection

Context

Design an end-to-end system that detects and handles spam emails at scale. Assume you are building for a large consumer email service handling high throughput and strict latency requirements. The design should cover data, ML, serving, experimentation, and operations.

Requirements

  1. Problem Definition and Labeling
    • Define the objective(s) and action outcomes (e.g., block, quarantine, inbox with banner).
    • Labeling sources and policies.
  2. Data Sources and Collection
    • Inbound traffic, user reports, honeypots, abuse teams, reputation feeds.
    • Collection, sampling, retention, and governance.
  3. Feature Engineering
    • Content features (text, URLs, attachments), headers, sender/domain/IP reputation, network/behavioral signals.
  4. Model Choices and Training
    • Baseline rules, supervised ML models, online learning.
    • Handling class imbalance, feature hashing, model calibration.
  5. Serving Architecture and Constraints
    • Placement in the mail pipeline, APIs, latency/throughput targets, caching, fallbacks.
  6. Thresholding and Calibration
    • Score-to-action mapping, per-segment thresholds, calibration methods.
  7. Evaluation Metrics
    • Precision, recall, ROC/PR analysis, and cost-weighted metrics.
  8. Abuse/Adversarial Defenses and Feedback Loops
    • Evasion tactics, spoofing defenses, URL/attachment handling, user feedback integration.
  9. Cold Start, Concept Drift, Retraining Cadence
    • New senders/domains, seasonal drift, automated retraining.
  10. Online Experimentation
    • A/B testing, ramp strategies, guardrails.
  11. Monitoring, Logging, Rollback
    • Real-time and batch monitoring, alerting, safe rollback.
  12. Privacy and Compliance
    • Data minimization, encryption, regional residency, user controls.
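
As one illustration of requirements 6 and 7, the sketch below picks a spam threshold by minimizing a cost-weighted error on labeled validation data, then maps a score to one of the actions named in requirement 1. The cost values, the threshold values, the segment names, and every function and class name here are assumptions chosen for illustration; the prompt does not specify them.

```python
from dataclasses import dataclass

# Assumed costs: sending a legitimate email to spam (false positive) is
# treated as far more harmful than letting one spam email through.
COST_FALSE_POSITIVE = 50.0
COST_FALSE_NEGATIVE = 1.0


def total_cost(scores, labels, threshold):
    """Cost of flagging every email with score >= threshold as spam."""
    cost = 0.0
    for score, is_spam in zip(scores, labels):
        flagged = score >= threshold
        if flagged and not is_spam:
            cost += COST_FALSE_POSITIVE
        elif not flagged and is_spam:
            cost += COST_FALSE_NEGATIVE
    return cost


def pick_threshold(scores, labels, candidates):
    """Return the candidate threshold with the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(scores, labels, t))


@dataclass(frozen=True)
class ActionPolicy:
    """Score-to-action mapping; cutoffs must satisfy block >= quarantine >= banner."""
    block_at: float
    quarantine_at: float
    banner_at: float

    def action(self, score):
        if score >= self.block_at:
            return "block"
        if score >= self.quarantine_at:
            return "quarantine"
        if score >= self.banner_at:
            return "inbox_with_banner"
        return "inbox"


# Hypothetical per-segment policies: stricter blocking for unknown senders.
POLICIES = {
    "known_sender": ActionPolicy(block_at=0.99, quarantine_at=0.90, banner_at=0.60),
    "new_sender": ActionPolicy(block_at=0.95, quarantine_at=0.80, banner_at=0.50),
}


def decide(score, segment):
    """Look up the segment's policy and return the action for this score."""
    return POLICIES[segment].action(score)
```

A call such as `decide(0.85, "new_sender")` would quarantine the message, while the same score for a known sender would only add a banner. The mapping assumes the scores are calibrated probabilities; with uncalibrated scores the cutoffs would need to be re-derived per model version.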

Constraints & Assumptions

  • Preserve the scope, facts, inputs, and requested outputs from the prompt above.
  • If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
  • Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

  • Clarify users, core use cases, read/write patterns, scale, latency, availability, and data retention.
  • State explicit assumptions before making sizing or architecture decisions.
  • Prioritize the functional path first, then address reliability, security, observability, and rollout.
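
To make the "state explicit assumptions before sizing" point concrete, here is a rough back-of-envelope calculation. Every input (daily volume, peak factor, per-instance throughput) is a made-up assumption that a candidate would state aloud and adjust with the interviewer; none of these numbers come from the prompt.

```python
import math

SECONDS_PER_DAY = 86_400


def size_scoring_fleet(emails_per_day, peak_factor, qps_per_instance):
    """Estimate average QPS, peak QPS, and instances needed at peak.

    All three inputs are assumptions to be confirmed with the interviewer.
    """
    avg_qps = emails_per_day / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor
    instances = math.ceil(peak_qps / qps_per_instance)
    return {"avg_qps": avg_qps, "peak_qps": peak_qps, "instances": instances}


# Example inputs (assumed): 10 billion emails/day, 3x peak-to-average ratio,
# and 2,000 scored emails per second per serving instance.
estimate = size_scoring_fleet(
    emails_per_day=10_000_000_000,
    peak_factor=3,
    qps_per_instance=2_000,
)
```

The result is only a starting point: redundancy across zones, headroom for retries, and the cost of heavier features such as attachment scanning would all push the instance count higher.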

What a Strong Answer Covers

  • A scoped requirements summary with concrete non-goals and success metrics.
  • ML-specific data, model, evaluation, serving, and monitoring choices.
  • Reasoned trade-offs between simpler and more scalable designs, including bottlenecks and failure modes.
  • A validation, monitoring, migration, and launch plan appropriate for the risk level.

Follow-up Questions

  • What breaks first at 10x traffic or data volume?
  • How would you degrade gracefully during dependency failures?
  • What metrics and alerts would prove the design is healthy after launch?
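
One possible answer to the graceful-degradation follow-up is to give the model call a deadline and fall back to a rule-based baseline when it times out or fails. The sketch below assumes a 50 ms default deadline, a toy keyword rule, and hypothetical function names; a real system would likely use its RPC framework's deadline mechanism instead of a thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Shared pool so a timed-out call does not block the caller on shutdown.
_POOL = ThreadPoolExecutor(max_workers=4)

# Toy rule baseline (assumed phrases, for illustration only).
SUSPICIOUS_PHRASES = ("wire transfer", "claim your prize")


def rule_score(email_text):
    """Cheap fallback: high score if any suspicious phrase appears."""
    text = email_text.lower()
    return 0.9 if any(p in text for p in SUSPICIOUS_PHRASES) else 0.1


def score_with_fallback(email_text, model_score, timeout_s=0.05):
    """Return (score, source). Use the model if it answers within the
    deadline; otherwise fall back to the rule baseline."""
    future = _POOL.submit(model_score, email_text)
    try:
        return future.result(timeout=timeout_s), "model"
    except Exception:
        # Covers both a missed deadline and an error raised by the model.
        future.cancel()  # no effect if the call is already running
        return rule_score(email_text), "rules"


def slow_model(email_text):
    """Hypothetical stand-in for a model server that misses its deadline."""
    time.sleep(0.3)
    return 0.5
```

Calling `score_with_fallback("Claim your prize now", slow_model)` falls back to the rules because the stand-in model sleeps past the deadline. Returning the source alongside the score lets monitoring track the fallback rate, which is itself a useful health metric after launch.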
