Design an email spam detection system
Company: Amazon
Role: Software Engineer
Category: ML System Design
Difficulty: Hard
Interview Round: Technical Screen
Design an end-to-end email spam detection system. Cover: problem definition and labeling; data sources and collection (inbox, user reports, honeypots); feature engineering (content, headers, sender reputation, network signals); model choices and training (baseline rules vs. ML, online learning); serving architecture and latency/throughput constraints; thresholding and calibration; evaluation metrics (precision/recall, ROC and PR curves, cost-weighted metrics); abuse/adversarial defenses and feedback loops; cold start, concept drift, and model retraining cadence; online experimentation (A/B tests, ramp-up, guardrails); monitoring, logging, and rollback strategy; privacy and compliance considerations.
Quick Answer: This question evaluates end-to-end ML system design competencies, covering data collection and labeling, feature engineering, model training and calibration, serving architecture, evaluation metrics, monitoring, adversarial defenses, and privacy considerations for large-scale email spam detection.
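One way to make the "thresholding" and "cost-weighted metrics" items in the prompt concrete is a small sketch of choosing a decision threshold by minimizing total misclassification cost on a labeled validation set. Everything here is illustrative: the function names `confusion_at` and `pick_threshold`, the 10:1 cost ratio, and the toy scores are assumptions, not part of the question or of any particular production system.

```python
from typing import List, Tuple

# Assumed costs (hypothetical): sending a legitimate email to the spam folder
# (false positive) is treated as 10x worse than letting one spam email through
# (false negative).
FP_COST = 10.0
FN_COST = 1.0


def confusion_at(scores: List[float], labels: List[int],
                 threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when emails with score >= threshold are flagged."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pick_threshold(scores: List[float], labels: List[int],
                   fp_cost: float = FP_COST,
                   fn_cost: float = FN_COST) -> Tuple[float, float]:
    """Return (threshold, total_cost) with the lowest cost among candidates.

    Candidates are the observed scores plus +inf (flag nothing). Ties keep
    the lowest threshold because candidates are scanned in ascending order.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    best_threshold, best_cost = candidates[0], float("inf")
    for t in candidates:
        _, fp, fn, _ = confusion_at(scores, labels, t)
        cost = fp_cost * fp + fn_cost * fn
        if cost < best_cost:
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost


if __name__ == "__main__":
    # Toy validation data: model scores and ground-truth labels (1 = spam).
    scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.6]
    labels = [0, 0, 1, 1, 1, 0]
    threshold, cost = pick_threshold(scores, labels)
    tp, fp, fn, tn = confusion_at(scores, labels, threshold)
    print(f"threshold={threshold}, cost={cost}")
    print(f"precision={tp / (tp + fp):.2f}, recall={tp / (tp + fn):.2f}")
```

With asymmetric costs like these, the selected threshold tends to favor precision over recall, which matches the usual assumption that a lost legitimate email is more harmful than a missed spam message. In practice the scores would typically be calibrated first, and the cost ratio would be chosen per product surface rather than fixed.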