Build an end-to-end ML pipeline
Company: Amazon
Role: Machine Learning Engineer
Category: ML System Design
Difficulty: hard
Interview Round: Onsite
Given a CSV with shipment events (order_id, origin, destination, ship_date, promised_date, carrier, weight, item_count, scan_events[], delivered_date), build from scratch a Python pipeline that:
(1) loads and validates data; handles missing values, outliers, and time zones;
(2) creates features (e.g., day-of-week, route, carrier stats, dwell times from scan_events);
(3) labels examples as delayed if delivered_date − promised_date > 48 hours (justify how you handle undelivered items and censoring);
(4) trains a baseline model (logistic regression or gradient-boosted trees) with cross-validation; reports ROC-AUC and PR-AUC; addresses class imbalance;
(5) calibrates probabilities and explains top features;
(6) outputs a CSV of top-K at-risk shipments with calibrated probabilities and reason codes.
Optimize for runtime < 5 minutes on 1M rows and memory < 4 GB, and discuss strategies to speed up training/inference and ensure reproducibility.
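The labeling rule in step 3 can be sketched as follows. This is a minimal illustration, not a definitive answer: the function name `label_shipment` and the `as_of` snapshot time (the moment the CSV was extracted) are assumptions that do not appear in the question. One reasonable policy, assumed here, is to treat an undelivered shipment as delayed once the 48-hour grace period has already passed, and as censored (label unknown, excluded from training) otherwise.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Threshold taken from the problem statement: delayed if more than 48 hours late.
DELAY_THRESHOLD = timedelta(hours=48)


def label_shipment(
    promised: datetime,
    delivered: Optional[datetime],
    as_of: datetime,  # hypothetical: time the data snapshot was taken
) -> Optional[int]:
    """Return 1 (delayed), 0 (not delayed), or None (censored, outcome unknown)."""
    # Require timezone-aware timestamps so comparisons across zones are valid.
    for ts in (promised, delivered, as_of):
        if ts is not None and ts.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")

    deadline = promised + DELAY_THRESHOLD
    if delivered is not None:
        return int(delivered > deadline)
    # Undelivered and already past the deadline: delayed no matter when it arrives.
    if as_of > deadline:
        return 1
    # Undelivered but still inside the grace period: the outcome is not yet known.
    return None
```

Dropping every undelivered row instead would discard shipments that are already known to be late, which biases the observed delay rate downward. Rows labeled `None` should be excluded from training and evaluation, although they are still valid candidates for scoring at inference time.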
Quick Answer: This question evaluates the ability to design and implement an end-to-end ML pipeline in the ML System Design domain: data ingestion and validation, feature engineering from raw and nested scan events, labeling and censoring strategies, model training and calibration, performance reporting (ROC-AUC and PR-AUC), and handling production constraints such as runtime, memory, and reproducibility. It is commonly asked to probe practical application of machine learning and systems engineering and to assess trade-off reasoning for scalable, efficient pipelines. It primarily tests hands-on implementation skills backed by systems-level conceptual understanding, rather than purely theoretical knowledge.
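The final output step (top-K at-risk shipments with reason codes) might be sketched as below. Everything named here is an assumption for illustration: the helper names `reason_codes` and `write_top_k`, the output column names, and the choice to derive reason codes from a linear model's per-feature contributions, coefficient × (value − training mean). For gradient-boosted trees, a different attribution method would be needed. `heapq.nlargest` selects the top K in O(n log K) without sorting all rows, which helps with the runtime budget.

```python
import csv
import heapq
from typing import Dict, Iterable, List, TextIO, Tuple

# Hypothetical row shape: (order_id, calibrated probability, feature values).
Row = Tuple[str, float, Dict[str, float]]


def reason_codes(
    x: Dict[str, float],
    coefs: Dict[str, float],
    means: Dict[str, float],
    n_reasons: int = 2,
) -> List[str]:
    """Names of the features that push this row's score up the most."""
    # Contribution of each feature relative to an average shipment (linear model only).
    contribs = {name: coefs[name] * (x[name] - means[name]) for name in coefs}
    top = heapq.nlargest(n_reasons, contribs.items(), key=lambda kv: kv[1])
    # Keep only features that actually increase the predicted risk.
    return [name for name, value in top if value > 0]


def write_top_k(
    rows: Iterable[Row],
    coefs: Dict[str, float],
    means: Dict[str, float],
    k: int,
    out: TextIO,
) -> None:
    """Write the k highest-probability shipments, with reason codes, as CSV."""
    top = heapq.nlargest(k, rows, key=lambda r: r[1])
    writer = csv.writer(out)
    writer.writerow(["order_id", "calibrated_prob", "reason_codes"])
    for order_id, prob, feats in top:
        codes = "|".join(reason_codes(feats, coefs, means))
        writer.writerow([order_id, f"{prob:.4f}", codes])
```

Because `rows` can be any iterable, scores can be streamed in chunks rather than held in memory all at once, which is one way to stay under the 4 GB limit.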