PracHub

Build an end-to-end ML pipeline

Last updated: Mar 29, 2026

Quick Overview

This question evaluates the ability to design and implement an end-to-end ML pipeline: data ingestion and validation, feature engineering from raw and nested scan events, labeling and censoring strategies, model training and calibration, performance reporting (ROC-AUC and PR-AUC), and production constraints such as runtime, memory, and reproducibility. It is commonly asked to probe the practical application of machine learning and systems engineering, to assess trade-off reasoning for scalable and efficient pipelines, and to test hands-on implementation skills backed by systems-level understanding rather than purely theoretical knowledge.

Company: Amazon

Role: Machine Learning Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Onsite

Given a CSV with shipment events (order_id, origin, destination, ship_date, promised_date, carrier, weight, item_count, scan_events[], delivered_date), build from scratch a Python pipeline that: (1) loads and validates data, handling missing values, outliers, and time zones; (2) creates features (e.g., day-of-week, route, carrier stats, dwell times from scan_events); (3) labels examples as delayed if delivered_date − promised_date > 48 hours (justify how you handle undelivered items and censoring); (4) trains a baseline model (logistic regression or gradient-boosted trees) with cross-validation, reports ROC-AUC and PR-AUC, and addresses class imbalance; (5) calibrates probabilities and explains top features; (6) outputs a CSV of top-K at-risk shipments with calibrated probabilities and reason codes. Optimize for runtime < 5 minutes on 1M rows and memory < 4 GB, and discuss strategies to speed up training/inference and ensure reproducibility.

Amazon | Jul 17, 2025, 12:00 AM | Machine Learning Engineer | Onsite | ML System Design

ML System Design: Shipment Delay Risk Scoring From a Single CSV

You are given a CSV of shipment events with the following columns:

  • order_id (string)
  • origin (string)
  • destination (string)
  • ship_date (string/datetime)
  • promised_date (string/datetime)
  • carrier (string)
  • weight (float)
  • item_count (int)
  • scan_events (JSON array encoded as string; each element typically has a timestamp and status)
  • delivered_date (string/datetime; may be null if undelivered)
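
Because scan_events arrives as a JSON string and the date columns may mix time zones, one possible starting point is a small parser that normalizes timestamps to UTC and derives dwell-time features per row. The sketch below is only illustrative: the `timestamp` key name, the rule that offset-less timestamps are treated as UTC, and the helper names `parse_ts` and `dwell_features` are assumptions, not part of the question. It uses only the standard library; at 1M rows a vectorized or chunked variant would likely be needed.

```python
import json
from datetime import datetime, timezone


def parse_ts(value):
    """Return a timezone-aware UTC datetime, or None if the value is unusable."""
    if not isinstance(value, str) or not value.strip():
        return None
    text = value.strip()
    if text.endswith("Z"):
        # Older Python versions do not accept the "Z" suffix in fromisoformat.
        text = text[:-1] + "+00:00"
    try:
        dt = datetime.fromisoformat(text)
    except ValueError:
        return None
    if dt.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def dwell_features(scan_events_raw):
    """Summarize gaps between consecutive scans, in hours."""
    try:
        events = json.loads(scan_events_raw)
    except (TypeError, ValueError):
        events = []  # missing or malformed JSON is treated as "no scans"
    if not isinstance(events, list):
        events = []
    # Assumption: each event is a dict with a "timestamp" key.
    parsed = (parse_ts(e.get("timestamp")) for e in events if isinstance(e, dict))
    times = sorted(t for t in parsed if t is not None)
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
    return {
        "n_scans": len(times),
        "max_dwell_h": max(gaps) if gaps else 0.0,
        "mean_dwell_h": sum(gaps) / len(gaps) if gaps else 0.0,
    }
```

Malformed rows degrade to zero-valued features here rather than raising, which keeps the pipeline running; counting such rows separately during validation would make the data-quality issue visible.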

Build a Python pipeline from scratch that:

  1. Loads and validates data, handling missing values, outliers, and time zones.
  2. Creates features (e.g., day-of-week, route, carrier stats via target encoding, and dwell times from scan_events).
  3. Labels examples as delayed if delivered_date − promised_date > 48 hours. Justify and implement how you handle undelivered items and censoring.
  4. Trains a baseline model (logistic regression or gradient-boosted trees) with cross-validation; reports ROC-AUC and PR-AUC; addresses class imbalance.
  5. Calibrates probabilities and explains top features.
  6. Outputs a CSV of top-K at-risk shipments with calibrated probabilities and reason codes.
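
For step 3, one defensible way to treat undelivered shipments is as right-censored observations relative to the time the data was extracted. The sketch below assumes a hypothetical snapshot time `as_of` (not given in the question) and timezone-aware UTC datetimes: a delivered shipment is labeled from its actual delivery time, an undelivered shipment that is already more than 48 hours past its promise is labeled delayed (any eventual delivery would exceed the threshold), and an undelivered shipment still inside the window gets no label and is excluded from training but can still be scored.

```python
from datetime import datetime, timedelta, timezone

DELAY_THRESHOLD = timedelta(hours=48)


def label_shipment(promised, delivered, as_of):
    """Return 1 (delayed), 0 (not delayed), or None (censored / unusable).

    All arguments are timezone-aware datetimes; `delivered` may be None.
    `as_of` is the assumed data snapshot time (hypothetical parameter).
    """
    if promised is None:
        return None  # cannot define the label without a promise date
    if delivered is not None:
        # Strictly greater than 48 hours, as stated in the question.
        return int(delivered - promised > DELAY_THRESHOLD)
    if as_of - promised > DELAY_THRESHOLD:
        # Undelivered and already past the threshold: a known positive.
        return 1
    # Undelivered but still within the window: outcome unknown (censored).
    return None
```

Dropping the censored rows from training avoids labeling them as on-time by default, which would bias the positive rate downward for the most recent shipments.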

Constraints:

  • Optimize for runtime < 5 minutes on 1M rows and memory < 4 GB on CPU.
  • Discuss strategies to speed up training/inference and ensure reproducibility.
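
On the reproducibility point, one option (an assumption of this sketch, not something the question prescribes) is to assign cross-validation folds by hashing order_id instead of drawing random numbers, so the same row lands in the same fold on every run and every machine. The function name `fold_for` and the `salt` parameter are hypothetical. Python's built-in `hash()` is deliberately avoided because string hashes are randomized per process by default.

```python
import hashlib


def fold_for(order_id, n_folds=5, salt="v1"):
    """Map an order_id to a stable fold index in [0, n_folds)."""
    key = f"{salt}:{order_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a fold.
    return int.from_bytes(digest[:8], "big") % n_folds


# Example: fold assignments for a few illustrative order ids.
folds = {oid: fold_for(oid) for oid in ("ORD-1", "ORD-2", "ORD-3")}
```

Changing `salt` produces a different but equally stable partition. For data with strong time dependence, a split by ship_date may be the more appropriate evaluation scheme; hashing only addresses determinism.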

