This question evaluates the ability to design and implement an end-to-end ML pipeline including data ingestion and validation, feature engineering from raw and nested scan events, labeling and censoring strategies, model training and calibration, performance reporting (ROC-AUC and PR-AUC), and handling production constraints like runtime, memory, and reproducibility in the ML System Design domain. It is commonly asked to probe practical application of machine learning and systems engineering, assess trade-off reasoning for scalable and efficient pipelines, and tests primarily hands-on implementation skills with systems-level conceptual understanding rather than purely theoretical knowledge.
You are given a CSV of shipment events with the following columns:
Build a Python pipeline from scratch that:
Constraints:
Login required