This question evaluates a candidate's ability to design an end-to-end time-series forecasting pipeline: data cleaning and missing-value handling, anomaly detection and intervention strategies, feature engineering, probabilistic modeling with Unobserved Components Models, rolling-origin backtesting, model comparison, and scaling to multiple related series. It is commonly asked in the Machine Learning / Time-Series Forecasting domain to assess both conceptual understanding and practical application, including model assumptions, uncertainty quantification, evaluation metrics for quantiles, and productionization considerations.
You are given 5 years of daily Amazon retail site traffic counts. Design an end-to-end forecasting pipeline that produces 1-, 7-, and 28-day-ahead forecasts, each reported as 10th/50th/90th percentile quantile forecasts (the median as the point forecast and the 10th-to-90th range as an 80% prediction interval).
Specify and justify the following:
(a) Data cleaning and missing-value strategies
(b) Anomaly detection and treatment
(c) Feature engineering
(d) Model choice: Unobserved Components Model (UCM)
(e) Rolling-origin backtesting
(f) UCM vs. SARIMA comparison
(g) Scaling to hundreds of related series
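As a starting point for part (e), the following is a minimal sketch of rolling-origin backtesting scored with pinball (quantile) loss at the 10th/50th/90th percentiles. It is an illustration under stated assumptions, not a reference answer: all function names are hypothetical, the training window is assumed to expand at each origin, and a seasonal-naive baseline with empirical residual quantiles stands in for the UCM so that the example stays dependency-free.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], q: float) -> float:
    """Average pinball (quantile) loss at quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def rolling_origins(n_obs: int, initial_train: int, horizon: int,
                    step: int) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    origin = initial_train
    while origin + horizon <= n_obs:
        yield range(0, origin), range(origin, origin + horizon)
        origin += step


def empirical_quantile(values: Sequence[float], q: float) -> float:
    """Nearest-rank style empirical quantile (an assumed, simple convention)."""
    ordered = sorted(values)
    idx = int(round(q * (len(ordered) - 1)))
    return ordered[idx]


def seasonal_naive_quantiles(train: Sequence[float], horizon: int, season: int,
                             qs: Sequence[float]) -> Dict[float, List[float]]:
    """Hypothetical baseline: repeat the last season, shift by residual quantiles."""
    resid = [train[t] - train[t - season] for t in range(season, len(train))]
    point = [train[len(train) - season + (h % season)] for h in range(horizon)]
    return {q: [p + empirical_quantile(resid, q) for p in point] for q in qs}


def backtest(series: Sequence[float], initial_train: int, horizon: int, step: int,
             season: int = 7, qs: Sequence[float] = (0.1, 0.5, 0.9)) -> Dict[float, float]:
    """Mean pinball loss per quantile, averaged over all forecast origins."""
    losses: Dict[float, List[float]] = {q: [] for q in qs}
    for train_idx, test_idx in rolling_origins(len(series), initial_train, horizon, step):
        train = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        forecasts = seasonal_naive_quantiles(train, horizon, season, qs)
        for q in qs:
            losses[q].append(pinball_loss(actual, forecasts[q], q))
    return {q: sum(v) / len(v) for q, v in losses.items()}
```

In a fuller answer, `backtest` would be run once per horizon (1, 7, and 28 days), and the same splits would be reused for the UCM and SARIMA candidates so that their pinball losses are directly comparable.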