This question evaluates a data scientist's mastery of end-to-end supervised regression for time-indexed panel data, including time-series and seasonal feature engineering, weather-dependent modeling, robust baselines, time-aware cross-validation, residual analysis, leakage prevention, and production concerns such as deployment, monitoring, and retraining triggers. It is commonly asked in Machine Learning interviews to assess practical application skills in building production-ready predictive systems rather than purely conceptual understanding, emphasizing model validation, operational monitoring, and handling of panel/time-series challenges.
You need to build and productionize a supervised regression model that predicts next-day daily energy consumption (kWh) for utility clients (a panel of customers/meters). The data is time-indexed and exhibits strong seasonality and weather dependence.
Describe, in order, the exact steps you would take from raw data to a monitored production system, covering:
Login required