End-to-End Daily Energy Prediction for Commercial Buildings
Context
You are asked to design and justify an end-to-end regression system that predicts the next day's total site-level electricity consumption (kWh) for a portfolio of commercial buildings, using data that spans multiple years (2019–2025). The system should produce forecasts for each building ("site") using:
- Hourly smart-meter reads
- Weather: temperature, humidity, wind
- Calendar and holiday flags
- Building metadata: floor area, vintage (year built), HVAC type
- Optional external features: day-ahead electricity price, outage alerts
Assume you must deliver both accurate forecasts and a robust production pipeline suitable for enterprise operations.
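To make the prediction target concrete, the following is a minimal sketch of one way to derive it from hourly reads, not a prescribed implementation. The function names, the `(site, timestamp, kwh)` tuple layout, and the `min_hours=24` completeness rule are assumptions introduced for illustration; days affected by daylight-saving changes or partial sensor outages would need their own handling.

```python
from collections import defaultdict
from datetime import timedelta


def daily_kwh(hourly_reads, min_hours=24):
    """Sum hourly kWh per (site, calendar day).

    hourly_reads: iterable of (site, datetime, kwh) tuples (assumed layout).
    Days with fewer than min_hours reads are dropped rather than summed,
    so an incomplete day never becomes a misleadingly low target.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for site, ts, kwh in hourly_reads:
        key = (site, ts.date())
        totals[key] += kwh
        counts[key] += 1
    return {k: v for k, v in totals.items() if counts[k] >= min_hours}


def next_day_pairs(daily):
    """Pair each (site, day D) with the consumption observed on day D+1.

    Rows whose next day is missing are skipped, so every returned row
    has a valid target.
    """
    pairs = []
    for (site, day), kwh in sorted(daily.items()):
        target = daily.get((site, day + timedelta(days=1)))
        if target is not None:
            pairs.append(
                {
                    "site": site,
                    "as_of": day,
                    "kwh_today": kwh,
                    "target_kwh_next_day": target,
                }
            )
    return pairs
```

Each returned row is indexed by the `as_of` day on which the forecast would be issued, which keeps the boundary between known inputs and the unknown target explicit.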
Requirements
- Formulate the problem and propose baselines.
- Engineer features for seasonality, interactions (e.g., temperature × HVAC), and occupancy proxies.
- Choose and justify regularization; address multicollinearity; detect and mitigate heteroscedasticity.
- Use time-series-aware cross-validation and avoid leakage; be explicit about any lag/rolling constructs.
- Specify metrics (e.g., RMSE, MAPE) and business-facing SLAs (e.g., billing tolerance bands).
- Handle missing/corrupted sensors and concept drift across years (2019–2025).
- Productionize: outline training and inference pipelines, model versioning, and monitoring (data/feature drift, residual shifts, retraining triggers).
- Explainability for non-ML stakeholders (global vs. local) and safe failure modes.
- Security and privacy constraints for tenant data.
- Propose an ablation plan to quantify incremental value of external features and describe a backtest on 2023–2024 with 2025 held out for final evaluation.
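The time-series cross-validation and backtest requirements above can be illustrated with a short sketch. It is one possible reading, offered under stated assumptions: the function names, the `fold_days` and `gap_days` parameters, and the choice of expanding (rather than sliding) training windows are all hypothetical. The trailing mean further assumes the forecast for day D+1 is issued after day D's reads are complete, so day D itself may be used as an input.

```python
from datetime import date, timedelta


def expanding_window_folds(days, first_test_start, fold_days,
                           gap_days=1, holdout_start=date(2025, 1, 1)):
    """Build (train_days, test_days) folds in chronological order.

    Training always ends before the test block starts, with gap_days
    calendar days skipped in between. Days on or after holdout_start
    are never placed in any fold, so 2025 stays untouched until the
    final evaluation.
    """
    usable = sorted(d for d in days if d < holdout_start)
    folds = []
    start = first_test_start
    while start < holdout_start:
        end = min(start + timedelta(days=fold_days), holdout_start)
        train_cutoff = start - timedelta(days=gap_days)
        train = [d for d in usable if d < train_cutoff]
        test = [d for d in usable if start <= d < end]
        if train and test:
            folds.append((train, test))
        start = end
    return folds


def trailing_mean(daily_by_day, as_of, window=7):
    """Mean consumption over the `window` days ending at as_of, inclusive.

    Only days at or before as_of are read, so the next-day target can
    never leak into the feature. Returns None if any day in the window
    is missing, leaving imputation to an explicit later step.
    """
    values = [daily_by_day.get(as_of - timedelta(days=i)) for i in range(window)]
    if any(v is None for v in values):
        return None
    return sum(values) / window
```

With `first_test_start=date(2023, 1, 1)`, every test block falls inside 2023–2024 while earlier years serve only as training history. The same splitter can be reused for the ablation plan by rerunning each fold with and without the external features.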