PracHub

Design end-to-end regression for energy demand

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to design and justify an end-to-end regression system for next-day site-level energy demand forecasting, testing competencies in time-series modeling, feature engineering, regularization, evaluation metrics and SLAs, production ML pipelines, monitoring, explainability, and data privacy.

Company: Amazon

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite

You must build an end-to-end regression system to predict daily site-level electricity consumption (kWh) for a portfolio of commercial buildings. Available data: hourly smart-meter reads; weather (temperature, humidity, wind); calendar/holiday flags; building metadata (area, vintage, HVAC type); and optional external features (day-ahead price, outage alerts). Requirements: (1) Formulate the problem and baseline(s); (2) Engineer features to capture seasonality, interactions (e.g., temperature×HVAC), and occupancy proxies; (3) Choose and justify regularization, address multicollinearity, and detect/mitigate heteroscedasticity; (4) Use time-series-aware cross-validation and avoid leakage (be explicit about any lag/rolling constructs); (5) Specify metrics (RMSE, MAPE) and business-facing SLAs (e.g., billing tolerance bands); (6) Handle missing/corrupted sensors and concept drift across years (2019–2025); (7) Productionize: outline training/inference pipelines, model versioning, and monitoring (data/feature drift, residual distribution shifts, retraining triggers); (8) Provide explainability for non-ML stakeholders (global vs. local) and define safe failure modes; (9) Address security/privacy constraints for tenant data. Finally, propose an ablation plan to quantify the incremental value of external features and describe how you would backtest the full pipeline on 2023–2024 while reserving 2025 for holdout evaluation.

Oct 13, 2025, 9:49 PM

End-to-End Daily Energy Prediction for Commercial Buildings

Context

You are asked to design and justify an end-to-end regression system that predicts each site's total electricity consumption (kWh) for the following day, for a portfolio of commercial buildings with data spanning multiple years (2019–2025). The system should support forecasting for each building ("site") using:

  • Hourly smart-meter reads
  • Weather: temperature, humidity, wind
  • Calendar and holiday flags
  • Building metadata: floor area, vintage (year built), HVAC type
  • Optional external features: day-ahead electricity price, outage alerts

Assume you must deliver both accurate forecasts and a robust production pipeline suitable for enterprise operations.
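
To make the setup concrete, here is a minimal sketch of how the hourly meter reads could be rolled up into the daily kWh target, with lag features and a seasonal-naive baseline built only from past days. It is an illustration under stated assumptions, not a prescribed answer: the function names (`daily_kwh`, `lag_features`, `seasonal_naive`), the 24-reads-per-day completeness rule, and the choice of lags 1 and 7 are all hypothetical, and it assumes the forecast is issued only after the previous day's reads are complete.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def daily_kwh(hourly_reads):
    """Sum (timestamp, kWh) hourly reads into {date: total kWh}.

    Assumption: a day with other than 24 reads is treated as missing rather
    than as a low-demand day. Real data would also need a rule for
    daylight-saving days, which have 23 or 25 local hours.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for ts, kwh in hourly_reads:
        totals[ts.date()] += kwh
        counts[ts.date()] += 1
    return {d: totals[d] for d in totals if counts[d] == 24}


def lag_features(daily, target_day, lags=(1, 7)):
    """Lagged daily totals for target_day, using only days strictly before it.

    Assumption: the forecast is issued after day (target_day - 1) is complete.
    If it is issued earlier, lag 1 is not yet known and would leak.
    """
    return {f"lag_{k}": daily.get(target_day - timedelta(days=k)) for k in lags}


def seasonal_naive(daily, target_day):
    """Baseline: repeat the total from the same weekday one week earlier."""
    return daily.get(target_day - timedelta(days=7))


if __name__ == "__main__":
    # Synthetic data: 8 complete days, then a partial 9th day with 5 reads.
    start = datetime(2024, 1, 1)
    reads = [(start + timedelta(days=d, hours=h), float(d + 1))
             for d in range(8) for h in range(24)]
    reads += [(datetime(2024, 1, 9, h), 1.0) for h in range(5)]

    daily = daily_kwh(reads)
    target = date(2024, 1, 9)
    print(lag_features(daily, target))
    print(seasonal_naive(daily, target))
```

The partial day is dropped from the target series instead of being summed, so a sensor outage does not show up as an apparent drop in demand. Any learned model would need to beat the seasonal-naive baseline to justify its added complexity.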

Requirements

  1. Formulate the problem and propose baselines.
  2. Engineer features for seasonality, interactions (e.g., temperature × HVAC), and occupancy proxies.
  3. Choose and justify regularization; address multicollinearity; detect and mitigate heteroscedasticity.
  4. Use time-series-aware cross-validation and avoid leakage; be explicit about any lag/rolling constructs.
  5. Specify metrics (e.g., RMSE, MAPE) and business-facing SLAs (e.g., billing tolerance bands).
  6. Handle missing/corrupted sensors and concept drift across years (2019–2025).
  7. Productionize: outline training and inference pipelines, model versioning, and monitoring (data/feature drift, residual shifts, retraining triggers).
  8. Explainability for non-ML stakeholders (global vs. local) and safe failure modes.
  9. Security and privacy constraints for tenant data.
  10. Propose an ablation plan to quantify incremental value of external features and describe a backtest on 2023–2024 with 2025 held out for final evaluation.
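
As one possible reading of requirements 4, 5, and 10, the sketch below lays out expanding-window (walk-forward) folds over the 2023–2024 backtest period and computes RMSE, MAPE, and a tolerance-band hit rate. The helper names, the four-fold layout, the `gap_days` parameter, and the 5% band are assumptions made for illustration; the question does not specify them. 2025 is never touched by the folds, so it stays available as the final holdout.

```python
import math
from datetime import date, timedelta


def walk_forward_folds(backtest_start, backtest_end, n_folds, gap_days=0):
    """Return (train_end, test_start, test_end) for consecutive test blocks.

    Each fold trains on all data up to train_end (expanding window) and tests
    on the next block. gap_days skips extra days between train and test, which
    helps when rolling features or longer horizons would straddle the boundary.
    """
    total_days = (backtest_end - backtest_start).days + 1
    fold_len = total_days // n_folds
    folds = []
    for i in range(n_folds):
        test_start = backtest_start + timedelta(days=i * fold_len)
        # The last fold absorbs any remainder days.
        if i == n_folds - 1:
            test_end = backtest_end
        else:
            test_end = test_start + timedelta(days=fold_len - 1)
        train_end = test_start - timedelta(days=gap_days + 1)
        folds.append((train_end, test_start, test_end))
    return folds


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target (kWh)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; zero actuals are skipped (undefined)."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def within_band(actual, predicted, tolerance=0.05):
    """Share of forecasts within +/- tolerance of the actual value.

    The 5% default is a hypothetical billing tolerance, not a given SLA.
    """
    hits = sum(abs(a - p) <= tolerance * abs(a) for a, p in zip(actual, predicted))
    return hits / len(actual)


if __name__ == "__main__":
    for train_end, test_start, test_end in walk_forward_folds(
            date(2023, 1, 1), date(2024, 12, 31), n_folds=4):
        print(f"train <= {train_end} | test {test_start} .. {test_end}")

    actual, predicted = [100.0, 200.0, 400.0], [110.0, 196.0, 400.0]
    print(rmse(actual, predicted), mape(actual, predicted),
          within_band(actual, predicted))
```

For the ablation plan, the same folds could be rerun with and without the external features (day-ahead price, outage alerts), comparing the per-fold metrics so that any gain is measured on identical test windows.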