PracHub

Choose Models for Imbalanced Data and Time-Series Forecasting

Last updated: Mar 29, 2026

Quick Overview

This question tests core ML concepts through four topics: ordinary least squares regression and its assumptions, tree-ensemble trade-offs (gradient-boosted trees, random forests, bagging), training and evaluation when the positive class is only 0.2% of the data, and a forecasting workflow for a series with trend and seasonality. A strong answer states its assumptions, handles edge cases, explains trade-offs, and shows how to validate the result.

  • hard
  • Amazon
  • Machine Learning
  • Data Scientist

Company: Amazon

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen

Scenario

You are asked to choose and tune models for forecasting marketplace demand and detecting fraud in a highly imbalanced dataset.

Question

  1. Explain how ordinary least squares linear regression works and state its key assumptions.
  2. Compare gradient-boosted trees, random forests, and bagging; when would you prefer each?
  3. Your positive class is 0.2% of the data. How would you handle this imbalance during model training and evaluation?
  4. Describe a full workflow for building a time-series forecasting model when seasonality and trend are present.

Hints

Cover data preprocessing, feature engineering, resampling/weighting, proper metrics, and cross-validation for temporal data.

Aug 4, 2025, 10:55 AM

Scenario

You must choose and tune models for (a) forecasting marketplace demand with seasonality and trend, and (b) detecting fraud where the positive class rate is only 0.2%.

Tasks

  1. Ordinary Least Squares (OLS): Explain how OLS linear regression works and list its key assumptions.
  2. Tree Ensembles: Compare gradient-boosted trees, random forests, and bagging. When would you prefer each?
  3. Class Imbalance (0.2% positive): How would you handle this imbalance during model training and evaluation?
  4. Time-Series Forecasting Workflow: Describe a full, practical workflow for modeling a series with trend and seasonality, including preprocessing, feature engineering, appropriate metrics, and time-aware cross-validation.
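
For Task 1, the mechanics of OLS can be sketched in a few lines. OLS chooses the coefficients that minimize the sum of squared residuals; when the design matrix X has full column rank, the minimizer satisfies the normal equations X^T X beta = X^T y. The example below is a minimal sketch under stated assumptions: the data is synthetic, and the seed, the true coefficients (3 and 2), and names such as `beta_hat` are illustrative choices, not part of the prompt.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 3 + 2x + Gaussian noise.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 10.0, size=n)
y = 3.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), x])

# Solve the normal equations X^T X beta = X^T y directly.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Least-squares solver: numerically safer when columns are nearly collinear.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the optimum the residuals are orthogonal to every column of X.
residuals = y - X @ beta_hat
orthogonality = X.T @ residuals

print("intercept, slope:", beta_hat)
print("X^T residuals (should be ~0):", orthogonality)
```

The usual assumptions (linearity in the parameters, independent errors, constant error variance, no perfect multicollinearity, and normal errors if exact small-sample inference is needed) are typically checked on `residuals`, for example by plotting them against fitted values.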

Hint

Address data preprocessing, feature engineering, resampling/weighting, proper metrics for imbalance, and cross-validation suited for temporal data.
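
To make the weighting and metric points in the hint concrete, here is a rough sketch of why accuracy misleads at a 0.2% positive rate, how "balanced" class weights are computed, and how average precision (a summary of the precision-recall curve) behaves for random versus perfect scores. The labels are synthetic, and `balanced_class_weights` and `average_precision` are hypothetical helpers written for this illustration; scikit-learn provides equivalents through `class_weight="balanced"` and `average_precision_score`.

```python
import numpy as np

def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

def average_precision(y_true, scores):
    """Mean precision at the rank of each positive (ties broken by position)."""
    order = np.argsort(-scores, kind="stable")
    y_sorted = y_true[order]
    hits = np.cumsum(y_sorted)
    precision_at_k = hits / np.arange(1, len(y_sorted) + 1)
    return float(precision_at_k[y_sorted == 1].mean())

# Synthetic labels (an assumption): 10,000 rows with a 0.2% positive rate.
rng = np.random.default_rng(0)
y = np.zeros(10_000, dtype=int)
y[rng.choice(y.size, size=20, replace=False)] = 1

# Predicting "not fraud" for everything looks accurate but finds no fraud.
all_negative = np.zeros_like(y)
accuracy = float((all_negative == y).mean())
recall = float(all_negative[y == 1].mean())

weights = balanced_class_weights(y)
ap_random = average_precision(y, rng.random(y.size))
ap_perfect = average_precision(y, y.astype(float))

print(f"accuracy={accuracy:.3f}, recall={recall:.1f}")
print("class weights:", weights)
print(f"AP random={ap_random:.4f}, AP perfect={ap_perfect:.1f}")
```

The all-negative classifier reaches 99.8% accuracy with zero recall, which is why precision-recall metrics are the more informative choice here. If resampling is used instead of weighting, it should be applied to the training folds only, so that validation data keeps the true class ratio.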

Constraints & Assumptions

  • Preserve the scope, facts, inputs, and requested outputs from the prompt above.
  • If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
  • Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

  • What are the task, data shape, labels, constraints, and evaluation metric?
  • Which assumptions sit behind the math or modeling technique you choose?
  • How does the theory connect to practical training, debugging, and deployment?

What a Strong Answer Covers

  • Correct definitions and formulas where the prompt requires them.
  • A practical explanation of how the method behaves on real data.
  • Trade-offs, failure modes, diagnostics, and mitigation strategies.
  • Evaluation choices that match the product or modeling objective.
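
For the forecasting task, one evaluation choice that respects time order is expanding-window (rolling-origin) cross-validation: each fold trains on everything before a cutoff and tests on the block that follows, so no future data leaks into training. The sketch below is one possible implementation; `expanding_window_splits` and its parameters are hypothetical names, and the 100-observation setup is an assumption. scikit-learn's `TimeSeriesSplit` offers a similar scheme.

```python
import numpy as np

def expanding_window_splits(n_samples, n_folds, horizon):
    """Yield (train_idx, test_idx) pairs where training data always precedes the test block."""
    first_test_start = n_samples - n_folds * horizon
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds and horizon")
    for fold in range(n_folds):
        test_start = first_test_start + fold * horizon
        train_idx = np.arange(0, test_start)
        test_idx = np.arange(test_start, test_start + horizon)
        yield train_idx, test_idx

# Hypothetical setup: 100 daily observations, 3 folds, 10-day forecast horizon.
splits = list(expanding_window_splits(n_samples=100, n_folds=3, horizon=10))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  ->  test {test_idx[0]}..{test_idx[-1]}")
```

Shuffled k-fold would mix future observations into the training set and tend to overstate forecast accuracy, which is the main reason to prefer a split like this for temporal data.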

Follow-up Questions

  • How would noisy labels, class imbalance, or distribution shift affect the answer?
  • What would you monitor after deployment?
  • Which baseline would you compare against first?
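
On the baseline question, a common first comparison for a seasonal series is the seasonal-naive forecast, which repeats the last observed season. The following is a minimal sketch on a synthetic, noise-free series with a linear trend and a weekly cycle; the series, the 7-day season length, and the helper names `seasonal_naive_forecast` and `mean_absolute_error` are assumptions made for illustration.

```python
import numpy as np

def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the last observed season forward for `horizon` steps."""
    history = np.asarray(history, dtype=float)
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    repeats = -(-horizon // season_length)  # ceiling division
    return np.tile(last_season, repeats)[:horizon]

def mean_absolute_error(actual, forecast):
    """Average absolute difference between actual and forecast values."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(forecast))))

# Synthetic daily series (an assumption): linear trend plus a weekly cycle, no noise.
t = np.arange(56)
demand = 100.0 + 0.5 * t + 10.0 * np.sin(2.0 * np.pi * t / 7.0)
train, test = demand[:49], demand[49:]

seasonal_pred = seasonal_naive_forecast(train, horizon=len(test), season_length=7)
last_value_pred = np.full(len(test), train[-1])

mae_seasonal = mean_absolute_error(test, seasonal_pred)
mae_last_value = mean_absolute_error(test, last_value_pred)
print(f"seasonal-naive MAE={mae_seasonal:.2f}, last-value MAE={mae_last_value:.2f}")
```

On this series the seasonal-naive error comes entirely from the trend it ignores (0.5 per day over a 7-day lag), so a candidate model that captures trend should beat it under the same time-ordered split; if it does not, the added complexity is hard to justify.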