PracHub

Build and evaluate airline delay prediction model

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's machine learning competencies including target definition, leakage-aware feature engineering, temporal splitting and backtesting, model comparison and hyperparameter tuning, cost-sensitive evaluation, and production constraints such as latency, model size, monitoring, and retraining.


Company: Capital One

Role: Data Scientist

Category: Machine Learning

Difficulty: Medium

Interview Round: Technical Screen

You are given several CSVs for the classic airline delay challenge with columns like flight_date, carrier, flight_num, origin, dest, sched_dep, sched_arr, dep_delay_min, arr_delay_min, distance, aircraft_type, weather_features_*, and holiday_flag.

a) Define a binary target and justify it: e.g., late_arrival = arr_delay_min > 15.

b) Detail a leakage-aware feature set: include weather forecasts at origin/dest, route-history aggregates up to t−7 days, time-of-day, day-of-week, month, distance, and carrier- and airport-level rolling stats; exclude or properly lag any features that encode future information (e.g., actual arrival times).

c) Specify a time-based split (e.g., train up to 2024-06, validate 2024-07–2024-09, test 2024-10–2025-03), class-imbalance handling, and primary metrics (PR-AUC, calibrated Brier score).

d) Compare a strong baseline (regularized logistic regression with target encoding) against gradient boosting (e.g., XGBoost/LightGBM): hyperparameters to search, early stopping, and monotonic constraints if used.

e) Explain how you would do rolling-origin cross-validation and backtest threshold policies (e.g., proactive swaps or buffers) with cost-sensitive evaluation that prices false negatives at 5× false positives.

f) Productionization: a 20 ms/flight latency budget, a 50 MB model-size limit, feature store vs. on-the-fly aggregation, drift detection, and a periodic retraining cadence.

g) Deliverables: a reproducible notebook, a clean data pipeline, model cards with fairness slices across carriers/airports, and an exec summary with the recommended operational policy and estimated ROI.
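As an illustration of parts (a) and (b), here is a minimal pandas sketch. It assumes the CSVs have been loaded into one `flights` DataFrame with the columns listed above; the 30-day window, 7-row lag, and the feature name `route_late_rate_30d_lag7` are illustrative choices, not part of the question:

```python
import pandas as pd

def add_target_and_lagged_features(flights: pd.DataFrame) -> pd.DataFrame:
    """Add the binary target and a leakage-aware, lagged route delay rate."""
    df = flights.sort_values("flight_date").copy()
    # Target: a flight is "late" if it arrives more than 15 min behind schedule.
    df["late_arrival"] = (df["arr_delay_min"] > 15).astype(int)

    # Route-level late rate on a trailing window, then shifted back 7 rows
    # (approximately 7 days if the route flies daily), so no information from
    # the 7 days before the flight -- or the flight itself -- leaks in.
    daily = (
        df.groupby(["origin", "dest", "flight_date"])["late_arrival"]
          .mean()
          .rename("route_daily_late_rate")
          .reset_index()
    )
    daily["route_late_rate_30d_lag7"] = (
        daily.groupby(["origin", "dest"])["route_daily_late_rate"]
             .transform(lambda s: s.rolling(30, min_periods=5).mean().shift(7))
    )
    return df.merge(
        daily[["origin", "dest", "flight_date", "route_late_rate_30d_lag7"]],
        on=["origin", "dest", "flight_date"],
        how="left",
    )
```

The same shift-after-aggregate pattern extends to the carrier- and airport-level rolling stats; the key invariant is that every aggregate is computed strictly from rows dated before the cut-off.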
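For the rolling-origin cross-validation in part (e), the fold boundaries can be generated with plain month arithmetic. An expanding-window sketch (the 12-month minimum training span, 3-month validation span, and 3-month step are assumed defaults):

```python
from datetime import date

def add_months(d: date, n: int) -> date:
    """Return the first day of the month n months after d's month."""
    total = d.year * 12 + (d.month - 1) + n
    return date(total // 12, total % 12 + 1, 1)

def rolling_origin_folds(first_month: date, last_month: date,
                         min_train_months: int = 12,
                         val_months: int = 3,
                         step_months: int = 3):
    """Expanding-window rolling-origin splits: each fold trains on all data
    before train_end and validates on the following val_months."""
    folds = []
    train_end = add_months(first_month, min_train_months)
    while add_months(train_end, val_months) <= last_month:
        folds.append((first_month, train_end, add_months(train_end, val_months)))
        train_end = add_months(train_end, step_months)
    return folds
```

Each `(train_start, train_end, val_end)` triple can then drive both model selection and the backtest of a threshold policy, with the final 2024-10 to 2025-03 window held out entirely as the test set.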
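For the cost-sensitive evaluation in part (e), one way to backtest a threshold policy is to sweep candidate thresholds over held-out predictions and pick the one minimizing expected cost, pricing a missed delay (false negative) at 5× a false alarm (false positive). A dependency-free sketch (the function name is illustrative):

```python
def pick_threshold(y_true, p_hat, fn_cost=5.0, fp_cost=1.0):
    """Return the threshold (and its cost) minimizing fn_cost*FN + fp_cost*FP
    over all distinct predicted probabilities, plus a never-flag threshold."""
    best_t, best_cost = 0.5, float("inf")
    for t in sorted(set(p_hat)) + [1.01]:
        fp = sum(1 for y, p in zip(y_true, p_hat) if y == 0 and p >= t)
        fn = sum(1 for y, p in zip(y_true, p_hat) if y == 1 and p < t)
        cost = fn_cost * fn + fp_cost * fp
        if cost < best_cost:
            best_cost, best_t = cost, t
    return best_t, best_cost
```

For well-calibrated probabilities the cost-minimizing threshold is also known in closed form, fp_cost / (fp_cost + fn_cost) = 1/6 ≈ 0.17, so a large gap between the swept optimum and that value is itself a useful calibration check.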


Related Interview Questions

  • Deep-dive XGBoost handling and overfitting - Capital One (medium)
  • Build House Price Model Responsibly - Capital One (easy)
  • Design robber detection from surveillance video - Capital One (easy)
  • How would you design delay and watchlist models? - Capital One (medium)
  • Explain core ML concepts and lifecycle - Capital One (medium)
Posted: Oct 13, 2025


