How do I approach Machine Learning interview questions?

Machine Learning interview questions call for a solid understanding of core concepts, backed by regular practice. PracHub provides solutions with explanations to help you master machine learning interviews.

What difficulty level is this interview question?

This is a Medium difficulty Machine Learning question, commonly asked during Technical Screen rounds at Capital One.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Capital One during technical interviews.

Build and evaluate airline delay prediction model | Capital One Interview Question

Quick Overview

This question evaluates a data scientist's machine learning competencies including target definition, leakage-aware feature engineering, temporal splitting and backtesting, model comparison and hyperparameter tuning, cost-sensitive evaluation, and production constraints such as latency, model size, monitoring, and retraining.

You are given several CSVs for the classic airline delay challenge with columns like flight_date, carrier, flight_num, origin, dest, sched_dep, sched_arr, dep_delay_min, arr_delay_min, distance, aircraft_type, weather_features_*, and holiday_flag.

a) Define a binary target and justify it: e.g., late_arrival = arr_delay_min > 15.

b) Detail a leakage-aware feature set: include weather forecasts at origin/dest, route history aggregates up to t−7 days, time-of-day, day-of-week, month, distance, carrier- and airport-level rolling stats; exclude or properly lag any features that encode future information (e.g., actual arrival times).

c) Specify a time-based split (e.g., train up to 2024-06, validate 2024-07–2024-09, test 2024-10–2025-03), class imbalance handling, and primary metrics (PR-AUC, calibrated Brier).

d) Compare a strong baseline (regularized logistic regression with target encoding) versus gradient boosting (e.g., XGBoost/LightGBM): hyperparameters to search, early stopping, monotonic constraints if used.

e) Explain how you would do rolling-origin cross-validation and backtesting of threshold policies (e.g., proactive swaps or buffers) with cost-sensitive evaluation that prices false negatives at 5× false positives.

f) Productionization: 20 ms/flight latency budget, 50 MB model size, feature store vs on-the-fly aggregation, drift detection, and periodic retraining cadence.

g) Deliverables: reproducible notebook, clean data pipeline, model cards with fairness slices across carriers/airports, and an exec summary with recommended operational policy and estimated ROI.
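As a rough illustration of parts (a), (b), (c) and (e), the sketch below shows one way the target, a lagged route aggregate, a date-based split and a cost-weighted threshold search could fit together. It is a dependency-free Python sketch, not a reference solution. The function names, the dict-per-flight row layout and the threshold grid are assumptions made for illustration; only the column names, the 15-minute cutoff, the 7-day lag and the 5× false-negative cost come from the prompt.

```python
from datetime import timedelta

DELAY_THRESHOLD_MIN = 15      # from the prompt: late_arrival = arr_delay_min > 15
FN_COST, FP_COST = 5.0, 1.0   # from the prompt: a false negative costs 5x a false positive


def late_arrival(arr_delay_min):
    """Part (a): binary target, 1 if the flight arrived more than 15 minutes late."""
    return int(arr_delay_min > DELAY_THRESHOLD_MIN)


def time_split(rows, train_end, valid_end):
    """Part (c): split on flight_date rather than at random, so later
    flights never inform a model that is evaluated on earlier ones."""
    train = [r for r in rows if r["flight_date"] <= train_end]
    valid = [r for r in rows if train_end < r["flight_date"] <= valid_end]
    test = [r for r in rows if r["flight_date"] > valid_end]
    return train, valid, test


def lagged_route_late_rate(rows, origin, dest, as_of, lag_days=7):
    """Part (b): historical late rate for a route, using only flights dated
    on or before as_of - lag_days. Returns None if there is no history."""
    cutoff = as_of - timedelta(days=lag_days)
    history = [
        late_arrival(r["arr_delay_min"])
        for r in rows
        if r["origin"] == origin and r["dest"] == dest and r["flight_date"] <= cutoff
    ]
    return sum(history) / len(history) if history else None


def expected_cost(y_true, scores, threshold):
    """Part (e): total cost of flagging every flight with score >= threshold."""
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return FN_COST * fn + FP_COST * fp


def best_threshold(y_true, scores, grid=None):
    """Pick the lowest-cost threshold from a grid (hypothetical grid: 0.01..0.99)."""
    grid = grid or [i / 100 for i in range(1, 100)]
    return min(grid, key=lambda t: expected_cost(y_true, scores, t))
```

In a rolling-origin backtest, the same `best_threshold` search would be run on each validation fold and the chosen threshold then applied to the following, unseen period. Selecting the threshold on the period being scored would overstate the policy's savings.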
