Predicting Airline Departure Delays — Technical Screen Prompt
Context
You have 15 minutes to review a slide deck on predicting airline departure delays, then 20 minutes to present your assessment to the Director of Operations. The current analysis:
- Uses linear regression with the date encoded as an integer.
- Includes plots that mix weekdays and months without controlling for seasonality.
Assume you have historical flight-level data (schedules, realized times), route metadata, and access to weather data. The goal is to improve predictive accuracy and translate insights into operational actions.
Tasks
- Critique the current feature engineering, including:
  - How to encode time (date as categorical or cyclical features).
  - Weather joins and availability.
  - Route- and airport-level effects.
  - Leakage risks (e.g., using wheels-off time).
- Propose a modeling approach, covering:
  - Whether to predict continuous delay (regression) or classify delay > 15 minutes (or both).
  - Appropriate metrics (e.g., RMSE/MAE for regression; AUC-PR/cost-sensitive metrics for classification).
  - Cross-validation that respects time order.
- Identify two charts to replace from the current deck, explain the replacements (how and why), and provide a concise narrative for a non-technical audience that links insights to actions (e.g., crew buffers by route and time-of-day).
- Deliver a concrete recommendation to reduce compensation payouts by 5% without increasing cancellations, and define an experiment or KPI plan to verify impact within four weeks.
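
For reference, two of the ideas named in the tasks above, cyclical time encoding and cross-validation that respects time order, can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed solution: the function names, the choice of sine/cosine pairs, and the expanding-window scheme are all assumptions made for the example, and it presumes the rows are already sorted by scheduled departure time.

```python
import math
from datetime import datetime


def cyclical_time_features(ts: datetime) -> dict:
    """Encode hour-of-day, day-of-week, and month as sine/cosine pairs.

    Hypothetical helper: unlike an integer date, this keeps 23:30 close
    to 00:30 and December close to January.
    """
    def sin_cos(value: float, period: float) -> tuple:
        angle = 2.0 * math.pi * value / period
        return math.sin(angle), math.cos(angle)

    hour_sin, hour_cos = sin_cos(ts.hour + ts.minute / 60.0, 24.0)
    dow_sin, dow_cos = sin_cos(ts.weekday(), 7.0)      # Monday = 0
    month_sin, month_cos = sin_cos(ts.month - 1, 12.0)  # January = 0
    return {
        "hour_sin": hour_sin, "hour_cos": hour_cos,
        "dow_sin": dow_sin, "dow_cos": dow_cos,
        "month_sin": month_sin, "month_cos": month_cos,
    }


def expanding_window_splits(n_rows: int, n_folds: int, min_train: int):
    """Yield (train_indices, test_indices) for rows sorted by time.

    Hypothetical helper: each fold trains only on rows that precede its
    test block, so no future flights leak into training.
    """
    if min_train < 1 or min_train >= n_rows:
        raise ValueError("min_train must be in [1, n_rows)")
    fold_size = (n_rows - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough rows for the requested folds")
    for k in range(n_folds):
        test_start = min_train + k * fold_size
        # The last fold absorbs any remainder rows.
        test_end = test_start + fold_size if k < n_folds - 1 else n_rows
        yield range(0, test_start), range(test_start, test_end)


if __name__ == "__main__":
    print(cyclical_time_features(datetime(2024, 1, 1, 23, 30)))
    for train_idx, test_idx in expanding_window_splits(10, 3, 4):
        print(list(train_idx), list(test_idx))
```

The same time-ordering discipline applies to features: anything recorded after the prediction time, such as wheels-off time, would have to be excluded before either helper is used.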