Design a hierarchical MF delinquency forecasting system
Company: Freddie Mac
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Onsite
You're tasked with forecasting 90+ day delinquency rates for multifamily loans at both the MSA level and the national aggregate, 1–12 months ahead. Data available: monthly loan performance (loan_id, msa, property_type, unpaid_principal, interest_rate, LTV, DSCR, origination_date, delinquency_status), property financials (NOI, occupancy), macro series (msa_unemployment, CPI, mortgage_rates), and dated policy shocks.
Design a hierarchical forecasting system that:
(a) produces coherent forecasts across levels (compare bottom-up vs MinT reconciliation and state when each is preferred);
(b) engineers leakage-safe features (lags, rolling windows, calendar effects, external regressors), addressing real-time vs revised macro data;
(c) uses expanding-window CV with an embargo while preventing cross-sectional leakage when loans migrate across MSAs;
(d) detects and handles regime shifts (e.g., an abrupt policy change or a COVID-like shock) via changepoint tests or covariate-shift diagnostics, and ensembles across regimes;
(e) compares an XGBoost model with monotonic constraints to a regularized dynamic panel model with fixed effects (FE) and AR(1) errors, stating your tuning, selection, and interpretability strategy;
(f) defines evaluation and monitoring: WAPE by MSA, calibration of delinquency buckets, CRPS for probabilistic outputs, SHAP stability, drift alerts, and retrain cadence.
Be explicit about avoiding lookahead bias at every step.
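To make part (a) concrete, here is a minimal sketch of bottom-up versus MinT reconciliation for a two-level hierarchy (national total over MSAs). It rests on several assumptions that are not in the question: the reconciled quantity is delinquent unpaid principal (rates are not additive across MSAs, so one would reconcile amounts and divide by total balance afterwards), the error covariance `W` is taken as diagonal (the variance-scaling variant of MinT, not the full shrinkage estimator), and all function names and numbers are illustrative.

```python
import numpy as np

def summing_matrix(n_bottom: int) -> np.ndarray:
    """Two-level hierarchy: row 0 is the national total, rows 1..n are MSAs."""
    return np.vstack([np.ones((1, n_bottom)), np.eye(n_bottom)])

def bottom_up(y_hat: np.ndarray, S: np.ndarray) -> np.ndarray:
    """Ignore the national base forecast and sum the MSA forecasts."""
    n_bottom = S.shape[1]
    return S @ y_hat[-n_bottom:]

def mint_reconcile(y_hat: np.ndarray, S: np.ndarray, W: np.ndarray) -> np.ndarray:
    """MinT projection: S (S' W^-1 S)^-1 S' W^-1 y_hat.

    W is the covariance of base-forecast errors. It must be estimated from
    training-window residuals only, otherwise the reconciliation itself
    leaks future information.
    """
    W_inv = np.linalg.inv(W)
    G = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    return S @ (G @ y_hat)

# Illustrative base forecasts of delinquent balance: [national, MSA1, MSA2, MSA3].
# The national forecast (105) disagrees with the MSA sum (95).
S = summing_matrix(3)
y_hat = np.array([105.0, 40.0, 30.0, 25.0])

# Assumed error variances: the national series is less noisy than each MSA.
W = np.diag([4.0, 9.0, 9.0, 9.0])

bu = bottom_up(y_hat, S)
mint = mint_reconcile(y_hat, S, W)
print("bottom-up:", bu)
print("MinT (diagonal W):", mint)
```

Both outputs are coherent by construction. Bottom-up discards the national base forecast entirely, which tends to be reasonable when MSA-level series are well estimated; MinT pulls the MSA forecasts toward the national one in proportion to their assumed error variances, which tends to help when small MSAs are noisy.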
Quick Answer: This question evaluates competency in hierarchical time-series forecasting, leakage-safe feature engineering and validation, regime-shift detection, model comparison and reconciliation, and production-ready evaluation and monitoring for multifamily loan delinquency prediction while explicitly addressing lookahead bias.
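As one illustration of the validation piece, the sketch below generates expanding-window folds with an embargo gap between the end of training and the start of testing. The function name, the default window sizes, and the use of integer month indices are all assumptions made for the example; in practice the embargo would be chosen to cover at least the label look-ahead and the longest feature window.

```python
from typing import Iterator, List, Sequence, Tuple

def expanding_window_splits(
    months: Sequence[int],
    min_train: int = 24,
    horizon: int = 12,
    embargo: int = 3,
    step: int = 6,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_months, test_months) pairs.

    Training always starts at the first month and grows by `step` each fold.
    The `embargo` months right after the training cutoff are dropped, so that
    labels and rolling features near the boundary cannot overlap the test set.
    """
    ordered = sorted(set(months))
    cutoff = min_train
    while cutoff + embargo + horizon <= len(ordered):
        train = ordered[:cutoff]
        test = ordered[cutoff + embargo : cutoff + embargo + horizon]
        yield train, test
        cutoff += step

# Four years of monthly data, indexed 0..47 (hypothetical).
folds = list(expanding_window_splits(range(48)))
for train, test in folds:
    print(f"train {train[0]}..{train[-1]}  test {test[0]}..{test[-1]}")
```

Splitting on time alone does not address loans that migrate across MSAs. One option, under the same assumptions, is to freeze each loan's MSA assignment as of the training cutoff when building features for that fold, so that a later reassignment never informs an earlier forecast.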