Forecasting 90+ Day Delinquency Rates for Multifamily Loans: Hierarchical, Leakage-Safe System Design
Context
You need to forecast 90+ day delinquency rates for multifamily loans 1–12 months ahead at both the MSA level and the national aggregate. Forecasts must be coherent: the MSA-level forecasts must aggregate to the national forecast (for a rate, as a UPB-weighted average rather than a simple sum). Data are monthly and include:
- Loan performance: loan_id, msa, property_type, unpaid_principal (UPB), interest_rate, LTV, DSCR, origination_date, delinquency_status.
- Property financials: NOI, occupancy (note reporting lags).
- Macro series: msa_unemployment, CPI, mortgage_rates (potentially revised ex post; consider real-time vintages).
- Dated policy shocks with known effective dates.
Define the target as the UPB-weighted 90+ day delinquency rate at each horizon.
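As a minimal sketch of this target definition, the snippet below computes the UPB-weighted 90+ day delinquency rate per MSA and nationally for a single monthly snapshot; the horizon-h target is then this rate observed h months after the forecast origin. The field names come from the data description above. The encoding of delinquency_status as integer days past due, the helper name upb_weighted_dq_rate, and the sample values are assumptions made for illustration only.

```python
from collections import defaultdict

def upb_weighted_dq_rate(loans, by="msa"):
    """Return {group: delinquent UPB / total UPB} for one monthly snapshot.

    ASSUMPTION: delinquency_status holds days past due as an integer;
    the source does not specify the encoding.
    """
    total = defaultdict(float)
    delinquent = defaultdict(float)
    for loan in loans:
        # by=None pools every loan into a single national group.
        key = loan[by] if by is not None else "national"
        upb = loan["unpaid_principal"]
        total[key] += upb
        if loan["delinquency_status"] >= 90:
            delinquent[key] += upb
    return {k: delinquent[k] / total[k] for k in total if total[k] > 0}

# Hypothetical snapshot for a single month (illustrative values only).
snapshot = [
    {"loan_id": "A", "msa": "X", "unpaid_principal": 100.0, "delinquency_status": 0},
    {"loan_id": "B", "msa": "X", "unpaid_principal": 300.0, "delinquency_status": 120},
    {"loan_id": "C", "msa": "Y", "unpaid_principal": 600.0, "delinquency_status": 30},
]

msa_rates = upb_weighted_dq_rate(snapshot, by="msa")
national = upb_weighted_dq_rate(snapshot, by=None)["national"]

# Coherence check: the national rate equals the UPB-weighted average
# of the MSA rates, not their sum or simple mean.
msa_upb = {"X": 400.0, "Y": 600.0}
recombined = sum(msa_rates[m] * msa_upb[m] for m in msa_rates) / sum(msa_upb.values())
```

Because rates do not add across MSAs, one option is to forecast and reconcile delinquent UPB and total UPB (which do sum) and derive rates afterwards; this is a design choice rather than a requirement of the task.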
Task
Design a hierarchical forecasting system that addresses each of the following:
- Coherent forecasts across levels
  - Compare bottom-up vs. MinT reconciliation and explain when each is preferred.
- Leakage-safe feature engineering
  - Lags, rolling windows, calendar effects, and external regressors.
  - Address real-time vs. revised macro data and reporting lags to avoid lookahead bias.
- Expanding-window cross-validation
  - Use an embargo around splits; prevent cross-sectional leakage when loans migrate across MSAs.
- Regime shift handling
  - Detect and handle regime shifts (e.g., abrupt policy changes, COVID-like shocks) via change-point tests or covariate-shift diagnostics and ensembling across regimes.
- Model comparison
  - Compare an XGBoost model with monotonic constraints to a regularized dynamic panel model with fixed effects and AR(1) errors. Describe tuning, selection, and interpretability strategies.
- Evaluation and monitoring
  - Define WAPE by MSA, calibration of delinquency buckets, CRPS for probabilistic outputs, SHAP stability, drift alerts, and retrain cadence.
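For the real-time versus revised macro data item, one hedged way to make the requirement concrete is to store every release of a series and look up only what had been published by the forecast origin. The sketch below assumes a hypothetical vintage table of (reference_period, release_date, value) tuples; the helper name value_as_of, the dates, and the values are illustrative and not taken from any real data source.

```python
from datetime import date

def value_as_of(vintages, reference_period, as_of):
    """Return the latest value for reference_period published on or before as_of.

    vintages: list of (reference_period, release_date, value) tuples.
    Returns None if nothing had been published yet.
    """
    known = [
        (release, value)
        for period, release, value in vintages
        if period == reference_period and release <= as_of
    ]
    if not known:
        return None
    # The most recent release available at as_of wins.
    return max(known, key=lambda rv: rv[0])[1]

# Hypothetical vintages of msa_unemployment for January 2020 (illustrative values).
vintages = [
    (date(2020, 1, 1), date(2020, 3, 4), 4.1),   # first release
    (date(2020, 1, 1), date(2020, 4, 15), 4.4),  # later revision
]

first_print = value_as_of(vintages, date(2020, 1, 1), as_of=date(2020, 3, 31))
revised = value_as_of(vintages, date(2020, 1, 1), as_of=date(2020, 4, 30))
too_early = value_as_of(vintages, date(2020, 1, 1), as_of=date(2020, 2, 15))
```

A forecast made at the end of March would see the first print, not the April revision, and a forecast made in mid-February would see no January value at all. The same lookup pattern can be applied to lagged property financials such as NOI and occupancy.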
Be explicit about avoiding lookahead bias at every step.
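For the cross-validation item, a minimal sketch of expanding-window splits with an embargo might look like the following. It assumes rows are indexed by month (0 to n_months - 1); the function name expanding_window_splits and its parameters are hypothetical. It covers only the temporal side of leakage: preventing cross-sectional leakage from loans that migrate across MSAs would need additional grouping by loan_id.

```python
def expanding_window_splits(n_months, min_train, horizon, embargo, step=1):
    """Yield (train_idx, test_idx) lists over month indices 0..n_months-1.

    Training always starts at month 0 and grows by `step` each fold.
    The test block of `horizon` months starts `embargo` months after
    the last training month, so the embargoed months are in neither set.
    """
    train_end = min_train  # exclusive upper bound of the training window
    while train_end + embargo + horizon <= n_months:
        test_start = train_end + embargo
        train_idx = list(range(train_end))
        test_idx = list(range(test_start, test_start + horizon))
        yield train_idx, test_idx
        train_end += step

# Illustrative settings only: 24 months of data, 12-month minimum history,
# 3-month test blocks, 2-month embargo, folds advanced 3 months at a time.
splits = list(expanding_window_splits(n_months=24, min_train=12,
                                      horizon=3, embargo=2, step=3))
```

With multi-month-ahead targets, the embargo would typically be at least as long as the longest forecast horizon so that no training label is realized inside the test block.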