Take‑Home: Classifying Buy‑Now vs Wait Decisions in Housing Time Series
Context
You are given a monthly panel of regional housing and macro time series (e.g., price indices, mortgage rates, inventory, days‑on‑market, unemployment, CPI). The goal is to build a system that, for each region and month t, outputs a calibrated probability and a recommendation: buy now, or wait (i.e., defer the purchase to some point within the next k months).
Task
Describe, at design level and with enough specificity to implement:
- Target and horizon
  - Define the decision horizon k and a rigorous target label y_t for month t.
  - Clarify economic assumptions and edge cases (e.g., transaction costs, right‑censoring).
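As an illustration of the level of specificity expected (one possible formalization, not the required answer), the sketch below labels month t as "buy now" unless some price in months t+1..t+k undercuts the current price by more than a margin. The function name `buy_now_labels`, the `margin` parameter (a stand-in for the cost of waiting, such as rent or search costs), and the use of the hindsight minimum price are all assumptions made for this example. Months without a full k-month future window are right-censored and left unlabeled.

```python
from typing import List, Optional


def buy_now_labels(
    prices: List[float], k: int, margin: float = 0.0
) -> List[Optional[int]]:
    """Label each month as 1 (buy now), 0 (wait), or None (right-censored).

    Hypothetical rule: waiting wins only if some price in months
    t+1..t+k falls below prices[t] * (1 - margin).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    labels: List[Optional[int]] = []
    for t, price_now in enumerate(prices):
        window = prices[t + 1 : t + 1 + k]
        if len(window) < k:
            # The future window is incomplete, so the outcome is unknown.
            labels.append(None)
            continue
        best_future = min(window)
        labels.append(0 if best_future < price_now * (1.0 - margin) else 1)
    return labels


example = buy_now_labels([100, 102, 98, 97, 101, 105], k=2, margin=0.01)
```

A real answer would also need to justify the margin economically and say whether censored months are dropped or handled with a survival-style model.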
- Data preprocessing
  - Panel alignment by region and month, handling multiple data vintages if applicable.
  - Missing‑value strategy, outliers, scaling, and seasonality/deflation adjustments.
- Temporal feature engineering
  - Lags, rolling statistics, deltas (m/m, y/y), seasonality dummies, and interaction features.
  - Handling non‑stationarity (e.g., differencing, deflation, time‑weighted fitting).
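A minimal sketch of backward-looking deltas and rolling statistics for a single region's series follows. The helper names `pct_change` and `trailing_mean` are hypothetical; the point is only that every feature for month t uses values from month t or earlier, with None marking months that lack enough history.

```python
from typing import List, Optional


def pct_change(series: List[float], lag: int) -> List[Optional[float]]:
    """Fractional change versus `lag` months earlier (lag=1: m/m, lag=12: y/y)."""
    out: List[Optional[float]] = []
    for t in range(len(series)):
        if t < lag or series[t - lag] == 0:
            out.append(None)  # not enough history, or undefined ratio
        else:
            out.append(series[t] / series[t - lag] - 1.0)
    return out


def trailing_mean(series: List[float], window: int) -> List[Optional[float]]:
    """Mean of the last `window` values ending at (and including) month t."""
    out: List[Optional[float]] = []
    for t in range(len(series)):
        if t + 1 < window:
            out.append(None)
        else:
            chunk = series[t + 1 - window : t + 1]
            out.append(sum(chunk) / window)
    return out


mom = pct_change([100.0, 110.0, 121.0], lag=1)
roll = trailing_mean([1.0, 2.0, 3.0, 4.0], window=2)
```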
- Time‑aware validation
  - Train/validation/test splits that respect time.
  - Walk‑forward (rolling/expanding window) cross‑validation and hyperparameter tuning.
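One way to make the walk-forward scheme concrete is an expanding-window split generator with a gap between training and test months. The function name and parameters below are assumptions for illustration; the gap is included because a label that looks k months ahead overlaps the test period unless the last months before each test block are excluded from training.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_months: int, min_train: int, test_size: int, gap: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs over month indices 0..n_months-1.

    Training always starts at month 0 and grows over time; `gap` months
    between the end of training and the start of testing are skipped.
    """
    start = min_train + gap
    while start + test_size <= n_months:
        train_idx = list(range(0, start - gap))
        test_idx = list(range(start, start + test_size))
        yield train_idx, test_idx
        start += test_size


folds = list(expanding_window_splits(n_months=10, min_train=4, test_size=2, gap=1))
```

Setting `gap` to at least the horizon k is a natural choice under this labeling assumption. A rolling (fixed-length) window would instead drop the oldest training months at each step.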
- Models
  - Baselines and candidate models (e.g., logistic regression with time features, gradient boosting, sequence models).
  - Rationale for choices given data size, interpretability, and regime risk.
- Metrics and decisioning
  - Probabilistic metrics (AUC, Brier, calibration) and cost‑sensitive objectives reflecting asymmetric risks.
  - Derive a thresholding rule tied to user costs/utilities.
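A sketch of one such thresholding rule, under simplifying assumptions: let p be the calibrated probability that buying now is the better action, let a wrong "buy now" cost C_buy, let a wrong "wait" cost C_wait, and let correct recommendations cost nothing. Recommending "buy now" has expected cost (1 - p) * C_buy and recommending "wait" has expected cost p * C_wait, so "buy now" is preferred when p >= C_buy / (C_buy + C_wait). The function and parameter names are hypothetical.

```python
def buy_threshold(cost_false_buy: float, cost_false_wait: float) -> float:
    """Probability cutoff that minimizes expected cost under the two-cost model."""
    if cost_false_buy < 0 or cost_false_wait < 0:
        raise ValueError("costs must be non-negative")
    total = cost_false_buy + cost_false_wait
    if total == 0:
        raise ValueError("at least one cost must be positive")
    return cost_false_buy / total


def recommend(p_buy: float, threshold: float) -> str:
    """Map a calibrated probability to a recommendation."""
    return "buy now" if p_buy >= threshold else "wait"


# A wrong "buy now" is assumed three times as costly as a wrong "wait".
cutoff = buy_threshold(cost_false_buy=3.0, cost_false_wait=1.0)
```

The rule is only as good as the calibration of p, which is why calibration appears alongside AUC and Brier score in the metrics above.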
- Leakage controls
  - Methods to prevent look‑ahead bias and data leakage (including macro data release lags and revisions).
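To illustrate release lags and revisions, the sketch below stores each macro figure with both the month it describes and the month it was published, then answers "what was known as of month t?". The `Release` record and the `latest_known` helper are assumptions for this example, with months represented as integer indices.

```python
from typing import List, NamedTuple, Optional


class Release(NamedTuple):
    ref_month: int      # month the figure describes
    release_month: int  # month the figure (or its revision) became public
    value: float


def latest_known(releases: List[Release], as_of_month: int) -> Optional[Release]:
    """Most recent reference month, at its latest vintage, visible at as_of_month."""
    visible = [r for r in releases if r.release_month <= as_of_month]
    if not visible:
        return None
    # Prefer the newest reference month; break ties by the newest vintage.
    return max(visible, key=lambda r: (r.ref_month, r.release_month))


history = [
    Release(ref_month=0, release_month=1, value=3.0),  # first print
    Release(ref_month=0, release_month=2, value=3.1),  # later revision
    Release(ref_month=1, release_month=3, value=3.4),
]
known_at_1 = latest_known(history, as_of_month=1)
known_at_2 = latest_known(history, as_of_month=2)
```

Building features through a lookup like this, rather than from the latest revised series, keeps training inputs consistent with what would have been available at decision time.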
- Concept drift and monitoring
  - How to detect, diagnose, and handle drift post‑deployment; retraining cadence.
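One common drift statistic is the population stability index (PSI), which compares the binned distribution of a feature or score in a reference period against a recent period. The sketch below is a minimal version, assuming equal-width bins taken from the reference sample and a small floor `eps` to avoid taking the log of zero; the alert level would need to be chosen empirically.

```python
import math
from typing import List, Sequence


def psi(
    reference: Sequence[float],
    recent: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Population stability index of `recent` relative to `reference`."""
    if not reference or not recent:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / n_bins

    def shares(sample: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for x in sample:
            idx = int((x - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        return [max(c / len(sample), eps) for c in counts]

    ref_shares, rec_shares = shares(reference), shares(recent)
    return sum((r - q) * math.log(r / q) for q, r in zip(ref_shares, rec_shares))


baseline = [float(x) for x in range(100)]
shifted = [x + 50.0 for x in baseline]
no_drift = psi(baseline, baseline)
with_drift = psi(baseline, shifted)
```

PSI flags distribution shift in inputs or scores only; drift in the relationship between features and the label would still need outcome-based checks such as rolling Brier score once labels mature.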
- User presentation
  - How to present a calibrated probability and recommendation to end users, including explanations and scenario analysis.