Predicting Home Sale Prices: End-to-End ML Design
Context
You have historical home-sale records with features such as lot area, year built, number of rooms, neighborhood, and sale date. You need to build a production-ready system that predicts sale price for new listings.
Task
Design an end-to-end approach that covers:
- Problem framing and target transformation (e.g., log-price) and objective.
- Data cleaning: missing-value imputation, outlier handling, and leakage risks.
- Feature engineering for numeric, categorical, time, and location features; useful interactions.
- Baselines and model choices: linear models, regularized regressions, tree-based methods, ensembles.
- Evaluation protocol and metrics: RMSE, MAPE, and time-based cross-validation.
- Hyperparameter tuning and error analysis.
- Interpretability and fairness considerations.
- Handling non-stationarity and market shifts: monitoring and retraining.
- Deployment plan for a lightweight model under latency and memory constraints.
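
As a rough illustration of the log-price target and the RMSE and MAPE metrics named in the task, the sketch below uses only the standard library. The helper names (`to_log_price`, `from_log_price`, `rmse`, `mape`) and the sample prices are made up for illustration; the use of `log1p`/`expm1` rather than a plain logarithm is one possible choice, not something the task prescribes.

```python
import math

def to_log_price(price):
    """Map a sale price to the log scale (hypothetical training target)."""
    return math.log1p(price)

def from_log_price(log_price):
    """Invert to_log_price so predictions can be reported in dollars."""
    return math.expm1(log_price)

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction (0.05 means 5%)."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios)

# Illustrative numbers only, not real sale records.
actual = [200_000.0, 350_000.0, 500_000.0]
predicted = [210_000.0, 332_500.0, 500_000.0]

# RMSE on the log scale penalizes relative error; RMSE in dollars
# is dominated by expensive homes; MAPE is scale-free.
log_rmse = rmse([to_log_price(a) for a in actual],
                [to_log_price(p) for p in predicted])
dollar_rmse = rmse(actual, predicted)
pct_error = mape(actual, predicted)
```

Reporting both a log-scale error and a dollar-scale error is one way to show whether a model's mistakes are concentrated in a particular price range.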
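
Time-based cross-validation can be sketched as an expanding-window split: each fold trains on earlier sales and validates on the sales that follow. The function name `expanding_window_splits`, the fold-sizing rule, and the sample dates are assumptions made for this example; libraries such as scikit-learn provide comparable splitters.

```python
from datetime import date

def expanding_window_splits(sale_dates, n_folds):
    """Yield (train_idx, valid_idx) lists; validation always follows training in time.

    Records that share a sale date may land on either side of a boundary,
    so a real pipeline might split on date cutoffs instead of positions.
    """
    order = sorted(range(len(sale_dates)), key=lambda i: sale_dates[i])
    fold_size = len(order) // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough records for the requested number of folds")
    for k in range(1, n_folds + 1):
        train_idx = order[: k * fold_size]
        # The last fold absorbs any leftover records.
        end = (k + 1) * fold_size if k < n_folds else len(order)
        valid_idx = order[k * fold_size : end]
        yield train_idx, valid_idx

# Illustrative, unsorted monthly sale dates.
sale_dates = [date(2020, m, 1) for m in (3, 1, 2, 6, 5, 4, 8, 7)]
splits = list(expanding_window_splits(sale_dates, n_folds=3))
```

Because the training window only grows forward in time, no fold is scored on sales that precede its training data, which is the leakage risk that a random shuffle would introduce.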
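
For monitoring and retraining, one simple option is to track a rolling MAPE over the most recent closed sales and flag the model when it exceeds a threshold. The class `RollingErrorMonitor`, the window size, and the 10% threshold below are hypothetical choices for illustration; a real system would tune them and would likely also watch input feature drift.

```python
from collections import deque

class RollingErrorMonitor:
    """Track absolute percentage error over the most recent sales."""

    def __init__(self, window, threshold):
        self.errors = deque(maxlen=window)  # oldest errors drop off automatically
        self.threshold = threshold          # rolling MAPE that triggers retraining

    def record(self, predicted_price, sale_price):
        self.errors.append(abs(predicted_price - sale_price) / sale_price)

    def rolling_mape(self):
        return sum(self.errors) / len(self.errors) if self.errors else None

    def needs_retraining(self):
        # Wait for a full window so a single bad prediction cannot trigger it.
        window_full = len(self.errors) == self.errors.maxlen
        return window_full and self.rolling_mape() > self.threshold

# Illustrative numbers only: three accurate predictions, then a large miss.
monitor = RollingErrorMonitor(window=3, threshold=0.10)
for predicted, actual in [(100.0, 100.0), (105.0, 100.0), (110.0, 100.0)]:
    monitor.record(predicted, actual)
stable = monitor.needs_retraining()

monitor.record(130.0, 100.0)
shifted = monitor.needs_retraining()
```

Sale prices arrive with a delay after listing, so a monitor like this lags the market; that delay is worth stating when choosing the window length.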