Design features for house price prediction

This is a Machine Learning interview question from Two Sigma for Data Scientist roles.

Question

Scenario

You are building a model to predict house sale price from a tabular dataset (similar to typical real-estate datasets). The interviewer expects a simple baseline model (e.g., linear regression), but wants to understand your reasoning.

Questions

Which features are likely to be predictive of house price, and why? (Examples: location, size, age, condition, amenities, nearby schools, etc.)
How do you decide which features are usable (available at prediction time, not leaking label information, stable definitions)?
What data cleaning steps would you perform before modeling?
If starting with linear regression, how would you:
- handle missing values,
- handle categorical variables,
- reduce the impact of outliers/skewed price distributions,
- detect multicollinearity and mitigate it?
How would you evaluate the model and iterate on improvements?
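
The preprocessing questions above can be made concrete with a small sketch. It uses only the Python standard library. The rows, the field names (`sqft`, `neighborhood`, `price`), and the helper functions are hypothetical and chosen for illustration. It shows one possible set of choices, not the expected answer: median imputation with a missing-value indicator, one-hot encoding with a reference level, a log-transformed target, and a variance inflation factor (VIF) check restricted to the two-predictor case.

```python
import math
import statistics

# Hypothetical training rows; field names are illustrative only.
train_rows = [
    {"sqft": 1500.0, "neighborhood": "A", "price": 300000.0},
    {"sqft": None, "neighborhood": "B", "price": 450000.0},
    {"sqft": 2200.0, "neighborhood": "A", "price": 520000.0},
    {"sqft": 900.0, "neighborhood": None, "price": 180000.0},
]


def fit_preprocessor(rows):
    """Learn imputation and encoding parameters from training rows only."""
    observed = [r["sqft"] for r in rows if r["sqft"] is not None]
    levels = sorted({r["neighborhood"] for r in rows if r["neighborhood"] is not None})
    return {"sqft_median": statistics.median(observed), "levels": levels}


def transform(row, params):
    """Map one raw row to [sqft, sqft_missing, one-hot levels..., unknown]."""
    missing = row["sqft"] is None
    sqft = params["sqft_median"] if missing else row["sqft"]
    # Drop the first level as the reference so the dummies are not perfectly
    # collinear with the intercept of a linear model.
    dummies = [1.0 if row["neighborhood"] == lv else 0.0 for lv in params["levels"][1:]]
    # Missing or unseen categories get their own indicator.
    unknown = 0.0 if row["neighborhood"] in params["levels"] else 1.0
    return [sqft, 1.0 if missing else 0.0] + dummies + [unknown]


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r * r)


params = fit_preprocessor(train_rows)
X_train = [transform(r, params) for r in train_rows]
# Fit on log1p(price) to tame right skew; invert with math.expm1 at prediction time.
y_train = [math.log1p(r["price"]) for r in train_rows]

# Hypothetical correlated predictors: living area and room count.
sqft = [1500.0, 2200.0, 900.0, 1800.0]
rooms = [3.0, 4.0, 2.0, 4.0]
vif = vif_two_predictors(sqft, rooms)
```

The imputation median and the category levels are learned from the training rows only and then reused unchanged on the holdout set, which keeps holdout information out of the fit. A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic collinearity. Typical mitigations are dropping one of the correlated features, combining them, or switching to a regularized model such as ridge regression.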

Assume you have a training set with historical sales and a holdout set for evaluation.
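
One way to use this train/holdout split is to score every candidate model on the holdout set against a trivial baseline, so that each iteration has a reference point. The sketch below assumes hypothetical prices and hypothetical model predictions. The metric choices (MAE, RMSE, and RMSE on log prices) are illustrative, not prescribed by the question.

```python
import math
import statistics


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as price."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmsle(y_true, y_pred):
    """RMSE on log1p(price); reflects relative rather than absolute error."""
    return rmse([math.log1p(t) for t in y_true], [math.log1p(p) for p in y_pred])


# Hypothetical numbers for illustration only.
train_prices = [300000.0, 450000.0, 520000.0, 180000.0]
holdout_prices = [320000.0, 410000.0, 600000.0]
model_preds = [310000.0, 430000.0, 550000.0]  # stand-in for a fitted model's output

# Baseline: predict the training median for every holdout house.
baseline = statistics.median(train_prices)
baseline_preds = [baseline] * len(holdout_prices)

report = {
    name: {"mae": mae(holdout_prices, preds),
           "rmse": rmse(holdout_prices, preds),
           "rmsle": rmsle(holdout_prices, preds)}
    for name, preds in [("baseline", baseline_preds), ("model", model_preds)]
}

for name, scores in report.items():
    print(name, {k: round(v, 4) for k, v in scores.items()})
```

The baseline is computed from training prices only, so the holdout set stays untouched until scoring. If the model was fit on log prices, predictions should be converted back to the original scale before computing MAE and RMSE, so that the errors read in currency units.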
