Build models for housing and wind power prediction
Company: Citadel
Role: Software Engineer
Category: ML System Design
Difficulty: Hard
Interview Round: Take-home Project
Two-part machine-learning online assessment (OA).
1) Classification: Predict whether someone can buy a house ("can/cannot buy"). Specify your data assumptions, preprocessing, feature design, model choice, validation strategy, and evaluation metrics for this binary prediction.
2) Regression: Predict wind-farm power output from weather recordings. Files: train.csv, test.csv, sample submission.csv. Target: 'power output'. For each record in test.csv, predict 'power output' and submit a CSV (submissions.csv) with a header row and exactly two columns: id, power output. Describe your end-to-end approach (data checks, feature engineering, time-aware validation to avoid leakage, model selection/tuning, metrics) and outline the training/inference pipeline.
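For part 1, the preprocessing, model choice, validation, and metric steps could be wired together roughly as follows. This is only a sketch under stated assumptions: the prompt gives no schema, so the column names (income, savings, debt, employment), the synthetic data, and the labeling rule are all hypothetical stand-ins, and logistic regression is just one reasonable baseline. Keeping imputation and scaling inside the pipeline means they are refit on each training fold, so no statistics leak from the validation fold.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data. Every column name and the labeling rule below are
# hypothetical; the task statement does not define a schema.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "income": rng.normal(80_000, 25_000, n),
    "savings": rng.normal(40_000, 20_000, n),
    "debt": rng.normal(15_000, 8_000, n),
    "employment": rng.choice(["salaried", "self_employed", "unemployed"], n),
})
score = 0.5 * df["income"] + df["savings"] - df["debt"] + rng.normal(0, 10_000, n)
y = (score > score.median()).astype(int).to_numpy()  # 1 = "can buy", 0 = "cannot buy"

# Blank out some incomes so the imputation step has something to do.
df.loc[rng.choice(n, 20, replace=False), "income"] = np.nan

numeric = ["income", "savings", "debt"]
categorical = ["employment"]

# Numeric columns: median imputation, then standardization.
# Categorical columns: one-hot encoding that tolerates unseen categories.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

clf = Pipeline([
    ("prep", preprocess),
    ("model", LogisticRegression(max_iter=1000, class_weight="balanced")),
])

# Stratified folds keep the class ratio stable; ROC AUC is threshold-independent.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(clf, df, y, cv=cv, scoring="roc_auc")
print("mean ROC AUC:", round(float(auc_scores.mean()), 3))
```

If the real classes turn out to be imbalanced, precision, recall, or PR AUC would likely be more informative than ROC AUC alone, and the decision threshold would need to be chosen against the cost of each error type.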
Quick Answer: This question evaluates ML system design and applied modeling competencies, including data assumptions, preprocessing, feature engineering, model selection, validation strategies, temporal leakage handling, and end-to-end pipeline construction for a binary housing-affordability classifier and a time-series regression for wind-farm power prediction.
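For part 2, a minimal sketch of time-aware validation and the submission format might look like the following. Only the id and 'power output' column names come from the task statement; the timestamp, wind_speed, and temperature columns, the synthetic data generator, and the choice of gradient boosting are assumptions made for illustration. In a real run the frames would be loaded from train.csv and test.csv rather than generated.

```python
import io

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import TimeSeriesSplit

TARGET = "power output"  # target column name from the task statement


def make_synthetic(n, start_id, start_time, rng, with_target=True):
    """Hypothetical hourly weather records standing in for the real CSV files."""
    ts = pd.Timestamp(start_time) + pd.to_timedelta(np.arange(n), unit="h")
    wind = np.abs(rng.normal(8.0, 3.0, n))
    frame = pd.DataFrame({
        "id": np.arange(start_id, start_id + n),
        "timestamp": ts,
        "wind_speed": wind,
        "temperature": rng.normal(12.0, 6.0, n),
    })
    if with_target:
        # Toy power curve: roughly cubic in wind speed, capped, plus noise.
        frame[TARGET] = np.clip(wind, 0, 15) ** 3 / 10.0 + rng.normal(0, 5.0, n)
    return frame


def add_features(frame):
    """Features that use only the current row, so they cannot leak the future."""
    out = frame.copy()
    out["hour"] = out["timestamp"].dt.hour
    out["wind_speed_cubed"] = out["wind_speed"] ** 3
    return out


FEATURES = ["wind_speed", "temperature", "hour", "wind_speed_cubed"]

rng = np.random.default_rng(0)
train = make_synthetic(600, 0, "2023-01-01", rng)
test = make_synthetic(100, 600, "2023-01-26", rng, with_target=False)

# Sort chronologically before splitting so every fold trains on the past only.
train = add_features(train.sort_values("timestamp").reset_index(drop=True))
test = add_features(test)
X, y = train[FEATURES], train[TARGET]

fold_mae = []
for train_idx, valid_idx in TimeSeriesSplit(n_splits=4).split(X):
    model = GradientBoostingRegressor(random_state=0)
    model.fit(X.iloc[train_idx], y.iloc[train_idx])
    preds = model.predict(X.iloc[valid_idx])
    fold_mae.append(mean_absolute_error(y.iloc[valid_idx], preds))
print("MAE per fold:", [round(m, 2) for m in fold_mae])

# Refit on all training rows, then build the two-column submission.
final_model = GradientBoostingRegressor(random_state=0).fit(X, y)
submission = pd.DataFrame({
    "id": test["id"],
    TARGET: final_model.predict(test[FEATURES]),
})

# Written to an in-memory buffer here; a real run would instead call
# submission.to_csv("submissions.csv", index=False).
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

TimeSeriesSplit uses an expanding window, so each validation fold lies strictly after its training rows. If lag or rolling features are added later, they would need to be computed from past values only, and a gap between the training and validation windows may be worth considering.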