Explain a favorite model end-to-end
Company: Google
Role: Data Scientist
Category: Machine Learning
Difficulty: Hard
Interview Round: Onsite
Pick one predictive model you know deeply (e.g., logistic regression, gradient-boosted trees, transformer classifier) and explain how it works end-to-end for a real problem you solved.
(a) State the objective, loss function, and the model’s inductive biases/assumptions; when are they violated?
(b) Describe feature engineering and your validation strategy (i.i.d. vs. time-based splits); how did you prevent leakage and confirm stationarity?
(c) Walk through training: hyperparameter search, regularization, early stopping, handling class imbalance (weights, focal loss, resampling). Justify choices quantitatively.
(d) Detail three concrete training/inference issues you encountered (e.g., covariate shift, label noise, calibration drift, skew between offline and online features, latency/throughput limits). How did you detect, diagnose, and fix each (checks/plots/metrics)?
(e) Explain evaluation beyond ROC/PR: calibration, cost-sensitive metrics, business KPIs, and how you translated model lift into expected value.
(f) Discuss fairness, privacy, and post-deployment monitoring: drift detection thresholds, alerting, rollback criteria, and canarying.
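For part (e), a strong answer usually ties calibration to a dollar (or other KPI) figure. The following is a minimal sketch of that reasoning, not a prescribed solution: the function names, the payoff parameters (`tp_value`, `fp_cost`, `fn_cost`), and the toy scores are all hypothetical. It computes an expected calibration error, the realized value per scored case at a given threshold, and the break-even threshold that follows if the probabilities are calibrated.

```python
from typing import Sequence


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Bin-weighted average gap between mean predicted probability and observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_p = sum(p for p, _ in b) / len(b)
        pos_rate = sum(y for _, y in b) / len(b)
        ece += (len(b) / n) * abs(mean_p - pos_rate)
    return ece


def value_per_case(
    probs: Sequence[float],
    labels: Sequence[int],
    threshold: float,
    tp_value: float,
    fp_cost: float,
    fn_cost: float = 0.0,
) -> float:
    """Average payoff per scored case when acting on every score >= threshold."""
    total = 0.0
    for p, y in zip(probs, labels):
        if p >= threshold:
            total += tp_value if y == 1 else -fp_cost
        elif y == 1:
            total -= fn_cost  # missed positive
    return total / len(probs)


def break_even_threshold(tp_value: float, fp_cost: float, fn_cost: float = 0.0) -> float:
    """Act when p*tp_value - (1-p)*fp_cost > -p*fn_cost; valid only if p is calibrated."""
    return fp_cost / (tp_value + fp_cost + fn_cost)


# Toy data, purely illustrative.
probs = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
ece = expected_calibration_error(probs, labels)
ev = value_per_case(probs, labels, threshold=0.5, tp_value=10.0, fp_cost=2.0)
t_star = break_even_threshold(tp_value=10.0, fp_cost=2.0)
```

The point of the break-even formula is that the operating threshold comes from the payoff structure rather than from a default of 0.5, and it is only trustworthy when the calibration error is small. Comparing `value_per_case` for the model against a baseline policy on the same held-out data gives the lift in expected value.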
Quick Answer: This question tests whether a candidate can own a machine learning project end-to-end: problem framing, objective and loss selection, inductive biases, feature engineering, validation strategy, training and regularization, production issues, monitoring, and fairness and privacy.
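On the monitoring side, a candidate may be asked what a "drift detection threshold" looks like in practice. One common option, shown here as a sketch under assumptions, is the population stability index (PSI) between a reference sample and a live sample of a feature or score. The function name `psi`, the `eps` floor, and the 0.1 / 0.25 alert levels are illustrative choices; the levels are a widely used rule of thumb, not something this question specifies.

```python
import bisect
import math
from typing import List, Sequence


def psi(
    reference: Sequence[float],
    live: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Population stability index using quantile bins taken from the reference sample."""
    ref_sorted = sorted(reference)
    # Interior bin edges at the reference quantiles.
    edges = [
        ref_sorted[min(int(len(ref_sorted) * i / n_bins), len(ref_sorted) - 1)]
        for i in range(1, n_bins)
    ]

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            counts[bisect.bisect_right(edges, v)] += 1
        # Floor empty bins at eps so the log term stays finite.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = fractions(reference)
    live_frac = fractions(live)
    return sum((a - e) * math.log(a / e) for e, a in zip(ref_frac, live_frac))


def drift_level(value: float, warn: float = 0.1, alert: float = 0.25) -> str:
    """Map a PSI value to a hypothetical alerting tier."""
    if value >= alert:
        return "alert"
    if value >= warn:
        return "warn"
    return "ok"


# Toy data: the live sample is the reference shifted by half its range.
reference = [float(i) for i in range(1000)]
shifted = [v + 500.0 for v in reference]
same_status = drift_level(psi(reference, reference))
shifted_status = drift_level(psi(reference, shifted))
```

In an interview answer, the thresholds would ideally be justified from historical week-over-week PSI values rather than taken as given, and an "alert" would feed the rollback or canary criteria the question asks about.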