Reduce overfitting under constraints
Company: Boston Consulting Group
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Technical Screen
Your model is overfitting: training RMSE is 4000 and validation RMSE is 9500. You cannot collect more data, and online inference must stay under 20 ms at p95. Choose and prioritize three interventions to reduce overfitting, explain the mechanism behind each, and outline an experiment plan. Options may include L1/L2/elastic-net regularization (with the expected effect on coefficients), early stopping with patience, architecture or tree-depth reduction, feature selection or target encoding with smoothing, data augmentation suitable for tabular data, K-fold cross-validation with stratification, bagging versus boosting, and leakage checks. Specify concrete hyperparameter grids, monitoring metrics, and stopping criteria, and describe how you would establish that an improvement is statistically significant.
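One way to make the "hyperparameter grid plus statistically significant improvement" part of the prompt concrete is sketched below. It is only an illustration under stated assumptions: the data is synthetic (not the interview dataset), the model is a closed-form ridge (L2) regression written with NumPy, and the helper names (`ridge_fit`, `kfold_rmse`, `make_folds`, `paired_bootstrap_ci`) and the alpha grid are hypothetical choices. Per-fold RMSE is compared on identical folds, and a bootstrap interval is taken over the paired differences.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form L2-regularized least squares; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def make_folds(n, k, repeats, seed=0):
    """Repeated K-fold: `repeats` independent shuffles, each split into k folds."""
    rng = np.random.default_rng(seed)
    folds = []
    for _ in range(repeats):
        folds.extend(np.array_split(rng.permutation(n), k))
    return folds

def kfold_rmse(X, y, alpha, folds):
    """Validation RMSE per fold for one alpha, reusing the same fold indices."""
    all_idx = np.arange(len(y))
    scores = []
    for val_idx in folds:
        train_idx = np.setdiff1d(all_idx, val_idx)
        coef, intercept = ridge_fit(X[train_idx], y[train_idx], alpha)
        scores.append(rmse(y[val_idx], X[val_idx] @ coef + intercept))
    return np.array(scores)

def paired_bootstrap_ci(diffs, n_boot=2000, seed=0):
    """Approximate 95% interval for the mean paired RMSE difference."""
    rng = np.random.default_rng(seed)
    samples = rng.choice(diffs, size=(n_boot, len(diffs)), replace=True)
    low, high = np.percentile(samples.mean(axis=1), [2.5, 97.5])
    return float(low), float(high)

# Synthetic stand-in data (assumption): 60 features, only 5 carry signal.
rng = np.random.default_rng(42)
n, p = 120, 60
X = rng.normal(size=(n, p))
true_coef = np.zeros(p)
true_coef[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]
y = X @ true_coef + rng.normal(scale=2.0, size=n)

folds = make_folds(n, k=5, repeats=3)
baseline = kfold_rmse(X, y, alpha=1e-6, folds=folds)  # near-unregularized
grid = [0.1, 1.0, 10.0, 100.0]                        # hypothetical alpha grid
results = {a: kfold_rmse(X, y, a, folds) for a in grid}
best_alpha = min(results, key=lambda a: results[a].mean())

# Positive differences mean the regularized model has lower validation RMSE.
diffs = baseline - results[best_alpha]
low, high = paired_bootstrap_ci(diffs)
print(f"best alpha={best_alpha}, mean RMSE gain={diffs.mean():.3f}, "
      f"95% CI=({low:.3f}, {high:.3f})")
```

If the interval excludes zero, that is evidence the regularized model is better on these folds. Two caveats apply: folds share training rows, so the differences are not independent and the interval is approximate, and picking `best_alpha` on the same folds used for the comparison is optimistic, so a separate held-out set or nested cross-validation would give a cleaner estimate.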
Quick Answer: This question tests a candidate's ability to mitigate overfitting in tabular regression under a production latency constraint. It covers regularization, model complexity control, feature engineering, validation strategy, and experimental design.
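Because every candidate intervention also has to respect the 20 ms p95 budget, a latency check can sit alongside the validation metrics. The following is a minimal sketch using only the Python standard library; `p95_latency_ms` and `dummy_predict` are hypothetical names, and the stand-in predictor is a toy linear function rather than a real model. Offline timing on a development machine is only a rough proxy for production latency.

```python
import statistics
import time

def p95_latency_ms(predict_fn, rows, warmup=20):
    """Time single-row predictions and return the 95th percentile in ms."""
    for row in rows[:warmup]:          # warm-up calls are not timed
        predict_fn(row)
    timings = []
    for row in rows:
        start = time.perf_counter()
        predict_fn(row)
        timings.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(timings, n=100)[94]

def dummy_predict(row):
    """Hypothetical stand-in for a fitted model's single-row predict call."""
    weights = [0.5, -1.2, 3.0]
    return sum(w * x for w, x in zip(weights, row))

rows = [[float(i), float(i % 7), 1.0] for i in range(500)]
p95 = p95_latency_ms(dummy_predict, rows)
within_budget = p95 < 20.0             # 20 ms p95 budget from the prompt
print(f"p95 latency: {p95:.4f} ms, within budget: {within_budget}")
```

In an experiment plan, a candidate that improves validation RMSE but fails this check would be rejected or simplified further, for example by reducing tree depth or the number of estimators.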