Diagnose and fix linear regression assumption violations
Company: Citibank
Role: Data Scientist
Category: Machine Learning
Difficulty: medium
Interview Round: Take-home Project
Quick Answer: This question evaluates a data scientist's competency in linear regression diagnostics and remedial modeling. It covers the core OLS assumptions (linearity, no perfect multicollinearity, exogeneity, homoskedasticity, error independence, and normality), diagnostics and remedies for heteroskedasticity and severe multicollinearity, and method selection among ridge, LASSO, and GLMs. It is commonly asked in machine learning and statistical modeling interviews because it probes how a candidate detects assumption violations, interprets their impact on standard errors, confidence intervals, and hypothesis tests, and balances conceptual inference against practical model refitting.
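
As a minimal sketch of two of the diagnostics named above, the following standard-library Python computes Koenker's studentized Breusch-Pagan statistic for a single predictor and the variance inflation factor (VIF) for a two-predictor model. The synthetic data, the helper names (`breusch_pagan_1d`, `vif_two_predictors`), and the restriction to one and two predictors are assumptions made for illustration; none of them come from the original question.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def ols_residuals(x, y):
    """Residuals from a one-predictor OLS fit of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def breusch_pagan_1d(x, y):
    """Koenker's studentized Breusch-Pagan test, one-predictor case only.

    LM = n * R^2 from regressing squared residuals on x. With a single
    regressor, R^2 is the squared correlation, and LM is asymptotically
    chi-square with 1 degree of freedom under homoskedasticity.
    """
    squared = [e ** 2 for e in ols_residuals(x, y)]
    lm = len(x) * pearson_r(x, squared) ** 2
    p_value = math.erfc(math.sqrt(lm / 2.0))  # chi-square(1) survival function
    return lm, p_value


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Synthetic data (an assumption for illustration, not from the question).
rng = random.Random(0)
n = 500
x = [rng.uniform(1.0, 10.0) for _ in range(n)]
y_hetero = [2.0 + 0.5 * xi + rng.gauss(0.0, 0.3 * xi) for xi in x]  # noise sd grows with x
y_homo = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]  # constant noise sd
x_collinear = [xi + rng.gauss(0.0, 0.1) for xi in x]  # near copy of x

lm_hetero, p_hetero = breusch_pagan_1d(x, y_hetero)
lm_homo, p_homo = breusch_pagan_1d(x, y_homo)
vif = vif_two_predictors(x, x_collinear)

print(f"heteroskedastic data: LM={lm_hetero:.1f}, p={p_hetero:.3g}")
print(f"homoskedastic data:   LM={lm_homo:.1f}, p={p_homo:.3g}")
print(f"VIF for the near-duplicate predictor: {vif:.0f}")
```

A small Breusch-Pagan p-value would point toward heteroskedasticity-robust standard errors or weighted least squares, while a VIF well above the common rule-of-thumb threshold of about 10 would point toward dropping or combining predictors, or toward ridge or LASSO. For models with more than two predictors, a library such as statsmodels is the more practical route than these hand-rolled helpers.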