Handle challenges in MMM/MMX
Company: CVS Health
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Technical Screen
You inherit a weekly marketing mix model (MMM/MMX) fit on 156 weeks of data with these variables: TV GRPs, Paid Search Spend, Display Spend, Email Sends, Price, Promotions, and a Competitor Index. TV and Search are correlated at 0.90, and there is a pandemic-related structural break at week 70. What makes this model fragile, and how would you address each weakness? Be specific about endogeneity and omitted variables, multicollinearity remedies (informative priors, ridge/LASSO, hierarchical Bayesian models), adstock/lag and saturation choices, non-stationarity and change points, promotion cannibalization, privacy-induced measurement error (e.g., from Apple's App Tracking Transparency, ATT), and calibration using randomized geo-tests. Describe your validation plan (out-of-time fit, alignment with experimental lift, posterior predictive checks) and explain how you would produce robust ROI estimates and budget recommendations with quantified uncertainty.
Quick Answer: This question evaluates a data scientist's competency in applied machine learning and causal inference for marketing-mix modeling. It focuses on diagnosing model fragility caused by multicollinearity, endogeneity, non-stationarity, and measurement error, and on the modeling choices for adstock, saturation, and promotion effects.
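To make two of the ideas in the question concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a reference implementation: the function names, the decay of 0.5, the half-saturation point of 50, and the four-week TV series are all hypothetical. It shows a geometric adstock (carryover), a Hill-type saturation curve (diminishing returns), and the variance inflation factor implied by the 0.90 TV/Search correlation.

```python
def geometric_adstock(x, decay):
    """Geometric carryover: each week keeps `decay` of last week's adstocked value."""
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    out, carry = [], 0.0
    for value in x:
        carry = value + decay * carry
        out.append(carry)
    return out


def hill_saturation(x, half_sat, shape=1.0):
    """Hill curve: maps non-negative media to [0, 1); equals 0.5 at `half_sat`."""
    return [v ** shape / (v ** shape + half_sat ** shape) for v in x]


def vif_from_correlation(r):
    """VIF for a predictor whose only collinear partner has correlation r."""
    return 1.0 / (1.0 - r * r)


# Hypothetical four-week TV GRP series (illustrative values only).
tv_grps = [100.0, 0.0, 0.0, 50.0]

adstocked = geometric_adstock(tv_grps, decay=0.5)
# adstocked == [100.0, 50.0, 25.0, 62.5]

saturated = hill_saturation(adstocked, half_sat=50.0)
# roughly [0.667, 0.5, 0.333, 0.556]

vif = vif_from_correlation(0.90)
# roughly 5.26
```

A pairwise correlation of 0.90 gives a VIF of about 5.26 when only those two predictors are considered; with the other regressors included, the VIF for TV or Search can only be equal or higher. That means the variance of those coefficients is inflated at least fivefold, which is why the question points toward priors, regularization, or experimental calibration rather than unconstrained least squares.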