Explain SHAP in an ML System
Company: Microsoft
Role: Data Scientist
Category: Machine Learning
Difficulty: Medium
Interview Round: Technical Screen
Describe how you would build an end-to-end machine learning system for a business use case such as churn prediction, ad conversion prediction, or content recommendation.
Walk through the full lifecycle:
1. Business framing and translating the product goal into a prediction task.
2. Label definition, observation window, prediction window, and how to avoid leakage.
3. Data collection, feature engineering, train/validation/test splitting, and handling non-stationarity with time-based evaluation.
4. Baselines, model selection, hyperparameter tuning, and offline metrics.
5. Calibration, fairness, monitoring, drift detection, retraining, and online validation after deployment.
6. How you would explain the model to stakeholders using SHAP.
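As an illustration of steps 2 and 3, here is a minimal sketch of leakage-safe example construction for a churn task, with a time-based split. The function names, the 30-day observation window, the 14-day prediction window, and the "no activity means churned" label rule are all assumptions made for this example, not part of the question.

```python
from datetime import date, timedelta


def build_example(events, cutoff, obs_days=30, pred_days=14):
    """Build one training example for a hypothetical churn task.

    events: list of dates on which the user was active.
    Features read only [cutoff - obs_days, cutoff).
    The label reads only [cutoff, cutoff + pred_days).
    Keeping the two windows disjoint is what prevents label leakage.
    """
    obs_start = cutoff - timedelta(days=obs_days)
    label_end = cutoff + timedelta(days=pred_days)
    active_days = sum(1 for d in events if obs_start <= d < cutoff)
    active_later = any(cutoff <= d < label_end for d in events)
    return {
        "cutoff": cutoff,
        "label_end": label_end,
        "active_days": active_days,       # feature from the observation window
        "churned": int(not active_later),  # label from the prediction window
    }


def time_split(examples, train_end, valid_end):
    """Split by time rather than at random.

    Train keeps examples whose label window closes by train_end.
    Validation starts at train_end and closes by valid_end.
    Test starts at valid_end.
    Examples that straddle a boundary are dropped.
    """
    train = [e for e in examples if e["label_end"] <= train_end]
    valid = [e for e in examples
             if e["cutoff"] >= train_end and e["label_end"] <= valid_end]
    test = [e for e in examples if e["cutoff"] >= valid_end]
    return train, valid, test
```

Dropping examples whose label window crosses a split boundary loses some data, but it keeps every validation and test label strictly later than anything the model saw in training, which is closer to how the model will be used after deployment.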
For the SHAP part, explain:
- what a SHAP value means,
- the difference between local and global explanations,
- how to interpret SHAP summary plots and feature attributions,
- and why correlated features or data leakage can make SHAP interpretations misleading.
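To make the SHAP points concrete, here is a small sketch that computes exact Shapley attributions by brute force for toy models. It is not the shap library; the helper names, the toy models, and the choice to fill "missing" features from background rows (an interventional value function) are assumptions made for illustration. It shows the additivity property of a local explanation, a global importance as the mean absolute attribution, and how two models with identical predictions on perfectly correlated features receive different attributions.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, background):
    """Exact Shapley attributions for one row x (a local explanation).

    Features outside a coalition are filled in from each background row
    and the predictions are averaged. Cost grows exponentially with the
    number of features, so this is only usable on tiny examples.
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for b in background:
            z = [x[i] if i in subset else b[i] for i in range(n)]
            total += f(z)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def global_importance(f, rows, background):
    """Global view: mean absolute attribution per feature over many rows."""
    per_row = [shapley_values(f, r, background) for r in rows]
    n = len(rows[0])
    return [sum(abs(p[i]) for p in per_row) / len(per_row) for i in range(n)]


# Toy linear model: the third feature is ignored by the model.
def model(z):
    return 2 * z[0] + 1 * z[1] + 0 * z[2]


background = [[0, 0, 0], [2, 2, 2]]
x = [3, 1, 5]
phi = shapley_values(model, x, background)            # approximately [4, 0, 0]
base = sum(model(b) for b in background) / len(background)
# Additivity: base value plus attributions recovers the prediction for x.
reconstructed = base + sum(phi)

# Correlated features: in this data feature 1 always equals feature 0,
# so both models below predict the same value on every observed row.
def model_a(z):
    return 2 * z[0]


def model_b(z):
    return z[0] + z[1]


dup_background = [[0, 0], [2, 2]]
phi_a = shapley_values(model_a, [3, 3], dup_background)  # all credit to feature 0
phi_b = shapley_values(model_b, [3, 3], dup_background)  # credit split evenly
```

The last two results are the caution in miniature: attributions describe how the fitted model uses its inputs, not which real-world factor drives the outcome. With correlated features the split of credit can be arbitrary, and a leaked feature will receive large attributions precisely because the model leans on it.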
Quick Answer: This question evaluates a data scientist's ability to design and operationalize an end-to-end machine learning system, covering problem framing, label and feature design, model selection, evaluation, deployment, monitoring, and model interpretability with SHAP.