Explain SHAP and build an ML project
Company: Microsoft
Role: Data Scientist
Category: Machine Learning
Difficulty: easy
Interview Round: Technical Screen
## Part A: SHAP
1. What is SHAP (SHapley Additive exPlanations) trying to measure?
2. How do you interpret:
- A **local** SHAP explanation for a single prediction?
- A **global** SHAP summary plot across many samples?
3. What are common limitations/pitfalls (correlated features, baseline choice, causality vs association, computation)?
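A strong answer to question 1 can note that SHAP assigns each feature its exact Shapley value: the feature's average marginal contribution to the prediction over all orderings, relative to a baseline. The sketch below computes exact Shapley values for a tiny model by enumerating coalitions (absent features are imputed with baseline values, the interventional convention); the model `f` and inputs are illustrative, and real SHAP libraries approximate this because exact enumeration is exponential in the number of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for prediction f(x) relative to a baseline.
    Features outside a coalition S are replaced by their baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += w * (f(with_i) - f(without_i))
    return phi

# Hypothetical scoring model with an interaction term (for illustration only).
f = lambda v: 2 * v[0] + v[1] + 0.5 * v[0] * v[1]
x, base = [1.0, 2.0], [0.0, 0.0]
phi = shapley_values(f, x, base)
# Efficiency property: contributions sum exactly to f(x) - f(baseline).
assert abs(sum(phi) - (f(x) - f(base))) < 1e-9
```

The final assertion demonstrates the "additive" part of SHAP: local attributions always sum to the gap between the prediction and the baseline, which is what makes local explanation plots interpretable.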
## Part B: End-to-end ML project
Describe how you would build a machine learning project end-to-end for a business use case (e.g., churn prediction, fraud detection, recommendations, demand forecasting). Cover:
- Problem framing and success criteria (offline + online)
- Data collection, labeling strategy, and data quality checks
- Feature engineering and leakage prevention
- Train/validation/test splitting strategy (time-based if needed)
- Model selection, tuning, and evaluation
- Deployment, monitoring (data drift + performance), and retraining strategy
- How you communicate tradeoffs to stakeholders
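Two of the steps above are easy to sketch concretely: a chronological split (so the model never trains on the future, which matters for churn or fraud) and a drift check on a feature between training and serving data. The code below is a minimal, library-free illustration; the function names, split fractions, and PSI thresholds are conventional choices, not a prescribed implementation.

```python
import math

def time_based_split(records, train_frac=0.7, val_frac=0.15):
    """Split chronologically: earliest rows train, middle validate, latest test.
    Prevents temporal leakage that a random split would introduce."""
    rows = sorted(records, key=lambda r: r["ts"])
    n = len(rows)
    i, j = int(n * train_frac), int(n * (train_frac + val_frac))
    return rows[:i], rows[i:j], rows[j:]

def psi(expected, actual, bins=10):
    """Population Stability Index, a common drift score for one feature.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * k / bins for k in range(bins + 1)]
    edges[-1] = hi + 1e-9  # include the max value in the last bin

    def frac(xs, a, b):
        # Clamp to avoid log(0) when a bin is empty.
        return max(sum(a <= x < b for x in xs) / len(xs), 1e-6)

    score = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        e, o = frac(expected, a, b), frac(actual, a, b)
        score += (o - e) * math.log(o / e)
    return score

# Illustrative usage on synthetic data.
train_feature = [i / 100 for i in range(100)]
shifted_feature = [0.5 + i / 200 for i in range(100)]
assert psi(train_feature, train_feature) < 1e-6   # identical -> no drift
assert psi(train_feature, shifted_feature) > 0.25  # shifted -> alert
```

In an interview, tying the drift score back to action (alerting, retraining triggers, or rollback) shows the monitoring step is operational rather than decorative.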
Quick Answer: This question evaluates two competencies. Part A tests understanding of model explainability with SHAP: what Shapley values measure, how to read local and global explanations, and where the method breaks down (correlated features, baseline choice, association vs. causation, compute cost). Part B tests the ability to design and operationalize an end-to-end ML project, from problem framing and data collection through feature engineering, leakage-safe splitting, model selection and evaluation, to deployment, monitoring, and retraining, while communicating tradeoffs to stakeholders.