Explain your ML project end-to-end
Company: Pinterest
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Technical Screen
Pick the most complex ML project on your resume and answer all parts precisely:
(1) Define the business objective, target variable, key constraints, and the primary success metric you chose and why (e.g., PR-AUC vs. ROC-AUC vs. cost-weighted error).
(2) Describe the data: sources, labeling strategy, train/validation/test splits; if temporal, specify a time-based split and how you prevented leakage (give concrete examples of potential leakage you checked for).
(3) Model selection: list candidate models and the exact hyperparameters you tuned; show an ablation plan that isolates the marginal value of two specific feature groups; explain one bias–variance trade-off decision with evidence.
(4) Class imbalance: explain your resampling or weighting approach and how you set decision thresholds. Now compute this scenario: on a 10,000-example validation set with 8% positives, the baseline model at threshold 0.50 has precision=0.70 and recall=0.45; after adding Feature Set X and doing probability calibration, at threshold 0.30 you have precision=0.58 and recall=0.66. Compute F1 for both, the expected counts of TP, FP, FN at each threshold, and decide which to deploy if FP costs 1 and FN costs 5—show your cost calculation.
(5) Deployment: propose concrete monitoring metrics (at least: calibration, drift on three top features, alert thresholds), a rule for triggering retraining, and how you’d guard against data pipeline schema changes.
(6) Online validation: design an A/B test with guardrail metrics, sample-size/duration estimation, and a rollback plan if long-tail segments regress.
(7) Post-mortem: name two plausible failure modes and how you would debug them using specific offline error buckets and online slices.
Quick Answer: This question evaluates a data scientist's end-to-end machine learning competencies: problem framing and metric justification, data sourcing and labeling, model selection and calibration, class-imbalance handling, deployment and monitoring, experimentation design, and post-mortem analysis. It sits in the Machine Learning domain and tests both conceptual understanding and practical application across modeling and MLOps. Interviewers commonly ask it to assess a candidate's ability to justify trade-offs, reason about operational constraints such as latency, fairness, and cost, design valid evaluation and A/B-testing strategies, and define measurable monitoring and rollback criteria.
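The arithmetic in part (4) follows directly from the numbers given in the prompt: 8% of 10,000 examples means 800 positives, and the counts fall out of the precision and recall definitions. A minimal sketch of the computation:

```python
# Part (4): 10,000 validation examples, 8% positives -> 800 positives, 9,200 negatives.

def operating_point(precision, recall, positives=800):
    """Derive expected TP/FP/FN counts and F1 from precision and recall."""
    tp = recall * positives                  # recall = TP / (TP + FN)
    fn = positives - tp
    fp = tp * (1 - precision) / precision    # precision = TP / (TP + FP)
    f1 = 2 * precision * recall / (precision + recall)
    return tp, fp, fn, f1

def expected_cost(fp, fn, fp_cost=1, fn_cost=5):
    return fp_cost * fp + fn_cost * fn

# Baseline @ threshold 0.50: TP=360, FP~154, FN=440, F1~0.548, cost~2354
base_tp, base_fp, base_fn, base_f1 = operating_point(0.70, 0.45)
# With Feature Set X @ threshold 0.30: TP=528, FP~382, FN=272, F1~0.617, cost~1742
new_tp, new_fp, new_fn, new_f1 = operating_point(0.58, 0.66)

print(f"baseline: TP={base_tp:.0f} FP={base_fp:.0f} FN={base_fn:.0f} "
      f"F1={base_f1:.3f} cost={expected_cost(base_fp, base_fn):.0f}")
print(f"with X:   TP={new_tp:.0f} FP={new_fp:.0f} FN={new_fn:.0f} "
      f"F1={new_f1:.3f} cost={expected_cost(new_fp, new_fn):.0f}")
# With FN five times as costly as FP, the calibrated model at threshold 0.30
# wins on expected cost (~1742 vs ~2354), so it is the one to deploy.
```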
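For the drift monitoring asked for in part (5), one common choice is the Population Stability Index (PSI) computed per feature against a training-time baseline, with the rule-of-thumb alert threshold of 0.2. A minimal sketch on synthetic data (the bin count, threshold, and distributions here are illustrative assumptions, not from the prompt):

```python
import numpy as np

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of live data vs. a training baseline."""
    # Bin edges come from the baseline's quantiles; open the ends so
    # out-of-range live values are still counted.
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    e_frac = np.histogram(expected, edges)[0] / len(expected) + eps
    a_frac = np.histogram(actual, edges)[0] / len(actual) + eps
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)   # stand-in for a training feature
stable = rng.normal(0.0, 1.0, 10_000)     # live data, no drift
shifted = rng.normal(1.0, 1.0, 10_000)    # live data, mean shifted by 1 std

print(f"stable PSI:  {psi(baseline, stable):.3f}")   # near 0 -> no alert
print(f"shifted PSI: {psi(baseline, shifted):.3f}")  # above 0.2 -> alert
```

In practice this would run on a schedule for the top features, alongside a calibration check (e.g., expected vs. observed positive rate per score bucket), and a sustained PSI breach would feed the retraining-trigger rule.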
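The sample-size estimation in part (6) is usually a two-proportion power calculation. A sketch using the standard normal-approximation formula; the 5% baseline rate and +0.5pp minimum detectable effect are hypothetical placeholders, not values from the prompt:

```python
from math import ceil, sqrt
from statistics import NormalDist

def samples_per_arm(p_base, mde_abs, alpha=0.05, power=0.8):
    """Per-arm sample size for a two-sided two-proportion z-test."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha
    z_b = NormalDist().inv_cdf(power)           # critical value for power
    p_new = p_base + mde_abs
    p_bar = (p_base + p_new) / 2
    n = ((z_a * sqrt(2 * p_bar * (1 - p_bar))
          + z_b * sqrt(p_base * (1 - p_base) + p_new * (1 - p_new))) ** 2
         / mde_abs ** 2)
    return ceil(n)

# Hypothetical: 5% baseline engagement rate, detect a +0.5pp absolute lift.
n = samples_per_arm(0.05, 0.005)
print(f"~{n:,} users per arm")
```

Dividing the per-arm size by expected daily eligible traffic gives the test duration; guardrail metrics and long-tail segment slices would be monitored over the same window, with rollback if any regress beyond a pre-registered bound.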