Evaluate New Model's Performance Against Existing System
Company: Meta
Role: Data Scientist
Category: Machine Learning
Difficulty: medium
Interview Round: Technical Screen
##### Scenario
A new machine-learning model flags harmful posts; leadership wants evidence it outperforms the old system.
##### Question
How would you evaluate the performance of the new harmful-content detection model against the existing model, or against having no model at all? Describe both offline evaluation (confusion-matrix metrics) and online A/B testing approaches, addressing the precision-recall trade-off.
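A strong answer can make the offline half concrete with a short sketch like the one below. It uses scikit-learn on hypothetical held-out labels (`y_true`) and model scores (`y_score`); the 90% recall floor is an illustrative policy choice, not a prescribed operating point.

```python
import numpy as np
from sklearn.metrics import (
    average_precision_score, confusion_matrix, f1_score,
    precision_recall_curve, precision_score, recall_score, roc_auc_score,
)

# Hypothetical held-out data: y_true = human-reviewed labels (1 = harmful),
# y_score = the new model's predicted probabilities.
y_true = np.array([0, 1, 1, 0, 1, 0, 0, 1, 0, 1])
y_score = np.array([0.10, 0.80, 0.40, 0.60, 0.90, 0.20, 0.45, 0.70, 0.05, 0.55])

threshold = 0.5  # initial operating point; tune against review capacity
y_pred = (y_score >= threshold).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision_score(y_true, y_pred):.2f} "
      f"recall={recall_score(y_true, y_pred):.2f} "
      f"F1={f1_score(y_true, y_pred):.2f}")

# Threshold-free summaries; PR-AUC is usually more informative than ROC-AUC
# when harmful posts are a small minority of all posts.
print(f"ROC-AUC={roc_auc_score(y_true, y_score):.2f} "
      f"PR-AUC={average_precision_score(y_true, y_score):.2f}")

# Sweep thresholds to expose the precision-recall trade-off, then pick the
# highest-precision operating point that still meets a recall floor.
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
candidates = [(p, r, t) for p, r, t in zip(precision, recall, thresholds) if r >= 0.90]
if candidates:
    best_p, best_r, best_t = max(candidates)
    print(f"chosen threshold={best_t:.2f} precision={best_p:.2f} recall={best_r:.2f}")
```

Running the same sweep for the existing model on the same held-out set gives an apples-to-apples offline comparison at matched recall (or matched review volume).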
##### Hints
Mention metrics (precision, recall, F1, ROC-AUC, PR-AUC), calibration, business KPIs, guardrail metrics, and experiment design.
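To illustrate the calibration hint, here is a minimal check on the same hypothetical scores: a Brier score plus a coarse reliability table comparing mean predicted probability to the observed harmful rate per bin.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

# Same hypothetical held-out labels and scores as in the offline sketch.
y_true = np.array([0, 1, 1, 0, 1, 0, 0, 1, 0, 1])
y_score = np.array([0.10, 0.80, 0.40, 0.60, 0.90, 0.20, 0.45, 0.70, 0.05, 0.55])

# Brier score: mean squared error of the probabilities (lower is better).
print(f"Brier score: {brier_score_loss(y_true, y_score):.3f}")

# Reliability table: within each score bin, the mean predicted probability
# should track the observed harmful rate; large gaps mean the scores cannot
# be read as risks when setting review or takedown thresholds.
observed, predicted = calibration_curve(y_true, y_score, n_bins=4)
for obs, pred in zip(observed, predicted):
    print(f"mean predicted={pred:.2f}  observed harmful rate={obs:.2f}")
```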
##### Quick Answer
This question tests a data scientist's competency in model evaluation and experimentation: offline metrics (precision, recall, F1/Fβ, ROC-AUC, PR-AUC, calibration), dataset and label quality, handling of class imbalance, thresholding and triage policies, online experiment design with guardrail metrics, and safety and ethical guardrails.
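For the online half, a minimal sketch of the experiment readout, assuming users are randomly assigned between the existing system (control) and the new model (treatment) and that the primary comparison is a simple rate, e.g. the share of users who report seeing harmful content. All counts below are hypothetical.

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical experiment counts: control = existing system, treatment = new model.
reports = [420, 350]          # users who reported harmful content in each arm
exposed = [100_000, 100_000]  # users exposed in each arm

# Two-proportion z-test on the report rate (lower is better for the new model).
z_stat, p_value = proportions_ztest(count=reports, nobs=exposed)
print(f"control rate={reports[0] / exposed[0]:.4%}  "
      f"treatment rate={reports[1] / exposed[1]:.4%}")
print(f"z={z_stat:.2f}  p={p_value:.4f}")
```

In practice the ship decision would also require flat or improved guardrail metrics (appeal and reinstatement rates, reviewer load, engagement) and a pre-registered minimum detectable effect, not just a significant p-value on the primary metric.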