Evaluating a Harmful-Content Detection Model: Offline and Online
Context
You are given a binary classification model that detects harmful content on a social platform and flags items for either removal or down‑ranking. You need to:
- Evaluate the model offline on a labeled validation set.
- Design an online experiment to test the model in production.
Assume class imbalance (harmful content is rare), probabilistic model outputs (scores), and that some enforcement actions (auto‑remove) prevent us from observing true labels unless the measurement is designed around them.
Tasks
- Offline evaluation (labeled validation set):
  - Define and compute core metrics (precision, recall, FPR, ROC/PR curves, AUCs); see the metrics sketch after this list.
  - Assess calibration and choose an operating threshold given policy and cost trade‑offs (calibration/threshold sketch below).
  - Check robustness across slices (e.g., language/region) and over time (slice-audit sketch below).
- Online experiment design:
  - State hypotheses.
  - Define variants (control vs. treatment), including any shadow/canary ramps.
  - Specify randomization unit, traffic split, duration, and significance plan (power-analysis sketch below).
  - Define primary success metrics and guardrails (safety, engagement, fairness, latency).
  - Address measurement challenges (delayed/hidden labels due to enforcement); see the enforcement-holdout sketch below.
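
A minimal sketch of the core offline metrics, assuming `y_true` (0/1 labels) and `y_score` (predicted probabilities) come from the validation set; the function name and default threshold are illustrative:

```python
import numpy as np
from sklearn.metrics import (
    precision_score, recall_score, roc_auc_score,
    average_precision_score, confusion_matrix,
)

def offline_metrics(y_true: np.ndarray, y_score: np.ndarray, threshold: float = 0.5) -> dict:
    """Threshold-free AUCs plus thresholded precision/recall/FPR."""
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        "roc_auc": roc_auc_score(y_true, y_score),
        # PR-AUC (average precision) ignores the large true-negative mass,
        # which matters when harmful content is rare.
        "pr_auc": average_precision_score(y_true, y_score),
    }
```

Under heavy imbalance, PR‑AUC is usually the headline number; ROC‑AUC can look deceptively high simply because of the huge true‑negative count.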
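One way to check calibration and pick an operating threshold is a reliability curve plus an expected-cost sweep. This is a sketch, not the only approach, and the cost values below are placeholders rather than policy:

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

def calibration_report(y_true, y_score, n_bins=10):
    # Reliability curve: observed harmful rate vs. mean predicted score per bin.
    frac_pos, mean_pred = calibration_curve(y_true, y_score, n_bins=n_bins, strategy="quantile")
    return {"brier": brier_score_loss(y_true, y_score), "bins": list(zip(mean_pred, frac_pos))}

def pick_threshold(y_true, y_score, cost_fp=1.0, cost_fn=20.0):
    """Sweep candidate thresholds and minimize expected cost; cost_fn >> cost_fp
    encodes a policy where missing harmful content is worse than over-flagging."""
    best_t, best_cost = 0.5, float("inf")
    for t in np.unique(y_score):
        y_pred = (y_score >= t).astype(int)
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t
```

The cost ratio is the policy knob: raising `cost_fn` pushes the chosen threshold down, trading more false positives for fewer missed harmful items.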
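A quick slice audit, assuming a pandas DataFrame with `label`, `score`, and a `language` column (all column names illustrative); the same function can be run per time window to check drift:

```python
import pandas as pd

def slice_report(df: pd.DataFrame, threshold: float, slice_col: str = "language") -> pd.DataFrame:
    """Per-slice recall and FPR at a fixed threshold; large gaps between
    slices are a robustness/fairness flag even when global metrics look fine."""
    df = df.assign(pred=(df["score"] >= threshold).astype(int))

    def _stats(g):
        harmful, benign = g[g["label"] == 1], g[g["label"] == 0]
        return pd.Series({
            "n": len(g),
            "recall": harmful["pred"].mean() if len(harmful) else float("nan"),
            "fpr": benign["pred"].mean() if len(benign) else float("nan"),
        })

    return df.groupby(slice_col).apply(_stats)
```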
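For the significance plan, a standard two-proportion power calculation gives the per-arm sample size; the baseline rate and minimum detectable effect below are made-up numbers to illustrate the call:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Hypothetical numbers: 1.0% of impressions show harmful content in control,
# and we want to detect a drop to 0.8% (a 20% relative reduction).
effect = proportion_effectsize(0.010, 0.008)
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, ratio=1.0, alternative="two-sided"
)
print(f"~{n_per_arm:,.0f} randomization units per arm")
```

Running the same calculation in reverse against available daily traffic yields the experiment duration needed to reach the target power.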
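Because auto-removed items never accrue organic reports or views, one common workaround, sketched here with illustrative names and rates, is a small randomized measurement holdout that is exempt from enforcement and routed to human review; prevalence in the holdout estimates what users would have seen without the model:

```python
import hashlib
from statsmodels.stats.proportion import proportion_confint

HOLDOUT_RATE = 0.01  # fraction of traffic exempt from auto-remove (illustrative)

def in_measurement_holdout(item_id: str) -> bool:
    # Deterministic hash-based bucketing so an item's holdout status is stable
    # across services and restarts (unlike Python's salted built-in hash()).
    bucket = int(hashlib.sha256(item_id.encode()).hexdigest(), 16) % 10_000
    return bucket < int(HOLDOUT_RATE * 10_000)

def holdout_prevalence(n_reviewed: int, n_harmful: int):
    """Estimate the harmful-content rate absent enforcement, with a Wilson
    interval; the holdout is the only label source unbiased by enforcement."""
    est = n_harmful / n_reviewed
    lo, hi = proportion_confint(n_harmful, n_reviewed, alpha=0.05, method="wilson")
    return est, (lo, hi)
```

The holdout must stay small enough to bound user exposure to harm, which is itself a guardrail trade-off the experiment design should state explicitly.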