Harmful-Content Detection: Measurement Plan and Experiment Design
Objective
You are launching a new harmful-content detection system and must define how to measure its impact on user harm and platform health.
Tasks
- Propose metrics that measure both the severity and the prevalence of inappropriate content. For each metric, explain why it was chosen and list pros/cons.
- Define and justify the View Prevalence metric.
- Design an online A/B experiment to evaluate the new model. Include:
  - Hypotheses
  - Primary success metric
  - Guardrail metrics
  - Sample-size and runtime estimations
  - Steps to analyze and interpret results
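
As a starting point for the View Prevalence task, the sketch below assumes one common reading of the metric: the share of sampled content views that land on violating content, estimated from a human-labeled sample of views. The `ViewSample` type, the `severity` weight in [0, 1], and the severity-weighted variant are illustrative assumptions, not definitions given in this prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewSample:
    """One sampled content view with a human-review label (hypothetical schema)."""
    is_violating: bool
    severity: float  # assumed weight in [0, 1]; 0.0 for benign content


def view_prevalence(samples):
    """Share of sampled views that landed on violating content."""
    if not samples:
        raise ValueError("need at least one sampled view")
    return sum(s.is_violating for s in samples) / len(samples)


def severity_weighted_prevalence(samples):
    """Average severity per sampled view, so rare but severe views still count."""
    if not samples:
        raise ValueError("need at least one sampled view")
    return sum(s.severity for s in samples if s.is_violating) / len(samples)


# Toy sample: 96 benign views, 3 low-severity violations, 1 high-severity violation.
samples = (
    [ViewSample(False, 0.0)] * 96
    + [ViewSample(True, 0.25)] * 3
    + [ViewSample(True, 1.0)]
)
prevalence = view_prevalence(samples)
weighted = severity_weighted_prevalence(samples)
```

Counting views rather than pieces of content ties the metric to exposure: a violating post seen a million times contributes more than one nobody saw. The weighted variant is one possible way to balance frequency against severity.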
Hints
- Tie metrics explicitly to user harm; balance severity and frequency.
- Include power analysis, segment checks, and risk mitigations.
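
For the power-analysis hint, a minimal sketch of a sample-size and runtime estimate follows. It assumes the primary metric is a proportion (such as view prevalence), a two-sided two-proportion z-test, and independent observations. The function names, the default `alpha` and `power`, and the example traffic figures are hypothetical. If randomization is by user while the metric is measured per view, views from the same user are correlated, so this figure would likely understate the sample actually needed.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p_control, relative_reduction, alpha=0.05, power=0.8):
    """Approximate observations per arm to detect a relative drop in a proportion.

    Uses the normal-approximation formula for a two-sided two-proportion test.
    """
    if not 0 < p_control < 1:
        raise ValueError("p_control must be in (0, 1)")
    if not 0 < relative_reduction < 1:
        raise ValueError("relative_reduction must be in (0, 1)")
    p_treat = p_control * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    delta = p_control - p_treat
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


def runtime_days(n_per_arm, daily_units_per_arm):
    """Days needed to collect n_per_arm observations at a steady daily rate."""
    if daily_units_per_arm <= 0:
        raise ValueError("daily_units_per_arm must be positive")
    return ceil(n_per_arm / daily_units_per_arm)


# Hypothetical inputs: 0.1% baseline prevalence, 10% relative reduction,
# and 500,000 labeled or logged observations per arm per day.
n = sample_size_per_arm(p_control=0.001, relative_reduction=0.10)
days = runtime_days(n, daily_units_per_arm=500_000)
```

Even when the computed runtime is short, running for at least one full weekly cycle is a common safeguard, since content mix and reporting behavior tend to vary by day of week.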