Context
You are a Data Scientist at a social media platform working on harmful content (e.g., hate/harassment, self-harm, violence, sexual exploitation, misinformation). A team proposes a new intervention (e.g., a ranking demotion, a removal policy change, or a new ML classifier plus enforcement workflow).
Task
- Define “severity of harmful content.”
- Propose a severity framework that supports both measurement and decision-making.
- Explain the tradeoffs (e.g., interpretability vs. sensitivity, policy alignment, subjectivity, multilingual considerations).
- Design metrics (primary/diagnostic/guardrails).
  - Provide at least:
    - One **primary** metric capturing harmful-content impact.
    - Several **diagnostic** metrics that help explain movement.
    - Several **guardrail** metrics to detect unintended harm.
  - Specify exact definitions (numerators/denominators) and any weighting (e.g., by exposure).
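As a concrete illustration of an exposure-weighted primary metric, the sketch below computes a severity-weighted harmful-view prevalence. The severity tiers and weights are hypothetical placeholders, not an actual policy taxonomy; the numerator is severity-weighted views of harmful items and the denominator is total views.

```python
from dataclasses import dataclass

# Hypothetical severity tiers and weights -- illustrative only.
SEVERITY_WEIGHT = {"low": 1.0, "medium": 3.0, "high": 9.0}

@dataclass
class ContentStats:
    views: int      # exposure: how many times the item was viewed
    harmful: bool   # labeled harmful by policy rules / review / ML
    severity: str   # severity tier, meaningful only when harmful

def weighted_harm_prevalence(items: list) -> float:
    """Exposure-weighted harmful-view prevalence.

    Numerator:   views of harmful items, weighted by severity tier.
    Denominator: total views across all items.
    """
    numerator = sum(i.views * SEVERITY_WEIGHT[i.severity]
                    for i in items if i.harmful)
    denominator = sum(i.views for i in items)
    return numerator / denominator if denominator else 0.0
```

For example, 100 benign views plus 10 high-severity harmful views gives (10 × 9) / 110 ≈ 0.82 weighted harmful views per view, versus 10/110 unweighted — the weighting makes the metric sensitive to severity, at some cost in interpretability.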
- Design an experiment to evaluate the intervention.
  - Choose an appropriate **randomization unit** (viewer/user, author, content item, session, community/cluster, geo, etc.) and justify it.
  - Discuss common pitfalls: interference/spillover, network effects, contamination, novelty effects, delayed outcomes, and measurement error.
  - Explain how you would analyze the results (e.g., intent-to-treat vs. per-protocol, variance reduction, segmentation, multiple testing).
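For the variance-reduction point, one standard technique is CUPED, which adjusts each unit's in-experiment metric using its pre-experiment value of the same metric. A minimal sketch, assuming user-level randomization and NumPy arrays aligned by user:

```python
import numpy as np

def cuped_adjust(y: np.ndarray, x: np.ndarray) -> np.ndarray:
    """CUPED adjustment: y_adj = y - theta * (x - mean(x)).

    y: per-user metric measured during the experiment.
    x: the same metric for the same users, pre-experiment.
    theta = cov(y, x) / var(x) minimizes the adjusted variance.
    The mean of y is preserved, so the treatment-effect estimate
    is unchanged while its variance shrinks.
    """
    theta = np.cov(y, x)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())
```

The higher the correlation between the pre-period and in-experiment metric, the larger the variance reduction; for rare harmful-content outcomes this can substantially shorten experiment runtimes.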
- Pros/cons and edge cases.
  - Highlight failure modes: label noise, policy changes during the test, adversarial behavior, reporting bias, and differences across regions/languages.
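One way to reason about label noise concretely: if the classifier's sensitivity and specificity are known (e.g., estimated from a human-reviewed audit sample), the observed prevalence can be corrected with the Rogan-Gladen estimator. A sketch under those assumptions — note that error rates often differ by region/language, so in practice the correction would be applied per segment:

```python
def corrected_prevalence(p_obs: float, sensitivity: float,
                         specificity: float) -> float:
    """Rogan-Gladen correction for prevalence measured with an
    imperfect classifier.

    p_obs: fraction of items the classifier flags as harmful.
    Returns an estimate of the true harmful fraction:
        (p_obs + specificity - 1) / (sensitivity + specificity - 1)
    Valid when sensitivity + specificity > 1; the estimate is
    clipped to [0, 1] since sampling noise can push it outside.
    """
    est = (p_obs + specificity - 1) / (sensitivity + specificity - 1)
    return min(1.0, max(0.0, est))
```

For instance, with sensitivity 0.8 and specificity 0.95, a true prevalence of 10% produces an observed flag rate of 0.1 × 0.8 + 0.9 × 0.05 = 12.5%; the correction recovers the 10% figure.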
Assumptions
- You can log impressions/views, engagement, reports, enforcement actions, and model outputs.
- “Harmful content” is determined by a combination of policy rules, human review, and ML signals (imperfect).