Design harmful-content evaluation
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: medium
Interview Round: Technical Screen
You are a Data Scientist working on content integrity at a large social media platform. The team wants to reduce the **harmful content** shown to users, but not all harmful content is equally severe. Examples include spam, harassment, hate speech, self-harm content, graphic violence, and misinformation.
Design a measurement and experimentation framework for this problem.
Please address the following:
1. **Define severity**
- How would you define the *severity* of harmful content?
- What signals would you use (for example: policy labels, human review, user reports, downstream user harm, virality, repeat exposure, content type)?
- Would you use a binary label, ordinal levels, or a continuous score? Why?
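One answer consistent with the framing above is an ordinal policy tier refined into a continuous score. A minimal sketch, where the tier values, scaling factors, and signal names are all illustrative assumptions rather than a real policy:

```python
# Hypothetical severity scoring: an ordinal policy tier refined by harm signals.
# All tier values and weights below are illustrative, not an actual policy.
POLICY_TIER = {"spam": 1, "misinformation": 2, "harassment": 3,
               "hate_speech": 4, "graphic_violence": 4, "self_harm": 5}

def severity_score(label, confirmed_by_review, report_rate, virality):
    """Continuous severity in [0, 1]: base ordinal tier, discounted when the
    label is unconfirmed, and amplified by user reports and reach."""
    base = POLICY_TIER.get(label, 0) / max(POLICY_TIER.values())
    review_factor = 1.0 if confirmed_by_review else 0.6   # labeling uncertainty
    amplification = min(1.0, report_rate * 10 + virality) # cap at 1
    return base * review_factor * (0.5 + 0.5 * amplification)
```

The ordinal tier keeps the score interpretable for policy teams, while the continuous refinement lets severity-weighted metrics distinguish a viral hate-speech post from an unreported one.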
2. **Metrics**
- What core metrics would you track to measure harmful content on the platform?
- How would you distinguish between:
- prevalence of harmful content,
- exposure to harmful content,
- severity-weighted exposure,
- enforcement accuracy,
- user experience side effects?
- What are the pros and cons of each metric?
- What denominator would you use: content created, content viewed, active users, sessions, or impressions?
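The distinction between prevalence, exposure, and severity-weighted exposure (and the role of the denominator) can be made concrete with a toy impression log; the schema and values here are hypothetical:

```python
# Each impression: (user_id, content_id, is_harmful, severity) — toy schema.
impressions = [
    ("u1", "c1", True, 0.9), ("u1", "c2", False, 0.0),
    ("u2", "c1", True, 0.9), ("u2", "c3", True, 0.2),
    ("u3", "c2", False, 0.0),
]
n_impr = len(impressions)

# Prevalence: share of distinct content that is harmful (content denominator).
content = {c for _, c, *_ in impressions}
harmful_content = {c for _, c, h, _ in impressions if h}
prevalence = len(harmful_content) / len(content)

# Exposure: share of impressions that are harmful (impression denominator).
exposure = sum(1 for *_, h, _ in impressions if h) / n_impr

# Severity-weighted exposure: each harmful view weighted by its severity,
# so one graphic-violence view outweighs several spam views.
sw_exposure = sum(s for *_, s in impressions) / n_impr
```

Note how the same log gives different pictures: prevalence ignores how often harmful items are seen, exposure counts every view equally, and severity weighting shifts the metric toward the worst content.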
3. **Experiment design**
- Suppose the team launches a new ranking or detection model intended to reduce harmful-content exposure. How would you evaluate it in an A/B test?
- What would be the primary success metric, guardrail metrics, and possible long-term metrics?
- What should the **randomization unit** be: user, viewer-session, content, creator, network/community, or geography? Discuss tradeoffs.
- How would you handle interference or spillover effects, given that content can spread across users and social graphs?
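One common mitigation for spillover is cluster randomization: assigning whole communities, rather than individual users, to arms, so content spreading within a community stays inside one treatment condition. A sketch assuming a user-to-community mapping exists; the salt and ids are placeholders:

```python
import hashlib

def assign_arm(cluster_id: str, experiment_salt: str = "harm-exp-v1") -> str:
    """Deterministic 50/50 split at the cluster (community) level."""
    digest = hashlib.sha256(f"{experiment_salt}:{cluster_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"

# Hypothetical user -> community mapping (e.g., from graph clustering).
community_of = {"u1": "grp_a", "u2": "grp_a", "u3": "grp_b"}
arms = {u: assign_arm(c) for u, c in community_of.items()}
```

The tradeoff is fewer effective randomization units (less statistical power) in exchange for less contamination between arms; the salt keeps assignments stable across the experiment and independent of other experiments.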
4. **Biases and pitfalls**
- What sources of selection bias, labeling bias, delayed feedback, or Simpson’s paradox might appear?
- How would you account for rare but very severe harms versus common low-severity harms?
- How would you prevent the team from “improving” one metric while making the platform worse overall?
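A back-of-envelope power calculation shows why rare but very severe harms resist standard A/B measurement. This sketch uses the textbook two-proportion z-test sample-size approximation; the base rates are illustrative:

```python
from math import ceil
from statistics import NormalDist

def required_n_per_arm(p_base, rel_reduction, alpha=0.05, power=0.8):
    """Approximate users per arm to detect a relative reduction in a rate,
    via the standard two-proportion z-test sample-size formula."""
    p_treat = p_base * (1 - rel_reduction)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p_base + p_treat) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p_base * (1 - p_base) + p_treat * (1 - p_treat)) ** 0.5) ** 2
    return ceil(num / (p_base - p_treat) ** 2)

# Detecting a 10% reduction in a 1-in-100,000 severe harm vs. a 1% common harm:
rare = required_n_per_arm(1e-5, 0.10)
common = required_n_per_arm(1e-2, 0.10)
```

Because required sample size scales roughly inversely with the base rate, the rare-harm experiment needs on the order of a thousand times more users, which is why teams often fall back on severity weighting, longer experiments, or proxy metrics for the most severe harms.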
Your answer should be practical: explain your metric definitions, the tradeoffs involved, your experiment-design choices, and how you would arrive at a final launch recommendation.
Quick Answer: This question tests a data scientist's ability to design a measurement and experimentation framework for content integrity: defining severity, choosing signals, constructing metrics, designing an A/B test, and mitigating bias and spillover effects in harmful-content evaluation.