Design harmful-content evaluation

This question evaluates a data scientist's ability to design a measurement and experimentation framework for content integrity, covering severity definition, signal selection, metric construction, A/B test design, and mitigation of bias and spillovers in harmful-content evaluation.

Q: What difficulty level is this interview question?

This is a medium-difficulty Analytics & Experimentation question, commonly asked in Technical Screen rounds at Meta.

Q: What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Meta during technical interviews.

Question

You are a Data Scientist working on content integrity at a large social media platform. The team wants to reduce harmful content shown to users, but not all harmful content is equally severe. Examples may include spam, harassment, hate speech, self-harm content, graphic violence, and misinformation.

Design a measurement and experimentation framework for this problem.

Please address the following:

Define severity
- How would you define the severity of harmful content?
- What signals would you use (for example: policy labels, human review, user reports, downstream user harm, virality, repeat exposure, content type)?
- Would you use a binary label, ordinal levels, or a continuous score? Why?

Metrics
- What core metrics would you track to measure harmful content on the platform?
- How would you distinguish between:
  - prevalence of harmful content,
  - exposure to harmful content,
  - severity-weighted exposure,
  - enforcement accuracy,
  - user experience side effects?
- What are the pros and cons of each metric?
- What denominator would you use: content created, content viewed, active users, sessions, or impressions?
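
The distinction between prevalence, exposure, and severity-weighted exposure can be made concrete with a small sketch. Everything here is an assumption for illustration: the `Impression` record, the label names, and the `SEVERITY_WEIGHTS` values are hypothetical and not taken from any real policy. Note that prevalence is computed over viewed content only, since the sketch has no record of content that was created but never seen.

```python
from dataclasses import dataclass

# Hypothetical severity weights; real values would come from policy and research.
SEVERITY_WEIGHTS = {"none": 0.0, "spam": 1.0, "harassment": 5.0, "self_harm": 20.0}


@dataclass(frozen=True)
class Impression:
    viewer_id: str
    content_id: str
    label: str  # must be a key of SEVERITY_WEIGHTS


def harm_metrics(impressions):
    """Compute four harm metrics, each with a different denominator."""
    n = len(impressions)
    if n == 0:
        raise ValueError("need at least one impression")

    # One label per distinct piece of content that was viewed.
    content_labels = {imp.content_id: imp.label for imp in impressions}
    harmful_content = sum(1 for label in content_labels.values() if label != "none")

    harmful_impressions = [imp for imp in impressions if imp.label != "none"]
    viewers = {imp.viewer_id for imp in impressions}
    exposed_viewers = {imp.viewer_id for imp in harmful_impressions}
    weighted_harm = sum(SEVERITY_WEIGHTS[imp.label] for imp in impressions)

    return {
        # Denominator: distinct viewed content.
        "content_prevalence": harmful_content / len(content_labels),
        # Denominator: impressions (view-weighted prevalence).
        "exposure_rate": len(harmful_impressions) / n,
        # Denominator: impressions, scaled per 1,000 and weighted by severity.
        "severity_weighted_per_1k": 1000 * weighted_harm / n,
        # Denominator: distinct viewers.
        "viewer_reach": len(exposed_viewers) / len(viewers),
    }
```

In this sketch a single severe item can dominate the severity-weighted metric while barely moving the exposure rate, which is one reason the choice of weights and denominator deserves explicit discussion in an answer.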

Experiment design
- Suppose the team launches a new ranking or detection model intended to reduce harmful-content exposure. How would you evaluate it in an A/B test?
- What would be the primary success metric, guardrail metrics, and possible long-term metrics?
- What should the randomization unit be: user, viewer-session, content, creator, network/community, or geography? Discuss tradeoffs.
- How would you handle interference or spillover effects, given that content can spread across users and social graphs?
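
One possible shape for the evaluation is sketched below, assuming a per-unit outcome such as severity-weighted exposure. The function names, the salt string, and the 50/50 split are hypothetical. The `unit_id` can be a user ID or, to limit spillover, a cluster or community ID; in the cluster case the outcomes passed to `diff_in_means` should be aggregated to one value per cluster so that the standard error reflects the randomization unit. The interval uses a normal approximation with a Welch-style standard error, which is a simplification.

```python
import hashlib
import math
from statistics import mean, variance


def assign_arm(unit_id: str, salt: str = "harm-ranker-v1") -> str:
    """Deterministically assign a randomization unit to an arm (50/50 split).

    The salt is a hypothetical experiment name; changing it reshuffles units.
    """
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"


def diff_in_means(control, treatment, z=1.96):
    """Return (delta, (low, high)) for mean(treatment) - mean(control).

    Each list holds one outcome per randomization unit. The interval is a
    normal approximation using unpooled sample variances.
    """
    if len(control) < 2 or len(treatment) < 2:
        raise ValueError("need at least two units per arm")
    delta = mean(treatment) - mean(control)
    se = math.sqrt(
        variance(control) / len(control) + variance(treatment) / len(treatment)
    )
    return delta, (delta - z * se, delta + z * se)
```

A negative `delta` with an interval entirely below zero would suggest reduced harmful exposure on the primary metric; the same function can be applied to guardrail metrics, where the launch question is usually whether the interval rules out an unacceptable regression.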

Biases and pitfalls
- What sources of selection bias, labeling bias, delayed feedback, or Simpson’s paradox might appear?
- How would you account for rare but very severe harms versus common low-severity harms?
- How would you prevent the team from “improving” one metric while making the platform worse overall?
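
Simpson's paradox is easiest to see with numbers. The counts below are made up purely for illustration, as are the segment names: the treatment arm has a lower harmful-impression rate within each segment, yet a higher pooled rate, because its traffic is concentrated in the high-risk segment. In a well-randomized test the segment mix should be balanced across arms, so a pattern like this would usually point to a sample ratio mismatch or to segmenting on a variable that the treatment itself affects.

```python
# Made-up counts: (harmful_impressions, total_impressions) per segment and arm.
COUNTS = {
    "low_risk": {"control": (10, 1000), "treatment": (1, 200)},
    "high_risk": {"control": (20, 200), "treatment": (80, 1000)},
}


def segment_rates(counts):
    """Harmful-impression rate for each arm within each segment."""
    return {
        segment: {arm: harmful / total for arm, (harmful, total) in arms.items()}
        for segment, arms in counts.items()
    }


def pooled_rates(counts):
    """Harmful-impression rate for each arm after summing across segments."""
    totals = {}
    for arms in counts.values():
        for arm, (harmful, total) in arms.items():
            prev_harmful, prev_total = totals.get(arm, (0, 0))
            totals[arm] = (prev_harmful + harmful, prev_total + total)
    return {arm: harmful / total for arm, (harmful, total) in totals.items()}


if __name__ == "__main__":
    print(segment_rates(COUNTS))  # treatment is lower in both segments
    print(pooled_rates(COUNTS))   # treatment is higher once segments are pooled
```

Reporting the segment-level rates alongside the pooled rate, and checking that the segment mix matches across arms, is one way to catch this before it drives a launch decision.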

Your answer should be practical and should explain metric definitions, tradeoffs, experiment design choices, and how you would make a final launch recommendation.
