Measure Harmful Content Impact with Key Metrics
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: Medium
Interview Round: Technical Screen
##### Scenario
A social-media platform needs to quantify how severe its harmful or inappropriate user-generated content is and how that content affects users and the business. As a data scientist, you are asked to design the measurement framework.
##### Question
How would you measure the severity and platform impact of harmful content?
1. Which specific metric(s) would you choose as the primary (north-star) measure, and why? Consider candidates such as **View Prevalence**, **Content Prevalence**, and **Reach Prevalence**, and how (if at all) you would incorporate **severity weighting**.
2. What complementary or supporting metrics would you track alongside the primary metric (e.g., user exposure / reach, exposure intensity in the tail, time-weighted exposure, enforcement quality)?
3. Discuss the pros and cons of relying on **View Prevalence alone**. What does it capture well, and what does it hide or get wrong?
4. How would you ensure the measurement is **unbiased and timely** (sampling, human labeling, classifier calibration, confidence intervals, segmentation)?
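For part 4, one way to get an unbiased estimate with a confidence interval is to sample views within strata, send the sample to human labelers, and reweight each stratum by its share of total views. The sketch below is a minimal illustration under stated assumptions: the strata (classifier-score buckets), all counts, and the function name `stratified_prevalence` are hypothetical, and the interval uses a simple normal approximation.

```python
import math

def stratified_prevalence(strata, z=1.96):
    """Estimate view prevalence from human-labeled samples drawn per stratum.

    Each stratum is a dict with:
      total_views: all views in the stratum (known from logs)
      sampled:     number of views sent to human labelers
      violating:   number of sampled views labeled as violating
    Returns (estimate, ci_low, ci_high) using a normal approximation.
    """
    total = sum(s["total_views"] for s in strata)
    estimate = 0.0
    variance = 0.0
    for s in strata:
        weight = s["total_views"] / total        # stratum share of all views
        p = s["violating"] / s["sampled"]        # labeled violation rate
        estimate += weight * p
        variance += weight ** 2 * p * (1 - p) / s["sampled"]
    half_width = z * math.sqrt(variance)
    return estimate, max(0.0, estimate - half_width), min(1.0, estimate + half_width)

# Hypothetical strata: views bucketed by classifier score (illustrative numbers only).
strata = [
    {"name": "low_score",  "total_views": 9_000_000, "sampled": 2000, "violating": 2},
    {"name": "mid_score",  "total_views": 900_000,   "sampled": 1000, "violating": 50},
    {"name": "high_score", "total_views": 100_000,   "sampled": 500,  "violating": 400},
]

estimate, ci_low, ci_high = stratified_prevalence(strata)
print(f"view prevalence = {estimate:.4%} (95% CI {ci_low:.4%} to {ci_high:.4%})")
```

Oversampling the high-score stratum keeps labeling cost down while the reweighting keeps the overall estimate unbiased. With very few positives in a stratum (as in the low-score bucket here), the normal approximation is rough, and a Wilson or bootstrap interval may be a better choice.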
##### Hints
Tie metrics to user exposure and business risk; compare incidence-based (creator/supply-side) vs. view-weighted (exposure-side) rates; address severity buckets, breadth vs. depth of harm, denominator/window sensitivity, and measurement latency. Distinguish how many *items* are harmful, how many *views* are harmful, and how many *users* are touched.
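The item / view / user distinction can be made concrete with a toy view log. The sketch below is illustrative only: the log, the labels, and all variable names are hypothetical, and it assumes every item already has a ground-truth harmful / not-harmful label.

```python
from collections import Counter

# Hypothetical view log: one (user_id, content_id) pair per view.
views = [
    ("u1", "c1"), ("u1", "c3"), ("u1", "c3"), ("u1", "c3"),
    ("u2", "c1"), ("u2", "c3"),
    ("u3", "c1"), ("u3", "c2"),
    ("u4", "c4"), ("u4", "c5"),
]
# Hypothetical ground-truth labels for every item on the platform.
is_harmful = {"c1": False, "c2": False, "c3": True, "c4": False, "c5": False}

# Content prevalence: share of *items* that are harmful (supply side).
content_prevalence = sum(is_harmful.values()) / len(is_harmful)

# View prevalence: share of *views* that land on harmful items (exposure side).
harmful_views = [(u, c) for u, c in views if is_harmful[c]]
view_prevalence = len(harmful_views) / len(views)

# Reach prevalence: share of active *users* with at least one harmful view.
active_users = {u for u, _ in views}
exposed_users = {u for u, _ in harmful_views}
reach_prevalence = len(exposed_users) / len(active_users)

# Exposure intensity: harmful views per exposed user (depth of harm in the tail).
harmful_views_per_user = Counter(u for u, _ in harmful_views)
max_harmful_views = max(harmful_views_per_user.values())

print(f"content prevalence: {content_prevalence:.2f}")
print(f"view prevalence:    {view_prevalence:.2f}")
print(f"reach prevalence:   {reach_prevalence:.2f}")
print(f"max harmful views for one user: {max_harmful_views}")
```

In this toy log, one harmful item out of five accounts for four of ten views and touches two of four users, with one user seeing it three times. The three rates differ because a single item can draw many views, and those views can be concentrated in a few users.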
Quick Answer: This Meta data scientist analytics question, asked in the technical screen, is about measuring the severity and platform impact of harmful content. It asks you to choose a primary metric (view, content, or reach prevalence, optionally severity-weighted), pick complementary metrics, weigh the pros and cons of relying on view prevalence alone, and design an unbiased, timely estimation approach.