Define metrics for harmful-content severity
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: Hard
Interview Round: Technical Screen
## Context
You are a Data Scientist working on integrity/harmful content for a social media product. The company wants a single "severity" metric (or metric suite) to track how severe the policy-violating content on the platform is over time and to evaluate integrity interventions (ranking changes, enforcement actions, classifiers, human review).
## Problem
1. **Propose metrics to measure the severity of violating/harmful content** on the platform.
- Define what each metric means.
- Specify the unit of analysis (content item, user, impression/view, session, day).
- Clarify what counts as “violation” (e.g., policy-violating content confirmed by human review or high-confidence classifier).
2. The team suggests using **View Prevalence** as the main KPI:
- Example definition:
**View Prevalence (VP)** = (views of violating content) / (all views), where "views" may be measured as impressions.
- **Discuss pros and cons of View Prevalence** as a primary severity metric.
3. **Discuss key tradeoffs** when choosing/optimizing these metrics.
- Include at least: user safety vs engagement, precision vs recall, reporting robustness vs sensitivity to change, and fairness/coverage across regions/languages.
4. If you were asked to recommend a final metric suite, **which metric would you pick as the primary KPI**, and what would be your **diagnostic and guardrail metrics**?
Assume you have:
- Impression/view logs, content metadata, policy labels from human review (partial coverage), and ML classifier scores.
- Interventions can change both the *amount* of violating content and the *distribution of views* across content.
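As a starting point for part 2, the example definition of View Prevalence (and a severity-weighted variant, one common extension) can be sketched over a toy impression log. All field names, severity tiers, and weights below are illustrative assumptions, not part of the prompt or any real schema.

```python
# Hypothetical sketch: View Prevalence (VP) and a severity-weighted variant
# computed from an impression log. Field names (content_id, is_violating,
# severity_tier) and the tier weights are assumptions for illustration.

SEVERITY_WEIGHTS = {"none": 0.0, "low": 1.0, "medium": 3.0, "high": 10.0}

def view_prevalence(impressions):
    """Share of impressions that landed on violating content."""
    total = len(impressions)
    if total == 0:
        return 0.0
    violating = sum(1 for imp in impressions if imp["is_violating"])
    return violating / total

def weighted_view_prevalence(impressions, weights=SEVERITY_WEIGHTS):
    """Severity-weighted prevalence: each impression contributes in
    proportion to the severity tier of the content it landed on."""
    total = len(impressions)
    if total == 0:
        return 0.0
    return sum(weights[imp["severity_tier"]] for imp in impressions) / total

# Toy log: 4 impressions, one on violating content at "high" severity.
log = [
    {"content_id": "a", "is_violating": False, "severity_tier": "none"},
    {"content_id": "b", "is_violating": False, "severity_tier": "none"},
    {"content_id": "c", "is_violating": True,  "severity_tier": "high"},
    {"content_id": "d", "is_violating": False, "severity_tier": "none"},
]
print(view_prevalence(log))           # 0.25
print(weighted_view_prevalence(log))  # 2.5
```

The weighted variant illustrates one of the tradeoffs in part 3: it is more sensitive to the worst content, but the choice of weights is a judgment call that affects comparability over time.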
Quick Answer: This question evaluates a data scientist's skills in metric design, measurement, and analytics for content integrity within the Analytics & Experimentation domain. A strong answer defines severity metrics with clear units of analysis and labeling rules, weighs View Prevalence as a primary KPI, and recommends a metric suite with diagnostic and guardrail metrics.