How would you evaluate stolen-post detection?
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
You are interviewing for a Meta DSA (product analytics / data science) role. The product team is launching a new **Stolen Post Detection** algorithm that flags posts suspected of being copied/reposted without attribution, and then triggers actions (e.g., downrank, warning label, creator notification, or removal).
Design an evaluation plan covering:
1) **Problem diagnosis & clarification:** What questions would you ask to clarify the product goal and the meaning of “stolen” (e.g., exact duplicate vs paraphrase vs meme templates), enforcement actions, and success criteria?
2) **Harms & tradeoffs:** Enumerate likely failure modes and harms of false positives vs false negatives, including different stakeholder impacts (original creator, reposter, viewers, moderators).
3) **Metrics:** Propose a metric framework with (a) primary success metrics, (b) guardrails, and (c) offline model metrics. Include at least one metric that can move in opposite directions depending on threshold choice (a threshold-sweep sketch illustrating this follows the list).
4) **Experiment design:** Propose an online experiment (or a quasi-experiment if a clean A/B test is hard). Address logging, the unit of randomization, interference/network effects, ramp strategy, and how you would compute or reason about power/MDE (a worked MDE sketch follows the list).
5) **Post-launch monitoring:** What would you monitor to detect regressions or gaming, and how would you iterate on thresholds/policy over time?
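For item 3, the classic pair of metrics that move in opposite directions as the flagging threshold shifts is precision and recall of the "flag as stolen" decision. Below is a minimal sketch using synthetic data; the base rate, score distribution, and thresholds are all illustrative assumptions, not Meta values:

```python
import numpy as np

# Hypothetical labeled sample: 1 = truly stolen, 0 = original.
# Scores stand in for the detector's stolen-probability outputs (synthetic).
rng = np.random.default_rng(0)
labels = rng.binomial(1, 0.05, size=10_000)  # assumed ~5% base rate of stolen posts
scores = np.clip(0.6 * labels + rng.normal(0.2, 0.15, size=10_000), 0.0, 1.0)

def precision_recall(scores, labels, threshold):
    """Precision and recall of the 'flag as stolen' decision at a given threshold."""
    flagged = scores >= threshold
    tp = np.sum(flagged & (labels == 1))
    precision = tp / max(flagged.sum(), 1)  # of flagged posts, share truly stolen
    recall = tp / max(labels.sum(), 1)      # of stolen posts, share we caught
    return precision, recall

# Raising the threshold typically raises precision (fewer wrongly punished
# reposters) while lowering recall (more stolen posts slip through) -- the
# opposite-direction behavior item 3 asks for.
for t in [0.3, 0.5, 0.7, 0.9]:
    p, r = precision_recall(scores, labels, t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

In an interview you would tie each direction back to a stakeholder harm from item 2: precision protects reposters and viewer trust, recall protects original creators.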
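For the power/MDE part of item 4, one common back-of-envelope approach is the two-sided two-proportion z-test approximation. The sketch below assumes a hypothetical rate metric (e.g., share of viewer sessions containing a "stolen content" report) with an illustrative 0.4% baseline; the function name and numbers are assumptions for illustration:

```python
from scipy.stats import norm

def mde_two_proportion(baseline_rate, n_per_arm, alpha=0.05, power=0.8):
    """Approximate absolute minimum detectable effect for a two-sided
    two-proportion z-test, assuming equal arms and variance at baseline."""
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = norm.ppf(power)           # 0.84 for 80% power
    se = (2 * baseline_rate * (1 - baseline_rate) / n_per_arm) ** 0.5
    return (z_alpha + z_beta) * se

# Hypothetical baseline: 0.4% of viewer sessions report stolen content.
for n in [100_000, 1_000_000, 10_000_000]:
    mde = mde_two_proportion(0.004, n)
    print(f"n per arm={n:>10,}  MDE={mde:.5f}  ({mde / 0.004:.1%} relative)")
```

A good answer also notes that if you randomize at the creator or network-cluster level to limit interference, the effective sample size shrinks by the design effect, so the real MDE is larger than this per-user approximation suggests.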
Quick Answer: This question tests product analytics, experimental design, and causal thinking for content-moderation algorithms: metric specification, trade-off and harm analysis, and the logistics of running online experiments. It is commonly asked to gauge a data scientist's ability to balance detection accuracy, stakeholder impact, and business objectives in a production feature. At a high level it probes system-level reasoning about problem scoping, failure modes, metric frameworks, A/B or quasi-experiment setup, and post-launch monitoring, without requiring implementation-level detail.