You work on Stolen Post Detection for a social platform (detecting content that is copied/reposted without permission).
A new detection algorithm is proposed (e.g., a model producing a stolen-probability score used to downrank, label, or block posts).
Questions
- Problem framing & diagnostics
  - What are the key failure modes and risks (false positives vs false negatives) for stolen-post detection?
  - If stakeholders report “stolen posts are down,” what would you check to validate whether the drop is real or an artifact (measurement issues, reporting changes, seasonality, policy changes, spam shifts, etc.)? See the diagnostic sketch after this list.
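One way to ground the “real vs artifact” check is to normalize reports by exposure and break the trend out by report channel: a drop concentrated in one channel after a reporting-flow change looks like measurement, a broad-based drop looks more like a real effect. A minimal pandas sketch, assuming a hypothetical weekly rollup table (the file and column names are illustrative, not a real schema):

```python
import pandas as pd

# Hypothetical weekly rollup: one row per (week, report channel).
# Columns assumed: week, channel, stolen_reports, impressions.
reports = pd.read_parquet("stolen_report_rollup.parquet")

# Normalize by exposure so a traffic dip doesn't masquerade as less theft.
reports["reports_per_1m_impressions"] = (
    1e6 * reports["stolen_reports"] / reports["impressions"]
)

# Is the decline broad-based, or confined to one channel (e.g., only in-app
# reports fell after a change to the reporting flow)?
trend = reports.pivot_table(
    index="week", columns="channel", values="reports_per_1m_impressions"
)
print(trend.pct_change(periods=4).tail(8))  # 4-week relative change, recent weeks
```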
- Metrics
  Propose:
  - Primary success metric(s) (what you ultimately want to improve)
  - Diagnostic metrics (to understand why things moved)
  - Guardrail metrics (to prevent harm)
  Include at least one metric that handles delayed / noisy ground truth (since “stolen” labels may come from user reports, manual review, or appeals); one such metric is sketched below.
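For the delayed / noisy ground-truth requirement, one common pattern is a maturity-adjusted rate: scale the stolen labels observed so far by the label completeness you would expect at that cohort age, estimated from older cohorts you treat as fully labeled. A minimal sketch, assuming a hypothetical per-post label table (column names are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical per-post table:
#   post_week       - cohort the post belongs to
#   age_days        - days since the post was created
#   label_lag_days  - days from posting until a "stolen" label arrived (NaN if none yet)
posts = pd.read_parquet("post_labels.parquet")

# Maturation curve from cohorts old enough to treat as fully labeled (>= 60 days):
# share of eventual stolen labels that have arrived by lag d.
mature = posts[posts["age_days"] >= 60]
lags = mature["label_lag_days"].dropna()
curve = np.array([(lags <= d).mean() for d in range(61)])

def adjusted_stolen_rate(cohort: pd.DataFrame) -> float:
    """Observed stolen-label rate, scaled by expected label completeness at this age."""
    age = int(cohort["age_days"].min())      # youngest post bounds completeness
    completeness = curve[min(age, 60)]
    observed = cohort["label_lag_days"].notna().mean()
    return observed / max(completeness, 1e-6)

# Usage idea: compare treatment vs control cohorts of the same age on the adjusted rate.
```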
- Experiment design
  Design an online experiment (A/B test or alternative) to evaluate the new algorithm. Address:
  - Randomization unit (post-level vs author-level vs viewer-level) and why
  - Interference / network effects (e.g., copied content affects multiple creators)
  - Exposure definition (who is affected by the change)
  - Sample size / power considerations at a high level (what drives variance); see the power sketch after this list
  - Ramp plan and decision criteria
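For the sample-size question, a back-of-the-envelope calculation under author-level randomization can use the standard two-proportion formula inflated by a design effect for posts clustered within authors. A rough sketch; the base rate, ICC, and posts-per-author figures below are illustrative assumptions, not measured platform values:

```python
import math
from scipy.stats import norm

def authors_needed(p_base, mde_rel, posts_per_author, icc, alpha=0.05, power=0.8):
    """Approximate authors needed per arm to detect a relative change in a per-post rate.

    Two-proportion z-test sample size, inflated by the design effect
    1 + (m - 1) * ICC for posts clustered within authors.
    """
    p_alt = p_base * (1 + mde_rel)
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    p_bar = (p_base + p_alt) / 2
    n_posts = ((z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                + z_b * math.sqrt(p_base * (1 - p_base) + p_alt * (1 - p_alt))) ** 2
               / (p_alt - p_base) ** 2)
    deff = 1 + (posts_per_author - 1) * icc
    return math.ceil(n_posts * deff / posts_per_author)

# Example (assumed numbers): 0.5% stolen-report rate, detect a 10% relative drop,
# ~20 posts per author, ICC of 0.05 across an author's posts.
print(authors_needed(p_base=0.005, mde_rel=-0.10, posts_per_author=20, icc=0.05))
```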
- Tradeoffs and decision
  If offline metrics improve (e.g., higher precision/recall on labeled data) but online engagement drops, how would you decide what to launch, and what follow-ups would you run?