Design a measurement plan to detect fake accounts
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: easy
Interview Round: Onsite
## Context
You work on a social platform. The only product surface you can rely on is **friend requests** (sending/receiving/accepting/declining). Assume you have **no existing anti-fake model, no rules, and no established metrics**.
## Task
1. **Define “fake account” operationally.**
- What behaviors qualify (spam, scam, bot, account farming)?
- How will you handle ambiguous/gray accounts?
2. **Design the data and instrumentation.**
- What events and fields would you log for friend requests and subsequent user actions?
- What joins/identifiers are needed to track outcomes over time?
3. **Propose an initial detection approach without a model.**
- What heuristic signals or risk scoring would you start with (rate limits, graph patterns, acceptance ratios, burstiness, messaging-after-accept if available, etc.)?
- How would you choose thresholds and prevent hurting legitimate users?
4. **Measurement & evaluation plan.**
- How will you obtain labels (manual review, user reports, enforcement actions) and deal with delayed/biased labels?
- What are the **primary**, **diagnostic**, and **guardrail** metrics?
5. **Platform-level reporting.**
- How would you estimate and report the platform’s fake-account problem over time (prevalence/incidence), given that you only observe partial ground truth?
- What would you show to an executive audience versus an operational team?
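To make step 3 concrete, here is a minimal sketch of a rule-based risk score built only from friend-request aggregates. The field names (`sent`, `accepted`, `declined`, `max_per_hour`, `account_age_days`), the weights, and the cutoffs are all hypothetical placeholders for illustration. In practice they would be tuned against reviewed samples rather than fixed in advance.

```python
from dataclasses import dataclass


@dataclass
class SenderStats:
    """Per-account aggregates over a fixed window (hypothetical fields)."""
    sent: int              # friend requests sent in the window
    accepted: int          # of those, how many were accepted
    declined: int          # of those, how many were declined
    max_per_hour: int      # largest number of requests sent in any single hour
    account_age_days: int  # days since the account was created


def risk_score(s: SenderStats) -> float:
    """Return a score in [0, 1]; higher means more suspicious.

    Weights and cutoffs are illustrative assumptions, not calibrated values.
    """
    if s.sent == 0:
        return 0.0  # no friend-request activity, nothing to score

    resolved = s.accepted + s.declined
    # Smoothed decline rate: the add-one prior pulls low-volume senders
    # toward 0.5 so a single decline does not dominate the score.
    decline_rate = (s.declined + 1) / (resolved + 2)
    volume = min(s.sent / 100, 1.0)        # saturates at 100 requests
    burst = min(s.max_per_hour / 50, 1.0)  # saturates at 50 requests per hour
    newness = 1.0 if s.account_age_days < 7 else 0.0

    score = 0.4 * decline_rate + 0.3 * volume + 0.2 * burst + 0.1 * newness
    return round(score, 3)


# Invented example accounts, for illustration only.
typical = SenderStats(sent=5, accepted=4, declined=0, max_per_hour=2, account_age_days=400)
bursty = SenderStats(sent=300, accepted=10, declined=190, max_per_hour=80, account_age_days=2)
print(risk_score(typical), risk_score(bursty))
```

A transparent weighted sum like this is easy to explain to reviewers and to adjust when a threshold turns out to penalize legitimate heavy users, such as someone who has just joined and is adding many real contacts.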
Quick Answer: This question evaluates a data scientist's ability to design fraud-detection measurement from constrained product signals: event instrumentation, labeling strategy, metric design, and reporting.
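For the platform-level reporting part, one common way to estimate prevalence under partial ground truth is to review a random sample within risk strata and weight each stratum by its share of the population. The sketch below assumes that setup. The function name `stratified_prevalence`, the tuple layout, and the example counts are hypothetical, and the interval uses a normal approximation without a finite-population correction.

```python
import math


def stratified_prevalence(strata):
    """Estimate the fake-account share from stratified manual review.

    strata: list of (population_size, n_reviewed, n_fake) tuples, one per
    risk bucket. Returns (estimate, lower, upper) with an approximate
    95% confidence interval.
    """
    total = sum(pop for pop, _, _ in strata)
    estimate = 0.0
    variance = 0.0
    for pop, n_reviewed, n_fake in strata:
        if n_reviewed == 0:
            raise ValueError("every stratum needs at least one reviewed account")
        weight = pop / total
        rate = n_fake / n_reviewed
        estimate += weight * rate
        # Binomial variance of the stratum rate, scaled by the squared weight.
        variance += weight ** 2 * rate * (1 - rate) / n_reviewed
    half_width = 1.96 * math.sqrt(variance)
    return estimate, max(estimate - half_width, 0.0), min(estimate + half_width, 1.0)


# Invented counts for three risk buckets (low, medium, high), illustration only.
example = [(900_000, 500, 5), (90_000, 500, 50), (10_000, 500, 300)]
est, lower, upper = stratified_prevalence(example)
print(f"prevalence ~ {est:.3%} (95% CI {lower:.3%} to {upper:.3%})")
```

With these made-up inputs the point estimate is about 2.4%. Oversampling the high-risk bucket keeps review costs down, and the weighting keeps that oversampling from inflating the platform-wide number.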