
Design experiment for fake accounts impact

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in experimental design and causal inference for networked social platforms, including unit and cluster randomization choices, metric selection and exposure windows, power estimation, interference diagnostics, treatment misclassification handling, ethical ramping, and quasi-experimental backups.


Company: Meta

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: Onsite



Oct 13, 2025, 9:49 PM

Experiment Design: Removing Detected Fake Accounts and Measuring Causal Impact

Context: You are designing an end-to-end experiment on a large, interaction-heavy social platform to remove detected fake accounts (or hide them from some users) and estimate causal impact on real users' experience. Because users are connected, network interference is a first-order concern.

Be specific and address the following:

  1. Experiment unit and randomization
    • Choose and justify between user-level, ego-network cluster, or geography-level randomization.
    • If you choose clusters, describe how you would construct them to minimize cross-treatment contamination while maintaining power.
  2. Primary and guardrail metrics
    • Specify exact metric definitions and windows (exposure-based vs calendar-based). Examples include: comments_per_view, 7-day retention of real accounts, abuse reports per 1K views.
  3. Power and duration
    • Provide a concrete back-of-envelope sample-size calculation assuming:
      • Detectable effect: 0.5% relative change in comments_per_view
      • Baseline mean: 0.12
      • Overdispersion present
      • Intra-cluster correlation (ICC): 0.02
  4. Interference diagnostics
    • Propose two tests to quantify spillovers (e.g., ghost exposure analysis for users connected to treated removals; edge-cut A/A).
    • Define the expected null for each test.
  5. Noncompliance and misclassification
    • Detection of fake accounts is imperfect. Outline an IV or CUPED/DID approach to recover LATE using removal intensity as an instrument, and list assumptions and falsification checks.
  6. Ramp and ethics
    • Define a staged rollout with kill-switch criteria using guardrails (e.g., creator reach drop >1% with p<0.05).
    • Include how you will prevent label leakage in feeds and notifications.
  7. Beyond experimentation
    • If randomization is infeasible in some markets, provide a quasi-experimental backup (synthetic control or staggered DID) and specify exact covariates required from logs.
    • Conclude with a product recommendation if engagement dips short-term but abuse reports drop materially.
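For the back-of-envelope calculation requested in item 3, one possible starting point is the standard two-sample formula inflated by a design effect. The sketch below is illustrative only and rests on assumptions that are not part of the prompt: comments_per_view is treated as a per-unit proportion with binomial variance scaled by a hypothetical overdispersion factor of 3, clusters are assumed to hold 100 units on average, and the test is two-sided at alpha = 0.05 with 80% power. Only the 0.12 baseline, the 0.5% relative effect, and the ICC of 0.02 come from the question.

```python
import math
from statistics import NormalDist


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation from cluster randomization: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def units_per_arm(
    baseline: float,
    rel_effect: float,
    overdispersion: float,
    cluster_size: float,
    icc: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Units needed per arm for a two-sided, two-sample z-test on a mean."""
    # Absolute minimum detectable effect, e.g. 0.12 * 0.005 = 0.0006.
    delta = baseline * rel_effect
    # Binomial variance scaled by an ASSUMED overdispersion factor.
    variance = overdispersion * baseline * (1.0 - baseline)
    # Sum of the critical values for the significance level and the power.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0) + NormalDist().inv_cdf(power)
    # Sample size per arm if units were independent.
    n_independent = 2.0 * z**2 * variance / delta**2
    # Inflate for within-cluster correlation.
    return math.ceil(n_independent * design_effect(cluster_size, icc))


if __name__ == "__main__":
    # Values from the prompt: baseline 0.12, 0.5% relative change, ICC 0.02.
    # ASSUMED for illustration: overdispersion 3.0, average cluster size 100.
    n = units_per_arm(
        baseline=0.12,
        rel_effect=0.005,
        overdispersion=3.0,
        cluster_size=100,
        icc=0.02,
    )
    print(f"units per arm: {n:,}")
    print(f"clusters per arm: {math.ceil(n / 100):,}")
```

Under these assumed inputs the requirement lands in the tens of millions of units per arm, which is why the choice of cluster size matters: larger clusters reduce cross-treatment contamination but raise the design effect, and variance reduction such as CUPED would lower the variance term.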
