PracHub

Estimate fake-account prevalence with capture-recapture

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in capture–recapture estimation: estimating population size from incomplete detections, constructing confidence intervals, recognizing bias from detector dependence, and applying post-stratification or model-based adjustments.

Company: Meta

Role: Data Scientist

Category: Statistics & Math

Difficulty: medium

Interview Round: Onsite

Two independent detectors flag suspected fake accounts on a platform of 50M active accounts. In one week, Detector A flags 12,000 accounts and Detector B flags 9,000, with 2,000 flagged by both. (a) Using the Chapman capture–recapture estimator, estimate the total number of fake accounts and a 95% CI, stating the assumptions. (b) How do correlated detectors or coordinated adversaries bias the estimate, and how would you reduce that bias (e.g., stratification by region, account age, or activity band)? (c) If detector recall is higher for high-activity accounts, propose a post-stratified or model-based adjustment and show the updated estimator at a high level.

Oct 13, 2025, 9:49 PM

Capture–Recapture Estimation with Two Detectors

You are evaluating suspected fake accounts on a platform with 50 million active accounts. In one week:

  • Detector A flags n_A = 12,000 accounts.
  • Detector B flags n_B = 9,000 accounts.
  • Overlap (flagged by both) is m = 2,000 accounts.

Answer the following:

(a) Using the Chapman capture–recapture estimator, estimate the total number of fake accounts and a 95% confidence interval. Clearly state the assumptions required.

(b) Explain how correlated detectors (positively or negatively) or coordinated adversaries would bias the estimate. Propose practical steps to reduce bias (e.g., stratifying by region, account age, or activity band).

(c) Suppose detector recall is higher for high-activity accounts. Propose a post-stratified or model-based adjustment and give the updated estimator at a high level.
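As a rough illustration of parts (a) and (c), the sketch below computes the Chapman estimate, N = (n_A + 1)(n_B + 1)/(m + 1) - 1, with a normal-approximation 95% interval based on the commonly used Seber variance formula, and then sums per-stratum Chapman estimates as one possible post-stratified adjustment. It is not the site's official solution. It assumes a closed population during the week, independent detectors, equal detectability within each group, exact matching of accounts across detectors, and that every flagged account is truly fake. The function names and the per-stratum counts are hypothetical, chosen only so that they add up to the totals given in the question.

```python
import math


def chapman(n_a, n_b, m, z=1.96):
    """Chapman estimate of population size, its standard error, and a
    normal-approximation confidence interval (Seber variance formula)."""
    n_hat = (n_a + 1) * (n_b + 1) / (m + 1) - 1
    var = ((n_a + 1) * (n_b + 1) * (n_a - m) * (n_b - m)) / (
        (m + 1) ** 2 * (m + 2)
    )
    se = math.sqrt(var)
    return n_hat, se, (n_hat - z * se, n_hat + z * se)


def stratified_chapman(strata, z=1.96):
    """Sum per-stratum Chapman estimates; variances add if strata are
    treated as independent. `strata` is a list of (n_a, n_b, m) tuples."""
    total, var_total = 0.0, 0.0
    for n_a, n_b, m in strata:
        n_hat, se, _ = chapman(n_a, n_b, m, z)
        total += n_hat
        var_total += se ** 2
    se_total = math.sqrt(var_total)
    return total, se_total, (total - z * se_total, total + z * se_total)


# Part (a): pooled estimate from the counts stated in the question.
N_ACTIVE = 50_000_000
n_hat, se, (ci_low, ci_high) = chapman(12_000, 9_000, 2_000)
prevalence = n_hat / N_ACTIVE
print(f"Pooled: {n_hat:,.0f} (SE {se:,.0f}), 95% CI {ci_low:,.0f} to {ci_high:,.0f}")
print(f"Prevalence: {prevalence:.4%}")

# Part (c): HYPOTHETICAL split by activity band (not given in the question).
# The counts sum to the pooled totals: 12,000 / 9,000 / 2,000.
strata = [
    (7_000, 5_000, 1_500),  # high-activity accounts: larger overlap share
    (5_000, 4_000, 500),    # low-activity accounts: smaller overlap share
]
strat_total, strat_se, strat_ci = stratified_chapman(strata)
print(f"Stratified: {strat_total:,.0f} (SE {strat_se:,.0f})")
```

With the question's counts, the pooled estimate is about 54,000 fake accounts, with a 95% interval of roughly 52,100 to 55,900, or about 0.11% of 50M active accounts. Under the hypothetical split, the stratified total comes out higher than the pooled one. This is the expected direction when detection probability varies across groups: heterogeneity inflates the pooled overlap, which pushes the pooled estimate down.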
