PracHub

Compute fraud probabilities with Bayes and Binomial

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's understanding of probabilistic modeling and statistical decision-making, focusing on the Binomial distribution for session-level events and Bayes' theorem for posterior probabilities in a fraud-detection setting.

  • medium
  • Meta
  • Statistics & Math
  • Data Scientist


Company: Meta

Role: Data Scientist

Category: Statistics & Math

Difficulty: medium

Interview Round: Onsite



Oct 13, 2025, 9:49 PM

Fake-Account Detection with Binomial Sessions and Bayes Updating

You are evaluating a rules-based detector for fake accounts on an online platform. Each account had n = 5 independent sessions last week. In each session, a "suspicious action" happens with probability p_F = 0.5 if the account is fake and p_A = 0.05 if authentic. The detector flags an account if it has at least k suspicious sessions. The prior fake rate is 3%.
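The flag probability described above is an upper-tail Binomial probability. A minimal Python sketch follows, assuming sessions are independent given the account type; the helper name `binom_tail` and the constant names are illustrative and do not come from the question.

```python
from math import comb


def binom_tail(n: int, p: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Parameters taken from the problem statement.
N_SESSIONS = 5
P_FAKE = 0.5   # per-session suspicious-action probability for fake accounts
P_AUTH = 0.05  # per-session suspicious-action probability for authentic accounts

# Flag probability under each account type for every possible threshold k.
for k in range(1, N_SESSIONS + 1):
    tpr = binom_tail(N_SESSIONS, P_FAKE, k)
    fpr = binom_tail(N_SESSIONS, P_AUTH, k)
    print(f"k={k}: P(flag | fake)={tpr:.6f}, P(flag | authentic)={fpr:.8f}")
```

For k = 2 this gives P(flag | fake) = 26/32 = 0.8125 and P(flag | authentic) ≈ 0.0225925.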

Assumptions:

  • Sessions are independent given account type (fake vs authentic).
  • In part (c), the manual reviewer’s decision is independent of the rule conditional on the true label and is only applied to flagged accounts.
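Under the second assumption, the two-stage rates factor into products. As a sketch, write F for "the account is fake", R for "the rule flags the account", and M for "the reviewer says fake"; these symbols are introduced here for illustration and do not appear in the question.

```latex
\begin{align*}
P(R \cap M \mid F)      &= P(R \mid F)\,P(M \mid F)
                         = \mathrm{TPR}_{\text{rule}} \times 0.90, \\
P(R \cap M \mid \neg F) &= P(R \mid \neg F)\,P(M \mid \neg F)
                         = \mathrm{FPR}_{\text{rule}} \times (1 - 0.98).
\end{align*}
```

Because an account is actioned only when both R and M occur, these two products are the overall TPR and FPR of the combined process.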

Answer the following:

(a) For k = 2, compute TPR = P(flag | fake) and FPR = P(flag | authentic) using the Binomial distribution. Show formulas and numeric values.

(b) Using Bayes’ Theorem, compute PPV = P(fake | flag) and NPV = P(authentic | not flagged) for k = 2.

(c) Now a manual review is applied only to flagged accounts. The reviewer has sensitivity 0.90 and specificity 0.98, independent of the rule given the true label. An account is actioned only if both the rule flags it and the reviewer says “fake.” Compute the new overall TPR and FPR, and the revised PPV.

(d) For a population of 1,000,000 accounts, compute expected counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) under the process in (c).

(e) For k ∈ {1, 2, 3, 4, 5}, which k maximizes the F1 score at the 3% prior above, without the manual review step? Outline the computation and provide the numeric choice. Discuss how the optimal k would change if the base fake rate rose to 10%.

(f) Identify which errors in (a)–(e) correspond to Type I vs. Type II errors in this context.
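The quantities in parts (a) through (d) can be checked numerically. The following is a hedged sketch rather than an official solution: it assumes the conditional-independence assumptions stated above, and all variable names are illustrative.

```python
from math import comb


def binom_tail(n, p, k):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Parameters from the problem statement.
PRIOR = 0.03               # P(fake)
N, P_F, P_A, K = 5, 0.5, 0.05, 2
SENS, SPEC = 0.90, 0.98    # manual reviewer
POP = 1_000_000

# (a) Rule-only rates.
tpr = binom_tail(N, P_F, K)
fpr = binom_tail(N, P_A, K)

# (b) Bayes' theorem for the rule alone.
ppv = PRIOR * tpr / (PRIOR * tpr + (1 - PRIOR) * fpr)
npv = (1 - PRIOR) * (1 - fpr) / ((1 - PRIOR) * (1 - fpr) + PRIOR * (1 - tpr))

# (c) Rule AND reviewer: rates multiply under conditional independence.
tpr2 = tpr * SENS
fpr2 = fpr * (1 - SPEC)
ppv2 = PRIOR * tpr2 / (PRIOR * tpr2 + (1 - PRIOR) * fpr2)

# (d) Expected confusion-matrix counts for the two-stage process.
n_fake = POP * PRIOR
n_auth = POP * (1 - PRIOR)
tp = n_fake * tpr2
fn = n_fake - tp
fp = n_auth * fpr2
tn = n_auth - fp

print(f"(a) TPR={tpr:.4f}  FPR={fpr:.7f}")
print(f"(b) PPV={ppv:.4f}  NPV={npv:.4f}")
print(f"(c) TPR={tpr2:.5f}  FPR={fpr2:.8f}  PPV={ppv2:.4f}")
print(f"(d) TP={tp:.1f}  FP={fp:.1f}  TN={tn:.1f}  FN={fn:.1f}")
```

Under these assumptions the sketch gives TPR = 0.8125 and FPR ≈ 0.0226 for (a), PPV ≈ 0.5266 and NPV ≈ 0.9941 for (b), and TPR = 0.73125, FPR ≈ 0.00045, PPV ≈ 0.9804 for (c). The expected counts in (d) are roughly TP ≈ 21,937.5, FP ≈ 438.3, TN ≈ 969,561.7, and FN ≈ 8,062.5. For part (f), taking "the account is authentic" as the null hypothesis (a conventional choice that the question does not state), a false positive is a Type I error and a false negative is a Type II error.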

Solution
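One way to approach part (e) is to sweep k and compute F1 = 2·TP / (2·TP + FP + FN) from population fractions. The following is an illustrative sketch, not the site's solution; the function names `f1_for_threshold` and `best_k` are hypothetical.

```python
from math import comb


def binom_tail(n, p, k):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def f1_for_threshold(k, prior, n=5, p_fake=0.5, p_auth=0.05):
    """F1 of the rule 'flag if at least k suspicious sessions', without manual review."""
    tp = prior * binom_tail(n, p_fake, k)        # fraction of population: fake and flagged
    fp = (1 - prior) * binom_tail(n, p_auth, k)  # authentic and flagged
    fn = prior - tp                              # fake and not flagged
    return 2 * tp / (2 * tp + fp + fn)


def best_k(prior, n=5):
    """Threshold in 1..n with the highest F1 at the given prior."""
    return max(range(1, n + 1), key=lambda k: f1_for_threshold(k, prior, n=n))


for prior in (0.03, 0.10):
    scores = {k: round(f1_for_threshold(k, prior), 4) for k in range(1, 6)}
    print(f"prior={prior:.2f}  F1 by k: {scores}  best k={best_k(prior)}")
```

With the stated parameters, k = 3 gives the highest F1 at a 3% prior (about 0.650, against about 0.639 for k = 2), while at a 10% prior the optimum moves to k = 2 (about 0.806). A higher base rate shrinks the false-positive mass relative to true positives, so a more sensitive threshold becomes worthwhile.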

