Compute fraud probabilities with Bayes and Binomial


This question evaluates a candidate's understanding of probabilistic modeling and statistical decision-making, focusing on the Binomial distribution for session-level events and Bayes' theorem for posterior probabilities in a fraud-detection setting.


Question

Fake-Account Detection with Binomial Sessions and Bayes Updating

You are evaluating a rules-based detector for fake accounts on an online platform. Each account had n = 5 independent sessions last week. In each session, a "suspicious action" happens with probability p_F = 0.5 if the account is fake and p_A = 0.05 if authentic. The detector flags an account if it has at least k suspicious sessions. The prior fake rate is 3%.
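
The flagging rule described above reduces to a Binomial upper-tail probability, P(X ≥ k) with X ~ Binomial(n, p), evaluated once per account type. A minimal sketch in Python using only the standard library; the function name `flag_probability` and the constant names are illustrative and not part of the original problem:

```python
from math import comb

# Parameters taken from the problem statement.
N_SESSIONS = 5
P_FAKE = 0.5    # per-session suspicious-action probability, fake account
P_AUTH = 0.05   # per-session suspicious-action probability, authentic account


def flag_probability(n: int, p: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), i.e. the chance of a flag."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Flag probability for each threshold k, as (given fake, given authentic).
tail_table = {
    k: (
        flag_probability(N_SESSIONS, P_FAKE, k),
        flag_probability(N_SESSIONS, P_AUTH, k),
    )
    for k in range(1, N_SESSIONS + 1)
}
```

Each entry of `tail_table` pairs the rule's true positive rate with its false positive rate at that threshold, which is the input every later part of the question builds on.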

Assumptions:

Sessions are independent given account type (fake vs authentic).
In part (c), the manual reviewer’s decision is conditionally independent of the rule’s flag given the true label, and review is applied only to flagged accounts.
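
Under the second assumption, the two-stage process in part (c) factorizes: an account is actioned only if the rule flags it and the reviewer says "fake", and conditional independence lets the two probabilities multiply. A sketch of that step, where the subscript "overall" is notation introduced here for illustration:

```latex
\begin{aligned}
\mathrm{TPR}_{\text{overall}}
  &= P(\text{flag} \mid \text{fake}) \cdot P(\text{reviewer says fake} \mid \text{fake})
   = \mathrm{TPR}_{\text{rule}} \cdot \text{sensitivity}, \\
\mathrm{FPR}_{\text{overall}}
  &= P(\text{flag} \mid \text{authentic}) \cdot P(\text{reviewer says fake} \mid \text{authentic})
   = \mathrm{FPR}_{\text{rule}} \cdot (1 - \text{specificity}).
\end{aligned}
```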

Answer the following:

(a) For k = 2, compute TPR = P(flag | fake) and FPR = P(flag | authentic) using the Binomial distribution. Show formulas and numeric values.

(b) Using Bayes’ Theorem, compute PPV = P(fake | flag) and NPV = P(authentic | not flagged) for k = 2.

(c) Now a manual review is applied only to flagged accounts. The reviewer, acting independently of the rule given the true label, has sensitivity 0.90 and specificity 0.98. An account is actioned only if both the rule flags it and the reviewer says “fake.” Compute the new overall TPR and FPR, and the revised PPV.

(d) For a population of 1,000,000 accounts, compute expected counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) under the process in (c).

(e) For k ∈ {1, 2, 3, 4, 5}, which k maximizes the F1 score under the 3% prior above, without the manual review step? Outline the computation and provide the numeric choice. Discuss how the optimal k would change if the base fake rate rose to 10%.

(f) Identify which errors in (a)–(e) correspond to Type I vs. Type II errors in this context.
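
Parts (a) through (e) can be checked numerically. The following is a sketch under the stated assumptions, not an official solution; all function and variable names are illustrative, and F1 is computed from population fractions as 2·TP / (2·TP + FP + FN).

```python
from math import comb

# Parameters from the problem statement.
N_SESSIONS, P_FAKE, P_AUTH = 5, 0.5, 0.05
REVIEW_SENS, REVIEW_SPEC = 0.90, 0.98


def tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def rule_rates(k):
    """(TPR, FPR) of the rule 'flag if at least k suspicious sessions'."""
    return tail(N_SESSIONS, P_FAKE, k), tail(N_SESSIONS, P_AUTH, k)


def ppv(prior, tpr, fpr):
    """P(fake | positive) via Bayes' theorem."""
    return prior * tpr / (prior * tpr + (1 - prior) * fpr)


def npv(prior, tpr, fpr):
    """P(authentic | negative) via Bayes' theorem."""
    tn = (1 - prior) * (1 - fpr)
    return tn / (tn + prior * (1 - tpr))


def f1(prior, tpr, fpr):
    """F1 score from population fractions of TP, FP, and FN."""
    tp, fp, fn = prior * tpr, (1 - prior) * fpr, prior * (1 - tpr)
    return 2 * tp / (2 * tp + fp + fn)


def best_k(prior):
    """Threshold k in 1..n that maximizes F1 for the rule alone."""
    return max(range(1, N_SESSIONS + 1), key=lambda k: f1(prior, *rule_rates(k)))


# (a) and (b): rule alone, k = 2, prior 3%.
tpr2, fpr2 = rule_rates(2)
ppv2, npv2 = ppv(0.03, tpr2, fpr2), npv(0.03, tpr2, fpr2)

# (c): rule followed by manual review of flagged accounts.
tpr_c = tpr2 * REVIEW_SENS
fpr_c = fpr2 * (1 - REVIEW_SPEC)
ppv_c = ppv(0.03, tpr_c, fpr_c)

# (d): expected counts in a population of 1,000,000 accounts.
n_fake = 1_000_000 * 0.03
n_auth = 1_000_000 - n_fake
counts = {
    "TP": n_fake * tpr_c,
    "FN": n_fake * (1 - tpr_c),
    "FP": n_auth * fpr_c,
    "TN": n_auth * (1 - fpr_c),
}

# (e): F1-optimal threshold at 3% and 10% base rates.
k_at_3pct, k_at_10pct = best_k(0.03), best_k(0.10)
```

With these inputs the sketch gives TPR = 0.8125 and FPR ≈ 0.0226 for k = 2, with PPV ≈ 0.527 and NPV ≈ 0.994. After review, TPR ≈ 0.731, FPR ≈ 0.00045, and PPV ≈ 0.980, which corresponds to roughly 21,938 TP, 8,062 FN, 438 FP, and 969,562 TN per million accounts. F1 peaks at k = 3 under the 3% prior (≈ 0.650, against ≈ 0.639 for k = 2) and shifts to k = 2 at a 10% base rate (≈ 0.806), since a higher prevalence makes missed fakes costlier relative to false alarms. For (f), if the null hypothesis is taken to be "the account is authentic", false positives are Type I errors and false negatives are Type II errors.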
