Bayesian inference for abuse detection with error control
Setup
A platform runs a binary classifier that flags users who might be bad actors. Let:
- p = prior probability a random user is a bad actor (prevalence)
- TPR (recall) = P(flag | bad)
- FPR = P(flag | good)
Tasks
- Derive the posterior probability that a flagged user is truly a bad actor, P(bad | flag), in terms of p, TPR, and FPR (a derivation sketch follows the hints below).
- Define Type I and Type II errors in this context and explain their business impact.
- If 1% of 10,000,000 users are truly bad and the classifier has 95% recall and a 2% false-positive rate, compute (a numeric sketch follows the hints below):
  - The expected number of bad actors caught (true positives).
  - The expected number of good users incorrectly flagged (false positives).
Hints: Apply Bayes’ theorem, build a 2×2 confusion matrix, and compute expected counts from prevalence and error rates.
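A sketch of the derivation the first task asks for, written with the symbols defined in Setup (p, TPR, FPR); the plug-in numbers at the end are taken from the third task.

```latex
% Posterior that a flagged user is truly a bad actor, via Bayes' theorem.
% Symbols follow the Setup: p = prevalence, TPR = P(flag | bad), FPR = P(flag | good).
\begin{align}
  P(\text{bad} \mid \text{flag})
    &= \frac{P(\text{flag} \mid \text{bad})\, P(\text{bad})}{P(\text{flag})} \\
    &= \frac{\mathrm{TPR} \cdot p}{\mathrm{TPR} \cdot p + \mathrm{FPR} \cdot (1 - p)}
\end{align}
% With p = 0.01, TPR = 0.95, FPR = 0.02:
% P(bad | flag) = 0.0095 / (0.0095 + 0.0198) \approx 0.324
```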
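A minimal Python sketch of the hinted 2×2 confusion-matrix calculation for the third task; the variable names (N, p, tpr, fpr) are illustrative, not part of the original problem statement.

```python
# Expected confusion-matrix counts and posterior for the numbers in the third task.
N = 10_000_000   # total users
p = 0.01         # prevalence: fraction of users who are truly bad actors
tpr = 0.95       # recall, P(flag | bad)
fpr = 0.02       # false-positive rate, P(flag | good)

bad = N * p                  # 100,000 truly bad users
good = N - bad               # 9,900,000 good users

tp = bad * tpr               # expected bad actors caught       -> 95,000
fn = bad * (1 - tpr)         # expected bad actors missed       -> 5,000
fp = good * fpr              # expected good users flagged      -> 198,000
tn = good * (1 - fpr)        # expected good users left alone   -> 9,702,000

posterior = tp / (tp + fp)   # P(bad | flag) ~ 0.324, matching the Bayes formula

print(f"TP={tp:,.0f}  FP={fp:,.0f}  FN={fn:,.0f}  TN={tn:,.0f}")
print(f"P(bad | flag) = {posterior:.3f}")
```

Note that even with 95% recall, only about a third of flagged users are truly bad, because false positives from the much larger good-user population (198,000) outnumber the true positives (95,000).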