Measuring Abuse in Friend-Requests: Bayes, Identification, and Precision
Scenario
A social-network platform wants to measure and control abuse. Five percent of users are classified as "bad" and, on average, each bad user sends 10× as many friend-requests as a good user.
Tasks
- Compute the probability that a randomly selected friend-request came from a bad user.
- Using only existing event logs, propose a method to identify the likely bad users.
- With additional features (e.g., request timing, acceptance rate), write an expression for P(good | request features) using Bayes' theorem.
- If you must shrink the confidence interval (CI) of that probability estimate to one-tenth its current width, what changes in data collection or analysis would you make?
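As a sanity check on the Bayes-related tasks, here is a minimal Python sketch. It assumes good users send requests at a baseline rate of 1 and bad users at 10 times that rate, as in the scenario. The function names and the example likelihoods (0.2 and 0.6) are hypothetical and chosen only for illustration; they are not part of the original problem.

```python
def request_share_from_bad(p_bad: float, rate_ratio: float) -> float:
    """P(bad | request) when bad users send `rate_ratio` times as many requests."""
    bad_weight = p_bad * rate_ratio    # relative request volume from bad users
    good_weight = (1.0 - p_bad) * 1.0  # good users define the baseline rate
    return bad_weight / (bad_weight + good_weight)


def posterior_good(prior_good: float, lik_good: float, lik_bad: float) -> float:
    """P(good | x) via Bayes' theorem, given P(x | good) and P(x | bad)."""
    numerator = lik_good * prior_good
    return numerator / (numerator + lik_bad * (1.0 - prior_good))


# Scenario numbers: 5% bad users, each sending 10x as many requests.
p_bad_request = request_share_from_bad(p_bad=0.05, rate_ratio=10.0)
print(round(p_bad_request, 4))  # 0.3448

# Hypothetical feature likelihoods: P(x | good) = 0.2, P(x | bad) = 0.6.
# The prior is taken at the request level, i.e. 1 - P(bad | request).
p_good_given_x = posterior_good(1.0 - p_bad_request, lik_good=0.2, lik_bad=0.6)
```

Note that the prior used for a request-level posterior is the request-level share (about 0.655 good), not the user-level share (0.95 good), because bad users are over-represented among requests.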
Hints
- Apply Bayes' rule and reason about class imbalance and different activity rates.
- Use unsupervised/weakly supervised signals from logs; normalize for exposure (tenure/active days).
- CI width typically shrinks in proportion to 1/sqrt(n), so a tenfold reduction requires roughly 100× the sample size; also consider variance-reduction techniques.
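The 1/sqrt(n) scaling in the last hint can be checked numerically. The sketch below assumes the estimate is a simple proportion with a normal-approximation (Wald) interval and that the variance per observation stays fixed. The sample size `n0`, the point estimate `p_hat`, and the function names are hypothetical values picked for illustration.

```python
import math


def wald_half_width(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation (Wald) CI for a proportion."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def n_for_width_factor(n_current: int, shrink_factor: float) -> int:
    """Sample size needed to shrink the CI width by `shrink_factor`,
    assuming the per-observation variance is unchanged."""
    return math.ceil(n_current * shrink_factor ** 2)


n0 = 10_000    # hypothetical current number of sampled requests
p_hat = 0.345  # hypothetical point estimate
n1 = n_for_width_factor(n0, 10.0)
ratio = wald_half_width(p_hat, n1) / wald_half_width(p_hat, n0)
print(n1, round(ratio, 3))  # 1000000 0.1
```

Because 100× more data is often impractical, variance-reduction approaches such as stratified sampling or regression adjustment may reach the same width with fewer observations; how much they help depends on the data and is not quantified here.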