Scenario
Classifying reviewers as lazy or careful with limited labels
Context (completed)
You are auditing a pool of reviewers who can be either:
- Lazy (L): lower accuracy
- Careful (C): higher accuracy
Assume a known prior mixture π = P(L) and per-review accuracies a_L and a_C with a_C > a_L. For each reviewer, you observe their performance on n gold items (with known ground truth), yielding k correct out of n.
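To make the setup concrete, here is a sketch of the likelihood and posterior. It rests on assumptions not stated above: the n gold items are answered independently, each reviewer's accuracy is constant across items, and 0 < π < 1 with 0 < a_L < a_C < 1. The symbol k* (the cutoff on the number of correct answers) is introduced here for illustration only.

```latex
\begin{align*}
P(k \mid L) &= \binom{n}{k} a_L^{k} (1-a_L)^{n-k}, \qquad
P(k \mid C) = \binom{n}{k} a_C^{k} (1-a_C)^{n-k}, \\[4pt]
P(L \mid k) &= \frac{\pi\, a_L^{k} (1-a_L)^{n-k}}
                   {\pi\, a_L^{k} (1-a_L)^{n-k} + (1-\pi)\, a_C^{k} (1-a_C)^{n-k}}, \\[4pt]
P(L \mid k) > \tfrac{1}{2}
  &\iff \log\frac{\pi}{1-\pi} + k \log\frac{a_L}{a_C} + (n-k)\log\frac{1-a_L}{1-a_C} > 0 \\[4pt]
  &\iff k < k^{*} = \frac{\log\frac{\pi}{1-\pi} + n \log\frac{1-a_L}{1-a_C}}
                         {\log\frac{a_C (1-a_L)}{a_L (1-a_C)}}.
\end{align*}
```

The binomial coefficient cancels in the posterior, and the denominator of k* is positive because a_C > a_L, so under these assumptions the rule reduces to a threshold on k: few correct answers point to "lazy".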
Question
- Use Bayes' theorem to propose the classification rule that predicts "lazy" when P(L | data) > 0.5.
- Given the true mixture and review accuracies, derive the false-positive and false-negative rates of this rule.
- If every reviewer is required to write the same large number of reviews (e.g., 100), how will Type I and Type II error rates change?
Hints: Treat reviewer type as the latent class and use the Bayes-optimal decision boundary; for fixed a_L and a_C, both error rates shrink toward zero as the number of reviews per reviewer grows.
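The hints can be checked numerically with a minimal Python sketch. It assumes independent gold items (so k is binomial) and treats "lazy" as the positive class, which is a convention chosen here and not fixed by the problem: a false positive is a careful reviewer flagged as lazy, and a false negative is a lazy reviewer who passes as careful. The function names and the values π = 0.3, a_L = 0.7, a_C = 0.9 are hypothetical, chosen only for illustration.

```python
from math import comb, log


def binom_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def predicts_lazy(k, n, pi, a_L, a_C):
    """Bayes rule: predict 'lazy' when the posterior log-odds of L vs. C exceed 0.

    This is equivalent to P(L | k) > 0.5; the binomial coefficient cancels.
    """
    log_odds = (
        log(pi) - log(1 - pi)
        + k * (log(a_L) - log(a_C))
        + (n - k) * (log(1 - a_L) - log(1 - a_C))
    )
    return log_odds > 0


def error_rates(n, pi, a_L, a_C):
    """Return (false-positive rate, false-negative rate), with 'lazy' as positive."""
    # False positive: a careful reviewer's k falls in the 'lazy' region.
    fpr = sum(
        binom_pmf(k, n, a_C)
        for k in range(n + 1)
        if predicts_lazy(k, n, pi, a_L, a_C)
    )
    # False negative: a lazy reviewer's k falls in the 'careful' region.
    fnr = sum(
        binom_pmf(k, n, a_L)
        for k in range(n + 1)
        if not predicts_lazy(k, n, pi, a_L, a_C)
    )
    return fpr, fnr


# Hypothetical illustration values (not taken from the problem statement).
PI, A_L, A_C = 0.3, 0.7, 0.9
for n in (10, 100):
    fpr, fnr = error_rates(n, PI, A_L, A_C)
    print(f"n={n:3d}  FPR={fpr:.4f}  FNR={fnr:.4f}")
```

Summing the binomial mass over the decision regions avoids rounding the threshold by hand. With these illustrative values both rates are smaller at n = 100 than at n = 10, consistent with the hint.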