##### Scenario

Classifying reviewers as lazy or careful with limited labels

##### Question

Propose a classification rule based on P(lazy | data) > 0.5 using Bayes’ theorem. Given the true mixture and review accuracies, derive the false-positive and false-negative rates of this rule. If every reviewer is required to write the same large number of reviews (e.g., 100), how will type I and type II error rates change?

##### Hints

Treat reviewer type as the latent class and use a Bayesian optimal decision boundary; error rates shrink as review count grows.

Meta machine learning and Bayesian classification prompt on identifying lazy reviewers from gold-task accuracy, using posterior odds, binomial likelihoods, false-positive and false-negative rates, and large-sample behavior.

What difficulty level is this interview question?

This is a medium difficulty Machine Learning question, commonly asked during Onsite rounds at Meta.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Meta during technical interviews.

Classify Reviewers Using Bayesian Probability for Accuracy Analysis

Classify Reviewers With Bayesian Probability

You are auditing reviewers who may be lazy or careful. Each reviewer completes n gold-standard review tasks with known ground truth, and you observe k correct reviews.

Assume:

P(Lazy) = pi and P(Careful) = 1 - pi.
Lazy reviewers have per-review accuracy a_L.
Careful reviewers have per-review accuracy a_C, where a_C > a_L.
Review outcomes are independent conditional on reviewer type.

Constraints & Assumptions

Use Bayes' theorem to derive the posterior probability of being lazy.
Propose a rule that classifies a reviewer as lazy when P(Lazy | data) > 0.5.
Derive false-positive and false-negative rates under the true model.
Explain how the errors change as each reviewer completes many gold tasks.
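
The derivation asked for above can be sketched as follows. This is one way to write it, assuming 0 < pi < 1 and 0 < a_L < a_C < 1 so that every logarithm is finite; the threshold symbol k^* is introduced here for illustration and is not part of the original prompt. The binomial coefficients cancel in the posterior, so only the powers of the accuracies remain.

```latex
% Assumes 0 < \pi < 1 and 0 < a_L < a_C < 1 (not stated in the prompt).
% k^{*} is a label introduced here for the decision threshold on k.
\begin{align*}
P(\text{Lazy} \mid K = k)
  &= \frac{\pi\, a_L^{k} (1 - a_L)^{n-k}}
          {\pi\, a_L^{k} (1 - a_L)^{n-k} + (1 - \pi)\, a_C^{k} (1 - a_C)^{n-k}} \\[4pt]
\frac{P(\text{Lazy} \mid k)}{P(\text{Careful} \mid k)}
  &= \frac{\pi}{1 - \pi}
     \left(\frac{a_L}{a_C}\right)^{k}
     \left(\frac{1 - a_L}{1 - a_C}\right)^{n-k} \\[4pt]
P(\text{Lazy} \mid k) > 0.5
  &\iff k < k^{*}
   = \frac{\log\frac{\pi}{1-\pi} + n \log\frac{1 - a_L}{1 - a_C}}
          {\log\frac{a_C (1 - a_L)}{a_L (1 - a_C)}} \\[4pt]
\text{FPR} &= P(K < k^{*} \mid \text{Careful})
            = \sum_{k < k^{*}} \binom{n}{k} a_C^{k} (1 - a_C)^{n-k} \\
\text{FNR} &= P(K \ge k^{*} \mid \text{Lazy})
            = \sum_{k \ge k^{*}} \binom{n}{k} a_L^{k} (1 - a_L)^{n-k}
\end{align*}
```

The sums run over integers k in 0, ..., n. The inequality flips to k < k^* because the coefficient of k in the log posterior odds is negative when a_C > a_L.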

Clarifying Questions to Ask

Are pi, a_L, and a_C known, estimated, or uncertain?
Are the gold tasks representative of real review difficulty?
Are the costs of false positives and false negatives equal?
Is n the same for every reviewer?

What a Strong Answer Covers

Model K | Lazy ~ Binomial(n, a_L) and K | Careful ~ Binomial(n, a_C).
Posterior odds equal prior odds times the likelihood ratio.
Classify as lazy when posterior odds exceed 1, equivalently when the log-likelihood ratio exceeds the negative log prior odds.
Because a_C > a_L, lower values of k are stronger evidence of being lazy, so the rule reduces to a cutoff on k.
False-positive rate: P(classify Lazy | Careful), computed over the binomial distribution under a_C.
False-negative rate: P(classify Careful | Lazy), computed over the binomial distribution under a_L.
With large n, the two binomial distributions separate, so both Type I and Type II error rates tend to zero as n grows, provided the model assumptions hold.
Practical caveats around task difficulty, correlated errors, estimated parameters, calibration, and unequal costs.
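
The points above can be checked numerically. The following is a minimal sketch, not a reference solution: it applies the posterior-odds rule to every possible k and sums the exact binomial probabilities to get both error rates. The function names and the parameter values (pi = 0.2, a_L = 0.6, a_C = 0.9) are hypothetical choices for illustration only; the prompt does not specify any numbers beyond n = 100.

```python
import math


def log_posterior_odds(k, n, pi, a_lazy, a_careful):
    """Log of P(Lazy | k) / P(Careful | k); binomial coefficients cancel."""
    return (
        math.log(pi / (1 - pi))
        + k * math.log(a_lazy / a_careful)
        + (n - k) * math.log((1 - a_lazy) / (1 - a_careful))
    )


def classify_lazy(k, n, pi, a_lazy, a_careful):
    """P(Lazy | k) > 0.5 is equivalent to positive log posterior odds."""
    return log_posterior_odds(k, n, pi, a_lazy, a_careful) > 0


def binom_pmf(k, n, p):
    """Exact Binomial(n, p) probability of k successes."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def error_rates(n, pi, a_lazy, a_careful):
    """Return (FPR, FNR) of the posterior > 0.5 rule under the true model."""
    fpr = 0.0  # careful reviewer flagged as lazy
    fnr = 0.0  # lazy reviewer passed as careful
    for k in range(n + 1):
        if classify_lazy(k, n, pi, a_lazy, a_careful):
            fpr += binom_pmf(k, n, a_careful)
        else:
            fnr += binom_pmf(k, n, a_lazy)
    return fpr, fnr


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to illustrate the trend in n.
    pi, a_lazy, a_careful = 0.2, 0.6, 0.9
    for n in (10, 25, 100):
        fpr, fnr = error_rates(n, pi, a_lazy, a_careful)
        print(f"n={n:4d}  FPR={fpr:.2e}  FNR={fnr:.2e}")
```

Working with log odds rather than the posterior itself avoids overflow for large n. Because k is an integer, the error rates need not fall monotonically at every step of n, but under these assumed parameters both are far smaller at n = 100 than at n = 10.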

Follow-up Questions

How would you change the rule if false positives are much more costly?
What if reviewer accuracies vary continuously rather than having two types?
How would you estimate a_L and a_C from data?
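
For the first follow-up, one common approach is to replace the 0.5 cutoff with a cost-weighted one: flag a reviewer as lazy only when the expected cost of not flagging exceeds the expected cost of flagging. The sketch below assumes fixed, known costs; `cost_fp`, `cost_fn`, and the example numbers are hypothetical and not part of the original prompt.

```python
import math


def posterior_lazy(k, n, pi, a_lazy, a_careful):
    """P(Lazy | k correct out of n), computed from the log posterior odds."""
    log_odds = (
        math.log(pi / (1 - pi))
        + k * math.log(a_lazy / a_careful)
        + (n - k) * math.log((1 - a_lazy) / (1 - a_careful))
    )
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def cost_threshold(cost_fp, cost_fn):
    """Posterior cutoff that minimizes expected cost; equals 0.5 for equal costs."""
    return cost_fp / (cost_fp + cost_fn)


def classify_lazy(k, n, pi, a_lazy, a_careful, cost_fp=1.0, cost_fn=1.0):
    """Flag as lazy when P(Lazy | k) exceeds the cost-weighted cutoff."""
    posterior = posterior_lazy(k, n, pi, a_lazy, a_careful)
    return posterior > cost_threshold(cost_fp, cost_fn)


if __name__ == "__main__":
    # Hypothetical case: 6 of 10 correct, pi = 0.2, a_L = 0.6, a_C = 0.9.
    args = (6, 10, 0.2, 0.6, 0.9)
    print("posterior:", round(posterior_lazy(*args), 3))
    print("equal costs:", classify_lazy(*args))
    print("FP 9x as costly:", classify_lazy(*args, cost_fp=9.0, cost_fn=1.0))
```

Raising the cost of a false positive raises the posterior cutoff, which lowers the cutoff on k: fewer careful reviewers are flagged, at the price of missing more lazy ones.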

