
Choose metrics for fake-user classifier

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to select and interpret evaluation metrics for severely imbalanced classification problems, perform thresholding and probability calibration, and quantify capacity- and cost-constrained trade-offs between precision and recall.


Company: Meta

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Onsite



Oct 13, 2025, 9:49 PM

Classifying Fake Accounts: Metrics, Capacity, Thresholding, and Validation

Context

  • Population: 10,000,000 daily active users (DAU)
  • True fake rate (prevalence): ≈ 1% ⇒ ~100,000 fakes/day
  • Human review capacity: 50,000 accounts/day
  • Two candidate models at chosen thresholds τA and τB (from validation):
    • Model A: precision = 0.60, recall = 0.20
    • Model B: precision = 0.20, recall = 0.80

You will propose evaluation metrics and thresholding strategies under operational constraints and perform the requested calculations.
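The context figures above can be turned into expected daily counts with two identities: TP = recall × (number of fakes) and flagged = TP / precision. The following is a minimal sketch under the assumption that the validation precision and recall carry over unchanged to the full population; the function name `daily_counts` and the variable names are illustrative, not part of the question.

```python
# Sketch: convert precision/recall at a fixed threshold into expected daily counts.
# Assumes validation precision and recall hold on the full daily population.
DAU = 10_000_000        # daily active users (from the context)
PREVALENCE = 0.01       # true fake rate (from the context)
CAPACITY = 50_000       # accounts reviewable per day (from the context)


def daily_counts(precision, recall, population=DAU, prevalence=PREVALENCE):
    """Return expected (TP, FP, flagged) per day for one model."""
    positives = population * prevalence   # about 100,000 fakes per day
    tp = recall * positives               # fakes the model catches
    flagged = tp / precision              # since precision = TP / flagged
    fp = flagged - tp                     # genuine users flagged by mistake
    return tp, fp, flagged


models = {"A": (0.60, 0.20), "B": (0.20, 0.80)}  # (precision, recall)
results = {}
for name, (p, r) in models.items():
    tp, fp, flagged = daily_counts(p, r)
    results[name] = (tp, fp, flagged)
    fits = flagged <= CAPACITY
    print(f"Model {name}: TP={tp:,.0f} FP={fp:,.0f} flagged={flagged:,.0f} fits={fits}")
```

Under these assumptions Model A flags roughly 33,333 accounts per day (about 20,000 TP and 13,333 FP), which fits within the 50,000/day capacity. Model B flags roughly 400,000 (80,000 TP and 320,000 FP), about eight times capacity, so it would need a higher threshold or a top-K cutoff at K = 50,000. The precision Model B achieves at that stricter cutoff cannot be derived from the numbers given and would have to be read off its PR curve.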

Tasks

  1. Offline metrics selection: Explain when to prefer PR-AUC over ROC-AUC. Specify primary offline metrics, including precision@K, recall@K, PR-AUC, calibrated Brier score, and cost-weighted utility. Justify these given severe class imbalance and limited review capacity.
  2. Capacity feasibility: For each model at its given threshold, compute expected true positives (TP) and false positives (FP) per day if applied to the full population. State whether each fits within the 50,000/day review capacity. If not, explain how to set K (top-K review) or raise the threshold to meet capacity while maximizing expected true positives.
  3. Business trade-offs and Fβ: Given costs FP = $2 (review cost) and FN = $100 (missed abuse), select an Fβ score with an appropriate β and justify your choice. Compute the expected daily cost under Model A and Model B at their current thresholds.
  4. Thresholding and calibration: Describe how to choose τ via a precision–recall (PR) curve under the constraints precision ≥ 0.7 or FP ≤ 20,000/day. Explain how you would apply probability calibration (e.g., Platt scaling or isotonic regression) before thresholding and why this matters.
  5. Validation protocol: Describe a time-based cross-validation scheme to avoid leakage, offline-to-online guardrails (e.g., CUPED or an AA-test), and the online metrics you would monitor (e.g., precision among reviewed, review throughput, downstream abuse reduction).
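The cost calculation in task 3 can be sketched as below. This is one possible reading rather than a reference answer: it assumes cost accrues only from false positives ($2 each) and false negatives ($100 each), that reviewing a true positive is free, and that every flagged account is actioned even when the flagged volume exceeds review capacity. The helper names `expected_daily_cost` and `f_beta` are illustrative, and β = 2 is only a placeholder value since the question leaves the choice of β to the candidate.

```python
# Sketch: expected daily cost and F-beta for the two candidate models.
# Assumptions: cost comes only from FP and FN; capacity limits are ignored here.
COST_FP = 2.0             # dollars per false positive (review cost)
COST_FN = 100.0           # dollars per false negative (missed abuse)
FAKES_PER_DAY = 100_000   # 1% of 10,000,000 daily active users


def expected_daily_cost(precision, recall, positives=FAKES_PER_DAY):
    """Expected dollars per day at a fixed operating point."""
    tp = recall * positives
    fp = tp / precision - tp      # flagged minus true positives
    fn = positives - tp           # fakes the model misses
    return COST_FP * fp + COST_FN * fn


def f_beta(precision, recall, beta):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


cost_a = expected_daily_cost(0.60, 0.20)
cost_b = expected_daily_cost(0.20, 0.80)
print(f"Model A: cost=${cost_a:,.0f}/day  F2={f_beta(0.60, 0.20, 2):.3f}")
print(f"Model B: cost=${cost_b:,.0f}/day  F2={f_beta(0.20, 0.80, 2):.3f}")
```

With these assumptions Model A costs about $8.03M per day (dominated by 80,000 missed fakes) and Model B about $2.64M per day, so the 50:1 cost ratio favors the high-recall model despite its much larger false-positive volume. Because Model B's flagged volume exceeds review capacity, its figure is an upper bound on what is achievable in practice until the threshold is adjusted.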
