PracHub

Design bot detection and evaluate trade-offs

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in designing operational machine-learning systems for bot detection, covering labeling and weak supervision, temporal and graph-based feature engineering, model choice and calibration, cost-based thresholding, adversarial robustness, and online monitoring and safety nets.

Company: Meta

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite

You must design and evaluate a bot-detection system for comment activity. Address: 1) Labeling strategy with minimal ground truth: propose weak-supervision heuristics, manual review sampling plans, and how you’d de-bias labels given extreme class imbalance (e.g., <0.5% bots). 2) Features across time scales: per-session burstiness, inter-comment intervals, entropy of targets, language signals, graph-based reciprocity; specify which must be real-time vs. batch. 3) Model choice and calibration: compare linear, tree ensembles, and sequence models; how to calibrate posteriors (Platt/Isotonic) and monitor calibration drift. 4) Thresholding by cost: define costs for FP (blocking a human) vs FN (missing a bot) and pick an operating point using precision-recall curves; compute expected blocked-human-minutes at a chosen threshold given example rates. 5) Adversarial robustness: features least gameable, canaries, and drift detection. 6) Online safety net: shadow mode, backfill re-scoring, human review queues. 7) Evaluation: offline PR-AUC/recall@high-precision; online guardrails (reports, creator retention, comment latency). 8) If FP becomes high, trace the root cause with error analysis and propose a rollback/ramp strategy.

Oct 13, 2025, 9:49 PM

Bot-Detection System Design for Comment Activity

Context

You are designing and evaluating a machine learning system to detect automated (bot) comment activity on a large-scale social platform. Bots are rare (e.g., <0.5% of comments) and adversarial. Your solution should balance safety (blocking bots) and user experience (minimizing false positives on humans), and it must work both offline and online.

Tasks

  1. Labeling strategy with minimal ground truth
  • Propose weak-supervision heuristics.
  • Define a manual review sampling plan.
  • Explain how to de-bias labels given extreme class imbalance (<0.5% bots).
  2. Features across time scales
  • Specify features: session burstiness, inter-comment intervals, entropy/diversity of targets, language signals, graph-based reciprocity, account/device/network signals.
  • Indicate which features must be real-time vs. batch.
  3. Model choice and calibration
  • Compare linear models, tree ensembles, and sequence models.
  • Describe how to calibrate posteriors (Platt scaling, isotonic regression) and how to monitor calibration drift.
  4. Thresholding by cost
  • Define costs for FP (blocking a human) vs. FN (missing a bot).
  • Choose an operating point using precision–recall curves.
  • Compute expected blocked-human-minutes at a chosen threshold given example rates.
  5. Adversarial robustness
  • Identify the least gameable features, and propose canaries and drift detection.
  6. Online safety net
  • Outline shadow mode, backfill re-scoring, and human review queues.
  7. Evaluation
  • Offline: PR-AUC, recall at high precision, slice analysis.
  • Online guardrails: abuse reports, creator retention, comment latency.
  8. If the false-positive rate becomes high
  • Trace the root cause with error analysis and propose a rollback/ramp strategy.
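
The arithmetic behind the thresholding-by-cost task can be sketched as follows. This is a minimal illustration, not a reference answer: the candidate operating points, the 10:1 FP-to-FN cost ratio, the daily comment volume, and the 30-minute block duration are all made-up example values, and the function names are hypothetical. Only the 0.5% bot rate comes from the prompt. The sketch derives the false-positive rate per comment from precision, recall, and the bot base rate, picks the operating point with the lowest expected cost, and converts that point's false positives into blocked-human-minutes per day.

```python
def rates_at_operating_point(precision, recall, bot_rate):
    """Return (fp, fn) as fractions of all comments at one PR-curve point."""
    tp = recall * bot_rate                     # bots correctly flagged
    fn = (1.0 - recall) * bot_rate             # bots missed
    # precision = tp / (tp + fp)  =>  fp = tp * (1 - precision) / precision
    fp = tp * (1.0 - precision) / precision    # humans wrongly flagged
    return fp, fn


def expected_cost(precision, recall, bot_rate, cost_fp, cost_fn):
    """Expected cost per comment under linear FP/FN costs."""
    fp, fn = rates_at_operating_point(precision, recall, bot_rate)
    return cost_fp * fp + cost_fn * fn


# Assumed example values (hypothetical, for illustration only).
BOT_RATE = 0.005               # <0.5% bots, per the prompt
COST_FP, COST_FN = 10.0, 1.0   # blocking a human assumed 10x worse than a miss
COMMENTS_PER_DAY = 10_000_000
MINUTES_PER_BLOCK = 30         # minutes a wrongly blocked human loses

# Candidate (precision, recall) points read off a hypothetical PR curve.
candidates = [(0.99, 0.60), (0.95, 0.80), (0.80, 0.92)]

best = min(
    candidates,
    key=lambda pr: expected_cost(pr[0], pr[1], BOT_RATE, COST_FP, COST_FN),
)

fp_rate, _ = rates_at_operating_point(best[0], best[1], BOT_RATE)
blocked_human_minutes = fp_rate * COMMENTS_PER_DAY * MINUTES_PER_BLOCK

print("chosen (precision, recall):", best)
print("blocked-human-minutes per day:", round(blocked_human_minutes, 1))
```

Because bots are so rare, even a small loss of precision multiplies the number of flagged humans, so under these assumed costs the high-precision point wins. A different cost ratio or block duration would shift the chosen operating point.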
