Design and critique an abuse-detection ML system
Company: Google
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Onsite
Describe, in depth, how you would design an ML system to identify and triage abusive content uploads in a Trust & Safety context where only 0.2% of items are true violations and labels arrive with a median delay of 36 hours.
Cover the following, with concrete choices and trade-offs:
1) Problem framing: binary classification vs risk scoring; objective appropriate for extreme class imbalance (e.g., optimize PR-AUC or cost-weighted utility). Define the positive label precisely given noisy moderation decisions (e.g., prioritize the latest decision within 7 days; handle conflicting labels); a label-resolution sketch follows this list.
2) Data and features: architecture for near-real-time features (user history aggregates, text/image embeddings, graph signals), leakage audits, and privacy constraints (minimize PII retention; differential privacy or k-anonymity where needed); a streaming-aggregate sketch follows this list.
3) Training: sampling/weighting strategy (e.g., class weights, focal loss, hard negative mining), handling delayed labels (label lag queues, exclusion windows), and calibration (isotonic/Platt; a calibration sketch follows this list). Specify a validation split that respects time order and prevents user-level leakage.
4) Thresholding under review budget: given 2,000,000 daily items and a human review budget of 10,000/day, describe exactly how you would pick a threshold t on calibrated scores to maximize expected true violations sent to review subject to the budget. Include the computation using score quantiles from a recent labeled window (a worked sketch follows this list) and how you would re-tune t as demand drifts.
5) Online evaluation: guardrail metrics (latency, false positive rate on high-trust creators, geographic fairness), interleaving/canary design, and how you would measure lift vs baseline heuristics with delayed ground truth.
6) Robustness and drift: detection of covariate/label drift (a PSI sketch follows this list), adversarial adaptation signals, periodic re-training policy, and fail-safe degradations during P0 incidents.
7) Ethics and fairness: define and monitor group-specific error rates; propose mitigation (re-weighting, post-hoc calibration) if disparities exceed thresholds (a group-FPR sketch follows this list). Explain escalation when utility vs fairness trade-offs conflict.
8) Post-launch monitoring: dashboards, alert thresholds, and a rollback plan tied to concrete SLAs.
For each section, provide at least one specific metric and a crisp decision rule you would actually use in production.
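Label-resolution sketch for item 1: a minimal Python/pandas illustration of the "latest decision within 7 days" rule. The schema (item_id, upload_at, decided_at, is_violation) is assumed for illustration, not taken from any specific pipeline.

    import pandas as pd

    def resolve_labels(decisions: pd.DataFrame) -> pd.DataFrame:
        # Assumed columns: item_id, upload_at, decided_at (timestamps),
        # is_violation (bool). Keep only decisions made within 7 days of
        # upload, then take the latest decision per item so that later
        # moderator reversals override earlier, conflicting ones.
        in_window = decisions[
            decisions["decided_at"] <= decisions["upload_at"] + pd.Timedelta(days=7)
        ]
        latest = in_window.sort_values("decided_at").groupby("item_id").tail(1)
        return latest[["item_id", "is_violation"]]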
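Feature sketch for item 2: one way (among many) to keep a near-real-time user-history aggregate cheap is an exponentially decayed violation count updated per event. The 72-hour half-life is an assumed tuning choice.

    HALF_LIFE_HOURS = 72.0  # assumed decay horizon, tuned offline

    def decayed_violation_count(prev_count: float, hours_since_update: float,
                                new_violation: bool) -> float:
        # Decay the stored count by elapsed time, then increment; O(1) per
        # event and maintainable in a key-value store keyed by user_id.
        decay = 0.5 ** (hours_since_update / HALF_LIFE_HOURS)
        return prev_count * decay + (1.0 if new_violation else 0.0)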
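Calibration sketch for item 3, assuming scikit-learn is available: isotonic regression fit on a time-ordered holdout whose 7-day label windows have fully closed, so delayed labels cannot leak into calibration.

    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    def fit_calibrator(holdout_scores: np.ndarray,
                       holdout_labels: np.ndarray) -> IsotonicRegression:
        # Monotone map from raw scores to calibrated violation probabilities;
        # out_of_bounds="clip" keeps future scores inside the fitted range.
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(holdout_scores, holdout_labels)
        return iso

Calibrated probabilities are what make the budgeted thresholding in item 4 meaningful, since expected-violation counts are then just sums of scores.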
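Thresholding sketch for item 4. With N = 2,000,000 daily items and a budget B = 10,000 reviews, sending exactly the top B items corresponds to the (1 - B/N) = 0.995 quantile of the calibrated score distribution over a recent window; re-running this daily on a sliding window is the re-tuning loop.

    import numpy as np

    DAILY_ITEMS = 2_000_000
    REVIEW_BUDGET = 10_000

    def pick_threshold(recent_scores: np.ndarray) -> float:
        # recent_scores: calibrated scores from a recent labeled window,
        # assumed representative of tomorrow's traffic. Items scoring >= t
        # go to review; the (1 - B/N) quantile admits B items in expectation.
        q = 1.0 - REVIEW_BUDGET / DAILY_ITEMS  # 0.995 here
        return float(np.quantile(recent_scores, q))

Because ranking by calibrated probability maximizes expected true violations under a fixed review count, the quantile rule is budget-optimal; as volume drifts, the quantile (not the stored t) is what stays fixed.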
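Drift sketch for item 6: Population Stability Index on score deciles is a standard covariate-drift signal. The 0.2 alert level is a common rule of thumb, assumed here rather than prescribed.

    import numpy as np

    def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
        # Bin edges from reference-score quantiles; extremes widened to
        # cover out-of-range current scores. Smoothing avoids log(0) on
        # empty bins.
        edges = np.quantile(reference, np.linspace(0.0, 1.0, bins + 1))
        edges[0], edges[-1] = -np.inf, np.inf
        ref = np.histogram(reference, edges)[0] / len(reference) + 1e-6
        cur = np.histogram(current, edges)[0] / len(current) + 1e-6
        return float(np.sum((cur - ref) * np.log(cur / ref)))

    # Example decision rule: alert and trigger a re-training review if
    # psi(last_week_scores, today_scores) > 0.2.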
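Fairness sketch for item 7: group-level false positive rate among resolved negatives, with an assumed 2-percentage-point gap as the alerting threshold; column names are illustrative.

    import pandas as pd

    FPR_GAP_LIMIT = 0.02  # assumed maximum tolerated FPR gap across groups

    def group_fpr(df: pd.DataFrame) -> pd.Series:
        # Assumed columns: group (e.g., region), flagged (bool, sent to
        # review), is_violation (bool, resolved label). FPR per group is
        # the flag rate among true negatives.
        negatives = df[~df["is_violation"]]
        return negatives.groupby("group")["flagged"].mean()

    def fairness_alert(df: pd.DataFrame) -> bool:
        rates = group_fpr(df)
        return bool(rates.max() - rates.min() > FPR_GAP_LIMIT)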
Quick Answer: This question evaluates system-design and production ML competencies: large-scale classification versus risk scoring, extreme class imbalance and delayed labels, calibration and thresholding under a fixed human-review budget, near-real-time feature engineering, robustness and drift detection, and privacy and fairness trade-offs. It is commonly asked of Data Scientist candidates in the Machine Learning (Trust & Safety) domain to test whether they can balance statistical objectives, operational constraints, and ethical considerations, at a level of abstraction spanning both conceptual understanding and practical application in production systems.