System Design: Human-in-the-Loop Review Subsystem
Context
You are designing a human-in-the-loop (HITL) review subsystem for a large-scale safety platform that moderates user-generated content (UGC) across text, images, and audio (including live voice). Automated detectors (ML models and rules) generate “detections” with metadata (content IDs, model type, confidence, policy category, timestamps). Some detections require immediate enforcement; others need human review for accuracy, context, or policy interpretation.
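To make the detection metadata concrete, here is a minimal sketch of one possible record shape and routing step. The field names follow the metadata listed above. The `Modality` enum, the `route` helper, the two thresholds, and the `log_only` outcome are illustrative assumptions, not part of the platform described.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Modality(Enum):
    TEXT = "text"
    IMAGE = "image"
    AUDIO = "audio"  # includes live voice


@dataclass(frozen=True)
class Detection:
    """One detector output; fields mirror the metadata named in the context."""
    content_id: str
    modality: Modality
    model_type: str          # e.g. an ML model name or a rule identifier
    confidence: float        # assumed to be normalised to [0, 1]
    policy_category: str
    detected_at: datetime

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


def route(d: Detection, enforce_at: float = 0.98, review_at: float = 0.60) -> str:
    """Hypothetical routing: thresholds are placeholders, not recommended values."""
    if d.confidence >= enforce_at:
        return "enforce"      # immediate enforcement
    if d.confidence >= review_at:
        return "review"       # send to human review
    return "log_only"         # assumed third outcome: record, take no action


example = Detection(
    content_id="c1",
    modality=Modality.AUDIO,
    model_type="toxicity-v1",
    confidence=0.75,
    policy_category="harassment",
    detected_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
decision = route(example)
```

In practice the thresholds would likely vary by policy category and modality; a single pair is used here only to keep the sketch short.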
Requirements
Design and explain the end-to-end HITL subsystem, covering:
- How review tasks are generated from detections (schema, deduplication, aggregation, idempotency, sampling).
- Triage into queues by severity with prioritization and dynamic aging.
- Reviewer UI requirements and ergonomics, including audio-specific needs.
- SLAs/SLOs per queue and breach handling.
- Sampling and double-blind consensus to ensure quality; inter-rater agreement.
- Gold-standard audits (honeypots), reviewer calibration, and performance scoring.
- Escalation paths and requeueing logic for ambiguous or time-sensitive items.
- Access control and privacy for sensitive audio and PII.
- Audit logs and tamper-evident trail.
- Feedback loop where reviewer decisions update model thresholds, rules, and training data.
- Capacity planning for reviewers and backlog control during surges.
State reasonable assumptions where necessary and be explicit about trade-offs.
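
As an illustration of the kind of explicit assumption a design might state, the sketch below covers two of the requirements: idempotent task creation with deduplication, and severity-based prioritization with dynamic aging. Everything in it is hypothetical: the severity weights, the aging rate, the choice of `(content_id, policy_category)` as the deduplication key, and the in-memory `ReviewQueue` class are placeholders, not a prescribed solution.

```python
import hashlib

# Assumed base weights per severity and a linear aging rate (points per minute waited).
SEVERITY_WEIGHT = {"critical": 100.0, "high": 50.0, "medium": 20.0, "low": 5.0}
AGING_RATE_PER_MIN = 0.5


def task_key(content_id: str, policy_category: str) -> str:
    """Deterministic key so repeated detections map to the same review task."""
    raw = f"{content_id}|{policy_category}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def priority(severity: str, created_at_s: float, now_s: float) -> float:
    """Base severity weight plus a bonus that grows with time spent waiting."""
    waited_min = max(0.0, now_s - created_at_s) / 60.0
    return SEVERITY_WEIGHT[severity] + AGING_RATE_PER_MIN * waited_min


class ReviewQueue:
    """In-memory stand-in for a durable task store."""

    def __init__(self) -> None:
        # key -> (severity, created_at_s, detection_count)
        self._tasks: dict[str, tuple[str, float, int]] = {}

    def submit(self, content_id: str, policy_category: str,
               severity: str, created_at_s: float) -> bool:
        """Return True if a new task was created, False if it was merged."""
        key = task_key(content_id, policy_category)
        if key in self._tasks:
            old_sev, old_created, count = self._tasks[key]
            # Aggregate: keep the higher severity and the earliest creation time.
            sev = max(old_sev, severity, key=SEVERITY_WEIGHT.__getitem__)
            self._tasks[key] = (sev, min(old_created, created_at_s), count + 1)
            return False
        self._tasks[key] = (severity, created_at_s, 1)
        return True

    def pop_next(self, now_s: float):
        """Remove and return the (key, task) with the highest current priority."""
        if not self._tasks:
            return None
        key = max(self._tasks,
                  key=lambda k: priority(self._tasks[k][0], self._tasks[k][1], now_s))
        return key, self._tasks.pop(key)
```

The trade-off to state alongside a scheme like this is that aging prevents starvation of low-severity items but lets a sufficiently old low-severity task outrank a fresh critical one, so a real design might cap the aging bonus or keep critical items in a separate queue. The linear scan in `pop_next` is for clarity only; a production store would use an indexed or heap-based structure.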