Investigate Harassment Surge and Mitigation
Company: Airwallex
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: medium
Interview Round: Onsite
##### Question
You work on content integrity. Your monthly moderation analysis shows that the `Harassment` violation type increased sharply in the most recent month. Assume a post is classified as violating when `probability_violating > 0.5`, and that the monthly distribution of violation types is computed over the distinct posts viewed in each month. A post can belong to more than one violation type.
As a Data Scientist, work through the following:
1. **Frame the surge precisely.** Distinguish a *volume* increase (more harassment posts) from a *rate* increase (a higher share or prevalence). Which prevalence definitions and denominators would you compare, and over which time windows?
2. **Plausible explanations.** List the most likely causes, covering both real-world and measurement-related ones, including:
- a true increase in abusive behavior
- traffic-mix changes across surfaces, regions, languages, or creator cohorts
- seasonality or external events
- coordinated attacks or repeat offenders
- policy-definition changes
- model-threshold changes or model-version changes
- calibration drift in the classifier
- data-quality, logging, or backfill issues
3. **Investigate real vs. artifact.** Lay out a concrete investigation plan. Be specific about the prevalence metrics, segments, and additional datasets you would request (e.g. human-review labels, user reports, enforcement logs, model-version metadata). Your statistical reasoning must explicitly address **denominator effects, Simpson's paradox, selection bias, label drift, and model calibration**, and explain how you would validate model-driven metrics against human-reviewed labels.
4. **Propose interventions (if the surge is real).** Recommend product, ranking, policy, operational, and ML solutions, distinguishing short-term containment from longer-term fixes.
5. **Evaluate the mitigation.** Design an experiment or quasi-experiment to test a chosen intervention. Specify the primary metrics, guardrail metrics, unit of randomization (or rollout design), and the main tradeoffs involving false positives, fairness, and user experience.
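The volume-versus-rate distinction in part 1 can be made concrete with a small sketch. It assumes a hypothetical row layout (`month`, `post_id`, `violation_type`, `probability_violating`), with one row per viewed post and violation type. Only the `> 0.5` threshold and the distinct-posts-viewed denominator come from the question; the rest is illustrative.

```python
from collections import defaultdict

THRESHOLD = 0.5  # from the question: violating when probability_violating > 0.5


def monthly_harassment_metrics(rows):
    """Return {month: (volume, rate)} for the Harassment type.

    volume = distinct viewed posts flagged as Harassment
    rate   = volume / distinct posts viewed that month
    The row keys are assumed names, not a known schema.
    """
    viewed = defaultdict(set)
    flagged = defaultdict(set)
    for r in rows:
        viewed[r["month"]].add(r["post_id"])
        is_harassment = r["violation_type"] == "Harassment"
        if is_harassment and r["probability_violating"] > THRESHOLD:
            flagged[r["month"]].add(r["post_id"])
    return {
        m: (len(flagged[m]), len(flagged[m]) / len(viewed[m]))
        for m in sorted(viewed)
    }


def make_rows(month, n_posts, harassment_ids):
    """Build toy rows; every post is scored for two violation types."""
    rows = []
    for i in range(n_posts):
        post_id = f"{month}-p{i}"
        score = 0.9 if i in harassment_ids else 0.1
        rows.append({"month": month, "post_id": post_id,
                     "violation_type": "Harassment",
                     "probability_violating": score})
        rows.append({"month": month, "post_id": post_id,
                     "violation_type": "Spam",
                     "probability_violating": 0.1})
    return rows


rows = make_rows("2024-01", 4, {0}) + make_rows("2024-02", 10, {0, 1})
metrics = monthly_harassment_metrics(rows)
for month, (volume, rate) in metrics.items():
    print(month, volume, round(rate, 3))
```

In this toy data the harassment volume doubles (1 to 2 posts) while the rate falls (25% to 20%) because the number of viewed posts grew faster. This is the denominator effect the question asks you to rule in or out before calling the surge real.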
Quick Answer: This Airwallex Data Scientist onsite analytics question presents a monthly moderation report in which the Harassment violation type spikes. You must decide whether the surge is real or a measurement artifact, then propose and test mitigations. It probes diagnostic analytics, prevalence-metric design, denominator effects, Simpson's paradox, selection bias, label drift, model calibration, causal inference (interrupted time series and difference-in-differences), and experiment design with guardrails.
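One way to check for a traffic-mix (Simpson's paradox) explanation is to split the change in the aggregate rate into a within-segment part and a mix part. The sketch below is a standard two-term decomposition, not a method prescribed by the question. The segment names and counts are invented for illustration.

```python
import math


def decompose_rate_change(prev, curr):
    """Split the aggregate rate change into within-segment and mix effects.

    prev / curr: {segment: (posts_viewed, posts_flagged)} for two months.
    within = sum_s w0_s * (r1_s - r0_s)   (rates moved, mix held at month 0)
    mix    = sum_s (w1_s - w0_s) * r1_s   (traffic shares moved)
    The two terms add up exactly to the aggregate change.
    """
    n_prev = sum(posts for posts, _ in prev.values())
    n_curr = sum(posts for posts, _ in curr.values())
    within = 0.0
    mix = 0.0
    for seg in set(prev) | set(curr):
        p0, f0 = prev.get(seg, (0, 0))
        p1, f1 = curr.get(seg, (0, 0))
        w0, w1 = p0 / n_prev, p1 / n_curr
        r0 = f0 / p0 if p0 else 0.0
        r1 = f1 / p1 if p1 else 0.0
        within += w0 * (r1 - r0)
        mix += (w1 - w0) * r1
    total = (sum(f for _, f in curr.values()) / n_curr
             - sum(f for _, f in prev.values()) / n_prev)
    return total, within, mix


# Hypothetical surfaces: per-surface rates are unchanged (1% and 5%),
# but the higher-rate surface grows from 20% to 50% of viewed posts.
prev_month = {"feed": (8000, 80), "comments": (2000, 100)}
curr_month = {"feed": (5000, 50), "comments": (5000, 250)}

total, within, mix = decompose_rate_change(prev_month, curr_month)
assert math.isclose(total, within + mix, abs_tol=1e-12)
print(f"total={total:.4f} within={within:.4f} mix={mix:.4f}")
```

In this example the aggregate rate rises from 1.8% to 3.0% even though no surface became more abusive. The entire change sits in the mix term, which points to a composition shift and not a behavioral one. The same split can be repeated by region, language, or creator cohort.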