PracHub

Investigate Harassment Surge and Mitigation

Last updated: Jun 15, 2026

Quick Overview

An Airwallex Data Scientist onsite analytics question: a monthly moderation report shows the Harassment violation type spiking, and you must decide whether the surge is real or a measurement artifact, then propose and test mitigations. It probes diagnostic analytics, prevalence-metric design, denominator effects, Simpson's paradox, selection bias, label drift, model calibration, causal inference (ITS/DiD), and experiment design with guardrails.


Company: Airwallex

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: medium

Interview Round: Onsite



Nov 14, 2025, 12:00 AM
Question

You work on content integrity. Your monthly moderation analysis shows that the Harassment violation type increased sharply in the most recent month. Assume a post is classified as violating when probability_violating > 0.5, and the monthly distribution is based on distinct posts viewed in each month. A post can belong to more than one violation type.
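The setup above fixes a classification rule and a counting unit, but it still leaves several ways to report "Harassment went up". The sketch below is one possible illustration, not a prescribed method: it computes a volume and three candidate rates from the same log. The row layout, the `views` field, the post IDs, and all numbers are hypothetical; only the `probability_violating > 0.5` rule, the distinct-post counting, and the multi-label property come from the question.

```python
from collections import defaultdict

THRESHOLD = 0.5  # from the question: violating when probability_violating > 0.5

# Hypothetical log rows: (month, post_id, violation_type, probability_violating, views)
ROWS = [
    ("2025-09", "p1", "Harassment", 0.91, 100),
    ("2025-09", "p1", "Hate", 0.72, 100),        # same post, second violation type
    ("2025-09", "p2", "Spam", 0.64, 50),
    ("2025-09", "p3", "Harassment", 0.20, 850),  # below threshold, not violating
    ("2025-10", "p4", "Harassment", 0.88, 300),
    ("2025-10", "p5", "Harassment", 0.77, 200),
    ("2025-10", "p6", "Spam", 0.10, 500),
    ("2025-10", "p4", "Harassment", 0.88, 300),  # duplicate log row
]


def monthly_harassment_metrics(rows, violation_type="Harassment", threshold=THRESHOLD):
    """Return volume and three prevalence definitions per month."""
    viewed = defaultdict(dict)    # month -> {post_id: views}, i.e. distinct viewed posts
    violating = defaultdict(set)  # month -> posts above threshold for any type
    flagged = defaultdict(set)    # month -> posts above threshold for violation_type
    for month, post_id, vtype, prob, views in rows:
        viewed[month][post_id] = views
        if prob > threshold:
            violating[month].add(post_id)
            if vtype == violation_type:
                flagged[month].add(post_id)

    metrics = {}
    for month, posts in viewed.items():
        total_views = sum(posts.values())
        flagged_views = sum(posts[p] for p in flagged[month])
        n_flagged = len(flagged[month])
        n_violating = len(violating[month])
        metrics[month] = {
            # Volume: distinct posts flagged as Harassment.
            "volume": n_flagged,
            # Share among violating posts; sensitive to other types shrinking.
            "share_of_violating": n_flagged / n_violating if n_violating else 0.0,
            # Post prevalence: denominator is all distinct viewed posts.
            "post_prevalence": n_flagged / len(posts),
            # View prevalence: weights each post by its exposure.
            "view_prevalence": flagged_views / total_views if total_views else 0.0,
        }
    return metrics


if __name__ == "__main__":
    for month, values in sorted(monthly_harassment_metrics(ROWS).items()):
        print(month, values)
```

Because sets are keyed on `post_id`, duplicate log rows do not inflate the counts. Since a post can carry several violation types, per-type shares of violating posts can sum to more than 100%, which is one reason to compare more than one denominator.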

As a Data Scientist, work through the following:

  1. Frame the surge precisely. Distinguish a volume increase (more harassment posts) from a rate increase (a higher share or prevalence). Which prevalence definitions and denominators would you compare, and over which time windows?
  2. Plausible explanations. List the most likely causes, covering both real-world and measurement-related ones, including:
    • a true increase in abusive behavior
    • traffic-mix changes across surfaces, regions, languages, or creator cohorts
    • seasonality or external events
    • coordinated attacks or repeat offenders
    • policy-definition changes
    • model-threshold changes or model-version changes
    • calibration drift in the classifier
    • data-quality, logging, or backfill issues
  3. Investigate real vs. artifact. Lay out a concrete investigation plan. Be specific about the prevalence metrics, segments, and additional datasets you would request (e.g. human-review labels, user reports, enforcement logs, model-version metadata). Your statistical reasoning must explicitly address denominator effects, Simpson's paradox, selection bias, label drift, and model calibration, and explain how you would validate model-driven metrics against human-reviewed labels.
  4. Propose interventions (if the surge is real). Recommend product, ranking, policy, operational, and ML solutions, distinguishing short-term containment from longer-term fixes.
  5. Evaluate the mitigation. Design an experiment or quasi-experiment to test a chosen intervention. Specify the primary metrics, guardrail metrics, unit of randomization (or rollout design), and the main tradeoffs involving false positives, fairness, and user experience.
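For the denominator-effect and Simpson's paradox checks in part 3, one common approach is to split the change in the overall rate into a mix effect (traffic moving between segments) and a rate effect (rates changing within segments). The sketch below is illustrative only: the segment names and counts are invented, and the decomposition shown (holding base-period rates fixed for the mix term) is one of several valid conventions.

```python
# Hypothetical per-surface counts: {segment: (posts_viewed, posts_flagged)}
BEFORE = {"feed": (9000, 90), "comments": (1000, 50)}
AFTER = {"feed": (5000, 45), "comments": (5000, 240)}


def decompose_rate_change(before, after):
    """Split the overall rate change into mix and within-segment rate effects.

    Uses the identity
        sum(w1*r1) - sum(w0*r0) = sum((w1 - w0)*r0) + sum(w1*(r1 - r0)),
    where w is a segment's share of viewed posts and r is its flagged rate.
    """
    n0 = sum(n for n, _ in before.values())
    n1 = sum(n for n, _ in after.values())
    mix_effect = 0.0
    rate_effect = 0.0
    for seg in sorted(set(before) | set(after)):
        b_n, b_k = before.get(seg, (0, 0))
        a_n, a_k = after.get(seg, (0, 0))
        w0, w1 = b_n / n0, a_n / n1
        r0 = b_k / b_n if b_n else 0.0
        r1 = a_k / a_n if a_n else 0.0
        mix_effect += (w1 - w0) * r0   # traffic shifted toward or away from seg
        rate_effect += w1 * (r1 - r0)  # rate changed inside seg
    return {
        "overall_before": sum(k for _, k in before.values()) / n0,
        "overall_after": sum(k for _, k in after.values()) / n1,
        "mix_effect": mix_effect,
        "rate_effect": rate_effect,
    }


if __name__ == "__main__":
    for name, value in decompose_rate_change(BEFORE, AFTER).items():
        print(f"{name}: {value:.4f}")
```

In this made-up data the flagged rate falls in both segments, yet the overall rate roughly doubles because traffic shifts toward the higher-rate segment. A positive mix effect next to a negative rate effect is the signature to look for before concluding that behavior got worse.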
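For the calibration and human-label validation asked for in part 3, a minimal sketch is to bin classifier scores, compare the mean score in each bin with the human-labeled violation rate, and measure precision above the 0.5 threshold. The scores and labels below are invented. The sketch also assumes the audit sample is a random draw of viewed posts; if reviewers only see flagged or reported posts, the comparison inherits that selection bias.

```python
# Hypothetical audit sample: classifier scores and human labels (1 = harassment).
SCORES = [0.1, 0.1, 0.3, 0.3, 0.7, 0.7, 0.9, 0.9]
LABELS = [0, 0, 0, 1, 1, 0, 1, 1]


def reliability_table(scores, labels, n_bins=5):
    """Return (bin_low, bin_high, n, mean_score, human_positive_rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for score, label in zip(scores, labels):
        idx = min(int(score * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((score, label))
    table = []
    for i, items in enumerate(bins):
        if not items:
            continue
        mean_score = sum(s for s, _ in items) / len(items)
        positive_rate = sum(y for _, y in items) / len(items)
        table.append((i / n_bins, (i + 1) / n_bins, len(items), mean_score, positive_rate))
    return table


def expected_calibration_error(table):
    """Sample-weighted mean absolute gap between mean score and human positive rate."""
    total = sum(n for _, _, n, _, _ in table)
    return sum(n / total * abs(mean - rate) for _, _, n, mean, rate in table)


def precision_above_threshold(scores, labels, threshold=0.5):
    """Share of posts with score > threshold that humans also labeled as violating."""
    flagged = [y for s, y in zip(scores, labels) if s > threshold]
    return sum(flagged) / len(flagged) if flagged else 0.0


if __name__ == "__main__":
    table = reliability_table(SCORES, LABELS)
    for row in table:
        print(row)
    print("ECE:", expected_calibration_error(table))
    print("precision above 0.5:", precision_above_threshold(SCORES, LABELS))
```

Running the same check on audit samples from the month before and the month of the surge, and separately per model version if that metadata is available, helps tell calibration drift apart from a real change. A drop in precision above the threshold would suggest that part of the flagged volume is made up of false positives.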
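For part 5, when user-level randomization is not feasible and the intervention is rolled out to some regions first, a difference-in-differences (DiD) comparison is one quasi-experimental option. The sketch below uses invented counts and hypothetical group names. It relies on the usual parallel-trends assumption, and its standard error treats posts as independent binomial draws, which likely understates uncertainty when posts cluster by user or region.

```python
import math

# Hypothetical (posts_viewed, posts_flagged_harassment) per group and period.
TREATED_PRE = (10_000, 200)
TREATED_POST = (10_000, 150)
CONTROL_PRE = (10_000, 180)
CONTROL_POST = (10_000, 190)


def rate(cell):
    """Flagged rate for one (posts_viewed, posts_flagged) cell."""
    n, k = cell
    return k / n


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (rate(treated_post) - rate(treated_pre)) - (
        rate(control_post) - rate(control_pre)
    )


def did_standard_error(*cells):
    """Naive SE: square root of the summed binomial variances p * (1 - p) / n."""
    return math.sqrt(sum(rate(c) * (1 - rate(c)) / c[0] for c in cells))


if __name__ == "__main__":
    cells = (TREATED_PRE, TREATED_POST, CONTROL_PRE, CONTROL_POST)
    effect = did_estimate(*cells)
    se = did_standard_error(*cells)
    print(f"DiD effect on prevalence: {effect:+.4f}")
    print(f"approx. 95% interval: [{effect - 1.96 * se:+.4f}, {effect + 1.96 * se:+.4f}]")
```

The same calculation can be repeated for guardrail metrics such as the appeal-overturn rate or engagement, so that a reduction in harassment prevalence is weighed against false positives and user experience. In practice, a regression with clustered standard errors would be a more defensible version of this estimate.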

