Scenario
You are on the analytics team at a large social media platform. A positive correlation has been observed between the number of page visits to an advertisement and the number of user reports labeling it as bad content. An investigation is needed to understand what drives this pattern and to define how to flag top bad advertisers.
Task 1: Define 'top bad advertiser' in SQL
-
Propose clear metrics and conditions to identify advertisers with high incidents of reported ads.
-
Include rate-based metrics (not just counts), minimum volume thresholds, deduplication rules, and any severity weighting.
-
Provide example SQL that aggregates events and ranks advertisers accordingly.
Task 2: Investigate the visit–report correlation
-
Propose a hypothesis for why ad page visits are positively correlated with reports.
-
Outline a plan to validate or refute the hypothesis, calling out correlation vs. causation.
-
Describe the data fields and breakdowns you would use (e.g., ad content, engagement types, user demographics, placement, time).
-
Include example SQL to compute and analyze these patterns.
Hints
-
Do not conflate correlation with causation; control for exposure and composition of traffic.
-
Consider confounders: ad category, creative type, placement, region, device, time, and user segments.
-
Think about unique reporters vs. total reports, verification by moderation, and report timing relative to visits.