Causal Diagnosis: Do More Ad Page Visits Cause More Reports?
Context
You observe a positive correlation between the number of visits to an ad's page and the probability that the ad is reported as "bad." Determine whether this reflects a causal exposure effect (more visits cause more reports) or is instead driven by selection or confounding (e.g., certain traffic sources, targeting, or instrumentation issues).
Assume you have ad–day–geo–hour level data with counts of visits and reports, plus metadata (ad ID, traffic source, geo, device, time, user-level frequency where available).
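As a rough illustration of the data layout assumed above, the sketch below defines one row per ad, day, geo, and hour and computes the per-visit report rate within visit-volume buckets. The field names, bucket cut-offs, and sample rows are hypothetical and chosen only for illustration; they are not part of the problem statement.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    # Hypothetical schema for one ad-day-geo-hour cell.
    ad_id: str
    day: str
    geo: str
    hour: int
    traffic_source: str
    visits: int
    reports: int


def bucket(visits: int) -> str:
    # Illustrative cut-offs; real ones would come from the visit distribution.
    if visits < 100:
        return "low"
    if visits < 1000:
        return "mid"
    return "high"


def rate_by_bucket(rows):
    """Pooled reports / visits within each visit-volume bucket."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [reports, visits]
    for r in rows:
        if r.visits <= 0:
            continue  # no exposure, so no per-visit rate is defined
        t = totals[bucket(r.visits)]
        t[0] += r.reports
        t[1] += r.visits
    return {b: rep / vis for b, (rep, vis) in totals.items()}


# Made-up rows, constructed so the per-visit rate is the same in every bucket.
rows = [
    Row("ad1", "2024-01-01", "US", 9, "search", 50, 1),
    Row("ad1", "2024-01-01", "US", 10, "search", 500, 10),
    Row("ad2", "2024-01-01", "DE", 9, "social", 2000, 40),
    Row("ad2", "2024-01-01", "DE", 10, "social", 50, 1),
]
rates = rate_by_bucket(rows)
print(rates)
```

In this made-up sample the rate is flat across buckets, which is what a pure exposure effect would predict: total reports rise with visits while the per-visit rate does not. A rate that climbs with volume would point toward the other hypotheses.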
Deliverables
- Hypotheses: Provide at least three concrete, falsifiable hypotheses (e.g., pure exposure effect, targeting bias to sensitive cohorts, low-quality ads drive both attention and reporting, instrumentation effects). For each, state a falsifiable prediction.
- Diagnostic plan: Specify exact metrics and cuts, such as per-visit report rate vs. visit volume buckets; within-ad fixed-effects trends; cohorting by traffic source; funnel-based conditional probabilities.
- Statistical model: Propose an ad–day model (logistic or Poisson/negative binomial) that controls for exposure via an offset log(visits) and includes ad fixed effects and hour-of-day/geo controls. Explain coefficient interpretation and how you will check overdispersion and multicollinearity.
- Quasi-experimental approach: Propose at least one (e.g., holdout geos, randomized delivery caps, or an instrumental variable (IV) using exogenous traffic shocks like site outages) and describe how you’d run a parallel A/A test to validate.
- Criteria for action: Define thresholds for an operational alert (e.g., ≥20% lift in per-visit report rate with p<0.01 after multiple-testing control) and recommend product/policy levers under each outcome.
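For the overdispersion check named in the statistical-model deliverable, one possible starting point is the Pearson dispersion statistic. The sketch below is a simplification, not the full model: it assumes a Poisson model with offset log(visits) and ad fixed effects only (no hour or geo controls), in which case the maximum-likelihood rate for each ad has the closed form total reports / total visits. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def pearson_dispersion(obs):
    """obs: iterable of (ad_id, visits, reports) cells.

    Fits reports ~ Poisson(visits * rate_ad), i.e. a Poisson model with
    offset log(visits) and one fixed effect per ad, then returns
    Pearson chi-square / residual degrees of freedom.
    """
    obs = list(obs)
    rep = defaultdict(int)
    vis = defaultdict(int)
    for ad, v, y in obs:
        rep[ad] += y
        vis[ad] += v

    # Closed-form MLE of each ad's per-visit report rate.
    rate = {ad: rep[ad] / vis[ad] for ad in vis if vis[ad] > 0}

    chi2, n, ads_used = 0.0, 0, set()
    for ad, v, y in obs:
        mu = v * rate.get(ad, 0.0)
        if mu <= 0:
            continue  # cells with zero fitted mean carry no information
        chi2 += (y - mu) ** 2 / mu
        n += 1
        ads_used.add(ad)

    dof = n - len(ads_used)
    if dof <= 0:
        raise ValueError("not enough cells per ad to estimate dispersion")
    return chi2 / dof


# Made-up cells: ad "a" follows its rate exactly, ad "b" is bursty.
cells = [
    ("a", 100, 2), ("a", 200, 4), ("a", 300, 6),
    ("b", 100, 0), ("b", 100, 10),
]
dispersion = pearson_dispersion(cells)
print(round(dispersion, 2))  # about 3.33 for this sample
```

A value near 1 is consistent with the Poisson assumption. A value well above 1, as in this made-up sample, suggests moving to a negative binomial model or using robust standard errors before interpreting coefficients.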