Experiment Design: Redesigned Onboarding with Network Effects and Weekly Seasonality
Background
You are launching a redesigned onboarding flow for a consumer social app. The redesign is expected to increase Day-7 activation among new users. However, onboarding can induce network effects (e.g., users invite others) and there is known weekly seasonality in traffic and behavior.
Task
Design a rigorous experiment plan that addresses the following:
- Hypothesis and metrics:
  - State the hypothesis and null/alternative.
  - Specify primary metric(s), guardrail metrics, and network-effect secondary metrics.
  - Provide exact metric definitions, denominators, attribution rules, and time windows.
- Randomization and exposure:
  - Choose the unit of randomization and exposure (user, household/device, geo, or graph/cluster) and justify your choice given potential interference from invites.
- Power, MDE, and duration:
  - Provide sample size and power analysis, target MDE, and duration assumptions.
  - Explain how you will account for weekly seasonality (e.g., run for multiples of full weeks and allow Day-7 windows to mature).
- Variance reduction:
  - Describe techniques such as CUPED with pre-period covariates, stratification, regression adjustment, or geo-matched pairs.
- SRM (sample ratio mismatch):
  - Define how you will detect SRM and what you will do if you find it.
- Sequential monitoring:
  - Provide a sequential monitoring and stopping plan (e.g., alpha spending) to avoid p-hacking.
- Ramp and holdouts:
  - Propose a ramp plan with holdouts and how you will handle novelty and learning effects.
- Data quality diagnostics:
  - Describe diagnostics for noncompliance, bot/invalid traffic, and triggered vs. assigned populations.
- Interference/spillovers:
  - Explain how you would detect and mitigate interference (cluster randomization, geo experiments, or switchback) and how you would quantify any bias if you end up using user-level randomization.
- Decision framework:
  - Describe how you would interpret outcomes if the primary and guardrail metrics disagree, and how you would decide to ship.
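
As an illustration of the kind of calculation the power, MDE, and duration item asks for, the sketch below sizes a two-sided two-proportion z-test and converts the result into whole weeks of enrollment plus a 7-day maturation window. It assumes user-level randomization with independent users; a cluster or geo design would need a design-effect inflation that is not shown. The baseline rate, MDE, and daily traffic are hypothetical placeholders, not figures from the task.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, mde_abs, alpha=0.05, power=0.80):
    """Approximate users per arm for a two-sided two-proportion z-test."""
    p_treat = p_control + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances in the two arms.
    variance_sum = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / mde_abs ** 2)


def enrollment_weeks(n_per_arm, daily_new_users, treatment_share=0.5):
    """Whole weeks of enrollment needed for the smaller arm to reach n_per_arm."""
    smaller_share = min(treatment_share, 1 - treatment_share)
    days = n_per_arm / (daily_new_users * smaller_share)
    # Round up to full weeks so every weekday is equally represented.
    return math.ceil(days / 7)


# Hypothetical inputs: 30% baseline Day-7 activation, +1 point absolute MDE,
# 20,000 new users per day split 50/50.
n = sample_size_per_arm(p_control=0.30, mde_abs=0.01)
weeks = enrollment_weeks(n, daily_new_users=20_000)
# Add 7 days so the last enrolled cohort's Day-7 window can mature.
total_days = weeks * 7 + 7
print(n, weeks, total_days)
```

Rounding enrollment up to full weeks, rather than stopping the moment the sample size is reached, is what keeps the weekday mix balanced across arms and across the analysis window.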
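
For the variance reduction item, a minimal CUPED sketch might look like the following. It assumes a covariate measured before assignment is available for each user; for brand-new users that could only be something like a signup-time attribute, which is an assumption rather than something the task specifies. The coefficient `theta` is estimated on pooled data from both arms, and the function name `cuped_adjust` and the sample values are hypothetical.

```python
from statistics import fmean, pvariance


def cuped_adjust(y, x):
    """Return (adjusted outcomes, theta), where adjusted = y - theta * (x - mean(x))."""
    x_bar, y_bar = fmean(x), fmean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    theta = cov_xy / var_x  # OLS slope of y on x
    adjusted = [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]
    return adjusted, theta


# Hypothetical toy data: x is a pre-assignment covariate, y is the outcome.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 8.0]
y_adj, theta = cuped_adjust(x=x, y=y)

# The adjustment leaves the mean unchanged and shrinks the variance
# whenever x and y are correlated.
print(fmean(y), fmean(y_adj), pvariance(y), pvariance(y_adj))
```

The treatment effect is then estimated as the difference in arm means of the adjusted outcome; because the covariate predates assignment, the adjustment does not bias that difference.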
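
One common way to detect SRM is a chi-square goodness-of-fit test of the observed arm counts against the planned allocation. The sketch below covers the two-arm case using only the standard library (for one degree of freedom, the chi-square tail probability equals `erfc(sqrt(stat / 2))`). The strict threshold of 0.001 and the counts are illustrative assumptions, not values prescribed by the task.

```python
import math


def srm_check(n_control, n_treatment, expected_treatment_share=0.5, threshold=0.001):
    """Two-arm chi-square SRM test. Returns (statistic, p_value, srm_flagged)."""
    total = n_control + n_treatment
    exp_treatment = total * expected_treatment_share
    exp_control = total - exp_treatment
    stat = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value, p_value < threshold


# Hypothetical counts for a planned 50/50 split.
balanced = srm_check(50_000, 50_000)
skewed = srm_check(50_000, 51_500)
print(balanced, skewed)
```

A flagged result is a reason to pause interpretation and investigate assignment, logging, and filtering before trusting any metric readout, rather than a quantity to correct for after the fact.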
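
For the sequential monitoring item, one possible alpha-spending schedule is the Lan-DeMets O'Brien-Fleming-type spending function, which spends very little alpha at early looks. The sketch below computes only the cumulative and incremental alpha spent at each look; deriving the actual stopping boundaries requires numerical integration that is not shown. The four equally spaced looks are an assumption for illustration.

```python
from statistics import NormalDist


def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at a given information fraction in (0, 1]."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / info_fraction ** 0.5))


# Hypothetical plan: four looks at 25%, 50%, 75%, and 100% of the planned sample.
looks = [0.25, 0.5, 0.75, 1.0]
cumulative = [obf_alpha_spent(t) for t in looks]
# Alpha newly available at each look.
incremental = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
for t, c, inc in zip(looks, cumulative, incremental):
    print(f"info={t:.2f} cumulative_alpha={c:.5f} incremental_alpha={inc:.5f}")
```

Fixing the look schedule and spending function before launch is what makes the interim checks defensible; the final look still has nearly the full alpha available, so little power is lost compared with a fixed-horizon test.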