Causal Impact of a New Onboarding Flow Launched in Texas and Florida
Context: A new onboarding flow was launched on 2025-07-15 only in Texas (TX) and Florida (FL). A line chart shows a 3% nationwide DAU dip about a week later. You are asked to design a defensible causal analysis and provide precise computations and recommendations.
Assumptions for clarity:
- Outcome is daily active users (DAU) at the state level.
- Pre period: 2025-06-15 to 2025-07-14. Post period: 2025-07-15 to 2025-08-14.
- Control pool is all other US states (excluding TX and FL).
Tasks
- Causal question and estimand
  - State the causal question: Did the onboarding feature cause a change in DAU in the treated states (TX, FL)?
  - Specify the estimand clearly and write the difference-in-differences (DiD) formula explicitly (in both level and relative/log terms).
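One common way to write the estimand is as the average treatment effect on the treated (ATT) over the post period. The notation below is an assumption made for illustration and does not come from the prompt: $\bar{Y}_{g,p}$ is the mean daily DAU of group $g \in \{T, C\}$ (treated, control) in period $p \in \{\text{pre}, \text{post}\}$.

```latex
\begin{aligned}
\hat{\tau}_{\text{level}} &= \left(\bar{Y}_{T,\text{post}} - \bar{Y}_{T,\text{pre}}\right) - \left(\bar{Y}_{C,\text{post}} - \bar{Y}_{C,\text{pre}}\right) \\
\hat{\tau}_{\log} &= \left(\ln \bar{Y}_{T,\text{post}} - \ln \bar{Y}_{T,\text{pre}}\right) - \left(\ln \bar{Y}_{C,\text{post}} - \ln \bar{Y}_{C,\text{pre}}\right) \\
\text{relative effect} &= e^{\hat{\tau}_{\log}} - 1
\end{aligned}
```

Because the control pool is about 2.5 times the size of the combined treated group, the raw level form mixes a scale difference with the effect. The log form, or a level form that first scales control growth to the treated baseline, is usually easier to defend here.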
- Compute DiD using provided daily averages
  - Summary (daily averages):
    - Texas: pre = 500k, post = 515k
    - Florida: pre = 300k, post = 303k
    - Control pool (other states): pre = 2,000k, post = 2,060k
  - Compute DiD for each treated state and for the combined treated group (population-weighted by pre-period DAU). Interpret the sign and magnitude.
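The arithmetic for this task can be sketched as follows. The inputs are the daily averages from the summary. The function name `did_estimates` and the choice to report four variants (relative, log, scaled level, raw level) are assumptions for illustration, not something the prompt prescribes.

```python
import math

# Daily-average DAU in thousands, copied from the summary above.
PRE = {"TX": 500.0, "FL": 300.0, "control": 2000.0}
POST = {"TX": 515.0, "FL": 303.0, "control": 2060.0}


def did_estimates(t_pre, t_post, c_pre, c_post):
    """Return four DiD variants for one treated unit against the control pool."""
    control_growth = c_post / c_pre
    return {
        # Difference in growth rates, as a proportion (0.01 = 1 percentage point).
        "relative": (t_post / t_pre - 1) - (control_growth - 1),
        # Difference in log changes.
        "log": math.log(t_post / t_pre) - math.log(control_growth),
        # Observed post minus a counterfactual growing at the control rate (thousands).
        "scaled_level": t_post - t_pre * control_growth,
        # Raw level DiD, kept only to show the scale problem (thousands).
        "raw_level": (t_post - t_pre) - (c_post - c_pre),
    }


results = {
    state: did_estimates(PRE[state], POST[state], PRE["control"], POST["control"])
    for state in ("TX", "FL")
}

# Summing TX and FL before taking the ratio is equivalent to weighting each
# state's growth rate by its pre-period DAU.
results["TX+FL"] = did_estimates(
    PRE["TX"] + PRE["FL"],
    POST["TX"] + POST["FL"],
    PRE["control"],
    POST["control"],
)

for name, est in results.items():
    print(
        f"{name}: relative={est['relative'] * 100:+.2f} pp, "
        f"log={est['log']:+.4f}, "
        f"scaled_level={est['scaled_level']:+.1f}k, "
        f"raw_level={est['raw_level']:+.1f}k"
    )
```

With these inputs, Texas grows at the same 3.0% as the control pool (relative DiD of zero), Florida grows 1.0% against 3.0% (about -2.0 percentage points), and the combined group comes out at about -0.75 percentage points, or roughly 6k fewer daily users than the scaled counterfactual. The raw level DiD for the combined group is -42k, which mostly reflects the size gap between the groups rather than an effect.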
- Segmented (interrupted) time-series regression
  - The chart shows weekend troughs and a visible break on 2025-07-20.
  - Outline a segmented regression design to estimate immediate level change and slope change for treated vs. control.
  - Provide the regression equation and describe how you would cluster standard errors.
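One possible specification is a comparative interrupted time series on log DAU. All symbols are assumptions introduced here: $s$ indexes states, $t$ is the day index, $t_0$ is the break date, $D_t = \mathbf{1}[t \ge t_0]$, $T_s = 1$ for TX and FL, $\alpha_s$ are state fixed effects, and $\gamma_{\text{dow}(t)}$ are day-of-week effects that absorb the weekend troughs.

```latex
\ln Y_{s,t} = \alpha_s + \gamma_{\text{dow}(t)}
  + \beta_1 t + \beta_2 D_t + \beta_3 (t - t_0) D_t
  + \delta_0 \, T_s t + \delta_1 \, T_s D_t + \delta_2 \, T_s (t - t_0) D_t
  + \varepsilon_{s,t}
```

In this sketch $\delta_1$ is the immediate level change in treated states relative to control, $\delta_2$ is the relative slope change, and $\delta_0$ captures any differential pre-trend, which should be near zero if parallel trends hold. A natural default is $t_0$ = 2025-07-15 (launch), with a rerun at 2025-07-20 (the visible break) as a sensitivity check. Standard errors would typically be clustered by state to allow for serial correlation within a state. With only two treated clusters, conventional cluster-robust errors tend to be unreliable, so a wild cluster bootstrap or permutation inference over placebo states is worth considering.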
- Guardrail metrics
  - Propose at least three guardrails (e.g., crash rate, p95 latency, payment decline rate), define decision thresholds, and specify which are one-sided vs. two-sided.
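The difference between a one-sided and a two-sided guardrail can be made concrete with a small check. This is a minimal sketch under a normal approximation. The function `guardrail_passes`, the margins, and the sidedness assigned to each metric are hypothetical placeholders for illustration, not recommended thresholds.

```python
from statistics import NormalDist


def guardrail_passes(delta, se, margin, sided, alpha=0.05):
    """Check one guardrail under a normal approximation.

    delta  : estimated treated-minus-control change in the metric
    se     : standard error of delta
    margin : largest tolerated change, in the metric's own units
    sided  : "upper" passes when the one-sided (1 - alpha) upper bound is
             below +margin (only increases are harmful);
             "two" passes when the two-sided (1 - alpha) interval lies
             strictly inside (-margin, +margin).
    """
    if sided == "upper":
        z = NormalDist().inv_cdf(1 - alpha)
        return delta + z * se < margin
    if sided == "two":
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return -margin < delta - z * se and delta + z * se < margin
    raise ValueError(f"unknown sidedness: {sided!r}")


# Hypothetical guardrails: (margin, sidedness). Values are placeholders.
GUARDRAILS = {
    "crash_rate_pp": (0.10, "upper"),          # only an increase is harmful
    "p95_latency_ms": (50.0, "upper"),         # only an increase is harmful
    "payment_decline_rate_pp": (0.50, "two"),  # a sharp drop may signal broken logging
}
```

For example, `guardrail_passes(0.04, 0.01, 0.05, "upper")` fails because the upper bound exceeds the margin, while the same metric with `delta=-0.04` passes a one-sided check and fails a two-sided one.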
- Power and duration for DiD
  - Given: average daily DAU in treated combined ≈ 815k, minimum detectable effect (MDE) = 0.5% relative on DAU, alpha = 0.05, power = 0.8.
  - Estimate required days under a parallel-trends DiD. State assumptions and whether CUPED or synthetic controls would reduce duration.
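The duration depends on a quantity the prompt does not give: the day-to-day noise in the treated-minus-control gap. A rough sketch follows, assuming a two-sided z-test on the log-scale DiD, independent daily residuals after removing day-of-week effects, and a fixed 30-day pre period. The function `required_post_days` and the `sigma_daily` values are hypothetical and should be replaced with an estimate from pre-period data.

```python
import math
from statistics import NormalDist


def required_post_days(sigma_daily, mde_log, n_pre_days, alpha=0.05, power=0.8):
    """Smallest number of post days such that a DiD on daily log gaps detects mde_log.

    Assumes Var(DiD) = sigma_daily**2 * (1 / n_pre_days + 1 / n_post_days),
    i.e. independent days. Returns None when the pre period alone is too
    short for the target to be reachable at any post length.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Requirement: z**2 * sigma**2 * (1/n_pre + 1/n_post) <= mde**2.
    slack = (mde_log / (z * sigma_daily)) ** 2 - 1 / n_pre_days
    if slack <= 0:
        return None
    return math.ceil(1 / slack)


MDE_RELATIVE = 0.005   # 0.5% relative, from the task
TREATED_DAU = 815_000  # combined treated daily DAU, from the task
N_PRE_DAYS = 30        # 2025-06-15 to 2025-07-14

mde_users = MDE_RELATIVE * TREATED_DAU  # absolute size of the MDE in users/day
mde_log = math.log(1 + MDE_RELATIVE)    # same MDE on the log scale

# Hypothetical noise levels for the daily log gap (NOT from the source).
for sigma_daily in (0.005, 0.01):
    days = required_post_days(sigma_daily, mde_log, N_PRE_DAYS)
    print(f"sigma={sigma_daily}: required post days = {days}")
```

Under these assumptions a daily gap noise of 0.5% needs about 11 post days, while 1% noise is not detectable at any duration with only 30 pre days, because the pre-period mean is itself too noisy. Positive autocorrelation would lengthen these figures. CUPED or a synthetic control helps only insofar as it lowers `sigma_daily`, for example by building a control that tracks the treated states more closely.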
- Parallel trends and spillovers
  - Around 2025-07-27, a marketing campaign ran in California, which is part of the control pool, and added noise to the control series. Explain how to validate the parallel trends assumption and how to choose a donor pool or weights to mitigate spillovers.
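One simple diagnostic is to regress the pre-period log gap between treated and control on time and confirm the slope is near zero, after dropping contaminated states from the donor pool. The sketch below uses only the standard library. The helper names and the toy series are made up for illustration and are not data from the analysis.

```python
import math


def pretrend_slope(treated, control):
    """OLS slope of log(treated / control) on the day index.

    A slope near zero over the pre period is consistent with parallel trends
    on the log scale. It does not prove the assumption holds after launch.
    """
    gaps = [math.log(t / c) for t, c in zip(treated, control)]
    n = len(gaps)
    x_mean = (n - 1) / 2
    g_mean = sum(gaps) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (g - g_mean) for x, g in enumerate(gaps))
    return sxy / sxx


def donor_pool(states, excluded):
    """Keep only states that are neither treated nor known to be contaminated."""
    return [s for s in states if s not in excluded]


# Toy pre-period series (made up): control grows 0.1% per day.
control = [2000.0 * 1.001 ** d for d in range(30)]
parallel = [0.4 * c for c in control]                 # same growth rate
diverging = [800.0 * 1.002 ** d for d in range(30)]   # faster growth

slope_parallel = pretrend_slope(parallel, control)
slope_diverging = pretrend_slope(diverging, control)

# Drop the treated states and California (campaign around 2025-07-27).
donors = donor_pool(["CA", "NY", "TX", "FL", "WA"], excluded={"TX", "FL", "CA"})
```

In practice the same check would be run per candidate donor, and states bordering TX or FL, or sharing their media markets, could also be excluded or down-weighted if spillovers are suspected.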
- Recommendation and additional data
  - Provide a brief go/no-go recommendation and list the exact additional data you would request to de-risk the decision.