A product team at a large software company launches a new feature intended to improve user activation and downstream retention. You are asked to evaluate whether the feature is successful.
- Define an appropriate primary metric, secondary metrics, and guardrail metrics. Be explicit about the tradeoffs between short-term engagement metrics and longer-term business metrics.
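To make the primary-metric choice concrete, here is a minimal sketch of computing a 7-day activation rate. All user IDs, dates, and the notion of a "key action" are hypothetical; the real definition of activation is product-specific.

```python
from datetime import date, timedelta

# Hypothetical per-user data: signup date and dates of the "key action"
# that defines activation for this product.
signups = {"u1": date(2024, 1, 1), "u2": date(2024, 1, 1), "u3": date(2024, 1, 2)}
key_actions = {
    "u1": [date(2024, 1, 3)],                     # within 7 days -> activated
    "u2": [],                                     # never acted
    "u3": [date(2024, 1, 15), date(2024, 1, 20)], # too late -> not activated
}

def activated(user, window_days=7):
    """True if the user performed the key action within window_days of signup."""
    deadline = signups[user] + timedelta(days=window_days)
    return any(d <= deadline for d in key_actions[user])

activation_rate = sum(activated(u) for u in signups) / len(signups)
# Here: 1 of 3 users activated within the window.
```

A windowed rate like this is attractive as a primary metric because it is defined per user at a fixed horizon, which avoids the ambiguity of open-ended engagement counts.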
- Explain how you would design a standard randomized A/B test if randomization were possible, including the unit of randomization, success criteria, power and minimum detectable effect (MDE) considerations, and common validity checks (for example, sample ratio mismatch and A/A tests).
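One concrete piece of the power/MDE discussion is the required sample size per arm. A sketch using the usual normal approximation for a two-sided two-proportion test follows; the baseline rate and MDE values are illustrative, not from the source.

```python
from statistics import NormalDist

def sample_size_per_arm(p_base, mde_abs, alpha=0.05, power=0.80):
    """Approximate n per arm for a two-sided two-proportion z-test.

    p_base:  baseline conversion rate (e.g., activation rate)
    mde_abs: minimum detectable effect, absolute (0.01 = 1 percentage point)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_beta = NormalDist().inv_cdf(power)           # power requirement
    p_treat = p_base + mde_abs
    p_bar = (p_base + p_treat) / 2
    num = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_beta * (p_base * (1 - p_base) + p_treat * (1 - p_treat)) ** 0.5) ** 2
    return int(num / mde_abs ** 2) + 1

# Illustrative: 20% baseline activation, detect a 1-point absolute lift.
n = sample_size_per_arm(0.20, 0.01)  # roughly 25-26k users per arm
```

The key practical point this makes visible is the quadratic cost of precision: halving the MDE roughly quadruples the required sample, which is often the binding constraint on experiment duration.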
- Now assume a true randomized experiment is not feasible, for example because the feature has already been partially rolled out, or because legal or operational constraints prevent random assignment. Describe several counterfactual estimation approaches you could use instead, such as difference-in-differences, matching or propensity-score methods, synthetic control, regression discontinuity, or instrumental variables. For each method, state the key identifying assumptions and the major sources of bias.
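As a minimal illustration of one of these methods, a two-group, two-period difference-in-differences estimate can be written in a few lines. All numbers here are hypothetical, and the estimate is only valid under the parallel-trends assumption mentioned as a key assumption of the method.

```python
# Mean metric values (hypothetical): treated group got the feature, control did not.
treated_pre, treated_post = 0.30, 0.36
control_pre, control_post = 0.28, 0.31

# DiD: the change in the treated group minus the change in the control group.
# This differences out any time-invariant level gap between the groups and
# the common time trend -- but only if, absent treatment, both groups would
# have moved in parallel (the parallel-trends assumption).
did = (treated_post - treated_pre) - (control_post - control_pre)
# Estimated lift: roughly 0.03, i.e. 3 percentage points.
```

A natural follow-up in an answer is how to probe the assumption: plot pre-period trends for both groups, or run a placebo DiD on periods before launch and check that the estimate is near zero.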
- Suppose the core product metric suddenly drops on one specific day after launch. Describe how you would determine whether this is a real causal product effect versus a logging issue, data pipeline problem, traffic mix shift, seasonality, or an external event.
Throughout, your answer should discuss confounding, selection bias, interference between units, Simpson's paradox, and how you would communicate uncertainty to stakeholders.
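The traffic-mix-shift check and the Simpson's-paradox point both come down to decomposing the aggregate metric by segment. A toy example (hypothetical numbers) where every segment's rate is flat day-over-day, yet the aggregate drops purely because the mix of traffic shifted:

```python
# segment -> (users, conversions) for two days; per-segment rates are unchanged.
day1 = {"mobile": (1000, 100), "desktop": (4000, 800)}  # rates: 0.10, 0.20
day2 = {"mobile": (4000, 400), "desktop": (1000, 200)}  # rates: 0.10, 0.20

def aggregate_rate(day):
    """Pooled conversion rate across all segments."""
    users = sum(u for u, _ in day.values())
    conversions = sum(c for _, c in day.values())
    return conversions / users

r1 = aggregate_rate(day1)  # 900 / 5000 = 0.18
r2 = aggregate_rate(day2)  # 600 / 5000 = 0.12
# The aggregate drops 6 percentage points even though no segment's rate moved:
# a mix shift, not a product effect.
```

This is why a sudden one-day drop should be sliced by platform, geography, traffic source, and logging version before any causal story is entertained: if segment-level rates are stable, the drop is compositional.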