This question evaluates a data scientist's competency in product analytics and experimentation, covering precise metric definition and validation, instrumentation and event quality checks, rapid triage of adoption regressions, hypothesis generation, experiment design, and causal inference.
Over the last four calendar weeks, the enterprise adoption rate has fallen from 38% to 31%. Adoption rate is defined as weekly active unique users divided by total enabled users. You are the on-call analyst.
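As a concrete reading of that definition, the sketch below computes the weekly series from a hypothetical event-level table (`user_id`, `event_ts`) and an `enabled_users` roster; the column names and the Monday-start week convention are assumptions, not details given in the prompt.

```python
import pandas as pd

def weekly_adoption_rate(events: pd.DataFrame, enabled_users: pd.DataFrame) -> pd.Series:
    """Weekly active unique users divided by total enabled users."""
    events = events.copy()
    # Bucket each event into a Monday-start calendar week.
    events["week"] = events["event_ts"].dt.to_period("W-SUN").dt.start_time
    weekly_active = events.groupby("week")["user_id"].nunique()
    # Static roster here for simplicity; in practice the denominator should be
    # the enablement roster as of each week.
    total_enabled = enabled_users["user_id"].nunique()
    return weekly_active / total_enabled
```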
(a) Precisely define and validate "adoption rate". Cover event definitions, de-duplication, bot filtering, cross-device identity, time zones, and attribution windows. Describe how you would backfill corrected data and reconcile it with historical dashboards.
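As one illustration of the validation steps part (a) asks for, this sketch applies identity resolution, bot filtering, time-zone normalization, and de-duplication to a raw event frame. All column names (`device_id`, `canonical_user_id`, `user_agent`, `event_name`, `event_ts`) and the bot pattern are assumptions for the example.

```python
import pandas as pd

def clean_events(events: pd.DataFrame, identity_map: pd.DataFrame) -> pd.DataFrame:
    events = events.copy()
    # Cross-device identity: map device-level ids to a canonical user id when one exists.
    events = events.merge(identity_map[["device_id", "canonical_user_id"]],
                          on="device_id", how="left")
    events["user_id"] = events["canonical_user_id"].fillna(events["user_id"])

    # Bot filtering: drop traffic from obvious automation user agents (illustrative pattern).
    bot_pattern = r"bot|crawler|spider|headless"
    events = events[~events["user_agent"].str.contains(bot_pattern, case=False, na=False)]

    # Time zones: normalize timestamps to UTC before any weekly bucketing.
    events["event_ts"] = pd.to_datetime(events["event_ts"], utc=True)

    # De-duplication: keep one row per user, event name, and second-level timestamp.
    events["event_ts_s"] = events["event_ts"].dt.floor("s")
    events = events.drop_duplicates(subset=["user_id", "event_name", "event_ts_s"])
    return events.drop(columns=["event_ts_s"])
```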
(b) Lay out a 24-hour triage plan to distinguish instrumentation issues, seasonality, and true behavioral change. Include concrete queries/checks, holdout benchmarks, and sanity ratios.
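One way to make the sanity-ratio idea in part (b) concrete: track events per active user and the adoption numerator and denominator separately, then flag weeks that deviate from a trailing baseline. The weekly rollup columns (`week`, `events`, `active_users`, `enabled_users`) are assumptions for the sketch.

```python
import pandas as pd

def sanity_ratios(weekly: pd.DataFrame) -> pd.DataFrame:
    out = weekly.sort_values("week").copy()
    # Events per active user: a sharp drop points at lost events (instrumentation),
    # not lost users (behavior).
    out["events_per_active"] = out["events"] / out["active_users"]
    out["adoption_rate"] = out["active_users"] / out["enabled_users"]
    # Flag values more than 3 trailing standard deviations from an 8-week baseline;
    # note a denominator jump (a large enablement push) can depress the rate on its own.
    for col in ["events_per_active", "adoption_rate", "enabled_users"]:
        baseline = out[col].shift(1).rolling(8, min_periods=4)
        z = (out[col] - baseline.mean()) / baseline.std()
        out[f"{col}_anomaly"] = z.abs() > 3
    return out
```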
(c) Propose at least three falsifiable hypotheses, segmenting by customer tier, geography, client platform (web/iOS/Android), and feature usage (e.g., recording, meeting size). For each, state the evidence that would support or refute it.
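To generate and check hypotheses of the kind part (c) asks for, a segment drill-down of adoption by dimension and week shows where the drop concentrates. The sketch assumes a hypothetical user-week table with `week`, `tier`, `geo`, `platform`, and a boolean `is_active`, one row per enabled user per week.

```python
import pandas as pd

def segment_adoption(user_weeks: pd.DataFrame,
                     dims=("tier", "geo", "platform")) -> pd.DataFrame:
    frames = []
    for dim in dims:
        g = (user_weeks
             .groupby(["week", dim])["is_active"]
             .agg(adoption_rate="mean", enabled="size")
             .reset_index()
             .rename(columns={dim: "segment"}))
        g["dimension"] = dim
        frames.append(g)
    # E.g., "the decline is iOS-specific" is supported if only the iOS slice falls,
    # and refuted if the drop is roughly uniform across platforms.
    return pd.concat(frames, ignore_index=True)
```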
(d) Design experiments to reverse the decline (e.g., an onboarding nudge, a performance fix, a re-engagement reminder). For each, specify primary/secondary metrics, guardrails, minimum detectable effect (MDE), power, sample size, duration, and a ramp plan. Explain how you would choose among variants under limited traffic.
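For the sizing portion of part (d), a standard two-proportion power calculation gives the per-arm sample size, and hence the duration given weekly eligible traffic. The 31% baseline comes from the prompt; the 2-percentage-point MDE, 5% alpha, and 80% power are illustrative assumptions.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.31                    # current adoption rate from the prompt
mde = 0.02                         # minimum detectable absolute lift (assumption)
effect = proportion_effectsize(baseline + mde, baseline)   # Cohen's h

n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.80, alternative="two-sided"
)
print(f"~{n_per_arm:,.0f} enabled users per arm")
# Under limited traffic: raise the MDE, lengthen the test, or run a single
# variant rather than splitting power across many arms.
```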
(e) Show how to separate causal impact from external shocks (holidays, competitor launches) using difference-in-differences, synthetic control, or interrupted time series. State the assumptions and how to test them.
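A difference-in-differences regression is one way to frame part (e). The sketch below assumes a hypothetical account-week panel with `adoption_rate`, `treated`, `post`, `week`, and `account_id` columns; the key identifying assumption, parallel pre-period trends, is checked with treatment-by-week interactions.

```python
import pandas as pd
import statsmodels.formula.api as smf

def did_estimate(panel: pd.DataFrame) -> float:
    """DiD on an account-week panel; returns the estimated causal effect."""
    # The coefficient on treated:post is the DiD estimate, with standard
    # errors clustered by account.
    did = smf.ols("adoption_rate ~ treated * post", data=panel).fit(
        cov_type="cluster", cov_kwds={"groups": panel["account_id"]}
    )
    # Parallel-trends check: in the pre-period, treatment-by-week interaction
    # coefficients should be jointly indistinguishable from zero.
    pre = panel[panel["post"] == 0]
    trends = smf.ols("adoption_rate ~ treated * C(week)", data=pre).fit()
    print(trends.params.filter(like="treated:"))
    return did.params["treated:post"]
```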
(f) Identify risks such as metric gaming, delayed conversions, and Simpson’s paradox. Describe monitoring and drill-downs that prevent false wins and ensure reproducibility.
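For the Simpson's-paradox risk in part (f), the overall change can be decomposed into within-segment movement versus segment-mix shift; a "decline" driven entirely by mix shift (for example, a wave of newly enabled users) calls for a different response than a true behavioral drop. The per-segment columns (`segment`, `active`, `enabled`) are assumptions for the sketch.

```python
import pandas as pd

def decompose_change(week_a: pd.DataFrame, week_b: pd.DataFrame) -> pd.Series:
    """Split the adoption-rate change between two weeks into within-segment and mix-shift parts."""
    a, b = week_a.set_index("segment"), week_b.set_index("segment")
    rate_a, rate_b = a["active"] / a["enabled"], b["active"] / b["enabled"]
    share_a, share_b = a["enabled"] / a["enabled"].sum(), b["enabled"] / b["enabled"].sum()
    return pd.Series({
        "within_segment": ((rate_b - rate_a) * share_a).sum(),   # behavior change inside segments
        "mix_shift":      ((share_b - share_a) * rate_a).sum(),  # composition change across segments
        "interaction":    ((rate_b - rate_a) * (share_b - share_a)).sum(),
    })
```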