Analyze A/B test with rigorous diagnostics

This question tests a data scientist's competency in experimental design and rigorous A/B test analysis. It covers covariate balance checks, primary metric and guardrail definition, intent-to-treat estimation with analytic and bootstrap confidence intervals, variance reduction via CUPED, instrumental-variable (2SLS) estimation of the CACE, subgroup heterogeneity with multiple-testing control, power/MDE assessment, sequential testing diagnostics, and visualization of treatment effects. Commonly asked in Analytics & Experimentation interviews, it assesses both conceptual understanding of causal inference and statistical diagnostics and the practical skill of implementing a robust analysis and interpreting its diagnostic output.

Question

A/B Test Analysis Live Walkthrough (Python)

Context

You are given a user-level randomized experiment dataset experiment.csv with columns:

user_id
variant ∈ {A, B}
assign_ts (UTC timestamp)
saw_treatment (0/1; whether the user actually saw the treatment)
country (categorical)
device (categorical)
pre_metric (pre-experiment baseline metric)
active_minutes_d7
paid_d7 (0/1)
revenue_d7
sessions_d7
crashes_d7

Assumptions:

One row per unique user (if duplicates exist, keep the earliest assign_ts per user).
Randomization occurred at the user level.
Outcomes are 7-day metrics post-assignment.
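
The first assumption implies a cleaning step before any analysis. A minimal sketch with pandas follows, assuming assign_ts parses as a UTC timestamp. The helper names dedupe_users and users_in_multiple_variants and the toy rows are hypothetical and only illustrate the idea. The second helper flags users who appear under more than one variant, a case the problem statement does not address but which would undermine the user-level randomization assumption.

```python
import pandas as pd

def dedupe_users(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the earliest-assigned row per user_id (hypothetical helper)."""
    out = df.copy()
    out["assign_ts"] = pd.to_datetime(out["assign_ts"], utc=True)
    # Stable sort so rows tied on assign_ts keep their original file order.
    out = out.sort_values("assign_ts", kind="stable")
    return out.drop_duplicates(subset="user_id", keep="first").reset_index(drop=True)

def users_in_multiple_variants(df: pd.DataFrame) -> list:
    """Return user_ids that were logged under more than one variant."""
    n_variants = df.groupby("user_id")["variant"].nunique()
    return sorted(n_variants[n_variants > 1].index)

# Toy rows for illustration only; the real input would be experiment.csv.
raw = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3],
    "variant": ["A", "A", "B", "A", "B"],
    "assign_ts": [
        "2024-01-02T00:00:00Z",
        "2024-01-01T00:00:00Z",
        "2024-01-01T12:00:00Z",
        "2024-01-03T00:00:00Z",
        "2024-01-04T00:00:00Z",
    ],
})

clean = dedupe_users(raw)
crossed = users_in_multiple_variants(raw)
print(clean)
print("users seen in more than one variant:", crossed)
```

How to treat users seen in both variants (drop them, or keep the first assignment) is a judgment call that should be stated explicitly in the analysis.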

Tasks

Using Python, do the following:

1. Verify randomization via covariate balance tests and visualizations.
2. Define and justify the primary metric and guardrails.
3. Compute the ITT (intent-to-treat) for the primary metric with 95% CIs using both:
   - Analytic normal approximation (CLT) with cluster-robust SE at the user level.
   - Bootstrap (stratified by variant).
4. Apply CUPED using pre_metric and report variance reduction.
5. Handle noncompliance by estimating CACE via 2SLS (instrument: variant → saw_treatment). Discuss IV assumptions and diagnostics.
6. Check heterogeneity by country and device with multiple-testing control (e.g., Benjamini–Hochberg).
7. Assess power and MDE given observed variance and sample size.
8. Evaluate sequential peeking risk and show how a spending function or alpha-adjusted boundary would change conclusions.
9. Produce plots (ECDFs, quantile treatment effects, covariate-binned effects) to support findings.
10. Recommend ship/no-ship and call out the top two residual risks.
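
For the ITT task, one possible sketch is shown here. It assumes, purely for illustration, that active_minutes_d7 is chosen as the primary metric, and it uses synthetic data in place of experiment.csv. With one row per user, a user-level cluster-robust standard error reduces to a heteroskedasticity-robust (unequal-variance) standard error for the difference in means, which is what itt_analytic computes. The function names are hypothetical.

```python
import numpy as np

def itt_analytic(y_a, y_b, z=1.96):
    """Difference in means (B - A) with an unequal-variance normal-approximation CI."""
    diff = y_b.mean() - y_a.mean()
    se = np.sqrt(y_a.var(ddof=1) / len(y_a) + y_b.var(ddof=1) / len(y_b))
    return diff, (diff - z * se, diff + z * se)

def itt_bootstrap(y_a, y_b, n_boot=2000, seed=0):
    """Percentile bootstrap CI, resampling users separately within each variant."""
    rng = np.random.default_rng(seed)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        a = rng.choice(y_a, size=len(y_a), replace=True)
        b = rng.choice(y_b, size=len(y_b), replace=True)
        diffs[i] = b.mean() - a.mean()
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return lo, hi

# Synthetic stand-in for the outcome column (illustration only, not real data).
rng = np.random.default_rng(42)
y_a = rng.gamma(shape=2.0, scale=10.0, size=5000)
y_b = rng.gamma(shape=2.0, scale=10.0, size=5000) + 1.0

diff, ci = itt_analytic(y_a, y_b)
boot_lo, boot_hi = itt_bootstrap(y_a, y_b)
print(f"ITT = {diff:.3f}")
print(f"analytic 95% CI  = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"bootstrap 95% CI = ({boot_lo:.3f}, {boot_hi:.3f})")
```

If the two intervals disagree noticeably, that usually points to heavy tails or skew in the metric, which is worth checking before trusting the normal approximation.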
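
For the CUPED task, a minimal sketch under stated assumptions: the adjustment is y - theta * (x - mean(x)) with theta = cov(x, y) / var(x), estimated on the pooled sample, and variance reduction is reported as one minus the ratio of squared standard errors of the difference in means. The data are synthetic, and the names cuped_adjust and diff_and_se are hypothetical.

```python
import numpy as np

def cuped_adjust(y, x):
    """Return the CUPED-adjusted outcome and the pooled theta = cov(x, y) / var(x)."""
    theta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean()), theta

def diff_and_se(y, is_b):
    """Difference in means (B - A) and its unequal-variance standard error."""
    a, b = y[~is_b], y[is_b]
    diff = b.mean() - a.mean()
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return diff, se

# Synthetic data (illustration only): the outcome is correlated with pre_metric.
rng = np.random.default_rng(7)
n = 4000
is_b = rng.integers(0, 2, size=n).astype(bool)   # True means variant B
pre = rng.normal(50.0, 10.0, size=n)             # stand-in for pre_metric
y = 0.8 * pre + rng.normal(0.0, 5.0, size=n) + 1.0 * is_b

y_adj, theta = cuped_adjust(y, pre)
raw_diff, raw_se = diff_and_se(y, is_b)
adj_diff, adj_se = diff_and_se(y_adj, is_b)
var_reduction = 1.0 - (adj_se / raw_se) ** 2

print(f"theta = {theta:.3f}")
print(f"raw:   diff = {raw_diff:.3f}, SE = {raw_se:.3f}")
print(f"CUPED: diff = {adj_diff:.3f}, SE = {adj_se:.3f}")
print(f"variance reduction = {var_reduction:.1%}")
```

Because pre_metric is measured before assignment, the adjustment does not bias the ITT estimate; the achievable reduction is roughly the squared correlation between pre_metric and the outcome.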
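
For the noncompliance task, note that with a single binary instrument and no covariates, the 2SLS estimate equals the Wald ratio: the ITT effect on the outcome divided by the ITT effect on saw_treatment. The sketch below illustrates that ratio on synthetic data with one-sided noncompliance (an assumption; the real data may also have control users who saw the treatment). The coding of variant B as z = 1 and the name wald_cace are hypothetical. The ratio identifies the CACE only under the usual IV assumptions: random assignment, a nonzero first stage, exclusion (assignment affects the outcome only through exposure), and monotonicity (no defiers).

```python
import numpy as np

def wald_cace(y, d, z):
    """Wald / IV ratio: ITT on outcome divided by ITT on treatment receipt."""
    itt_y = y[z == 1].mean() - y[z == 0].mean()
    first_stage = d[z == 1].mean() - d[z == 0].mean()
    return itt_y / first_stage, itt_y, first_stage

# Synthetic data (illustration only).
rng = np.random.default_rng(11)
n = 20000
z = rng.integers(0, 2, size=n)            # 1 = assigned to variant B
complier = rng.random(n) < 0.7            # about 70% would see the treatment if assigned
d = z * complier                          # stand-in for saw_treatment
y = rng.normal(10.0, 5.0, size=n) + 2.0 * d   # true effect on compliers is 2.0

cace, itt_y, first_stage = wald_cace(y, d, z)
print(f"ITT on outcome = {itt_y:.3f}")
print(f"first stage (compliance rate difference) = {first_stage:.3f}")
print(f"CACE (Wald) = {cace:.3f}")
```

A first stage close to zero makes the ratio unstable, so reporting the compliance difference alongside the CACE is a basic diagnostic.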
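
For the heterogeneity task, the Benjamini–Hochberg step-up procedure can be sketched in plain Python. The p-values below are made up for illustration and do not come from any dataset; in practice there would be one p-value per country or device subgroup test.

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p-value
    # Find the largest rank k with p_(k) <= alpha * k / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max (step-up rule).
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected

# Made-up subgroup p-values (illustration only).
pvals_a = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected_a = benjamini_hochberg(pvals_a)

# Step-up behavior: the smallest p-value misses its own threshold,
# but a later rank passes, so all four are rejected.
pvals_b = [0.02, 0.03, 0.035, 0.04]
rejected_b = benjamini_hochberg(pvals_b)

print(rejected_a)
print(rejected_b)
```

The second example shows why the procedure must search for the largest passing rank rather than stop at the first failure.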
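
For the power and MDE task, a common approximation for a two-sided test on a difference in means is MDE = (z_{1-alpha/2} + z_{power}) * sd * sqrt(1/n_A + 1/n_B). The sketch below uses only the standard library. The standard deviation and sample sizes are placeholder values, not figures from the dataset, and the name mde is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mde(sd, n_a, n_b, alpha=0.05, power=0.8):
    """Minimum detectable absolute difference in means, assuming a common sd in both arms."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(1 / n_a + 1 / n_b)

# Placeholder inputs (illustration only).
mde_val = mde(sd=10.0, n_a=5000, n_b=5000)
mde_4x = mde(sd=10.0, n_a=20000, n_b=20000)

print(f"MDE with 5,000 users per arm:  {mde_val:.3f}")
print(f"MDE with 20,000 users per arm: {mde_4x:.3f}")
```

Quadrupling the sample halves the MDE, and a CUPED-adjusted standard deviation can be plugged in as sd to see how much variance reduction tightens it.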
