How do I approach Analytics & Experimentation interview questions?

Analytics & Experimentation questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master analytics & experimentation interviews.

What difficulty level is this interview question?

This is a medium difficulty Analytics & Experimentation question, commonly asked during Technical Screen rounds at Chime.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Chime during technical interviews.

Design an A/B launch amid marketing confounds

Company: Chime

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: medium

Interview Round: Technical Screen

You’re running a virtual launch (soft roll-out) of a new fitness tracker product to US+CA users from 2025-08-10 to 2025-08-24 with a 50/50 user-level split (Control vs Variant B). A coordinated marketing push (email + paid + influencers) overlaps the test week, causing contamination and uneven exposure.

Data quality quirks emerged: (a) sample ratio is 52:48, not 50:50; (b) purchase events on iOS were dropped for the first 48 hours (2025-08-10 to 2025-08-11); (c) a bug on 2025-08-18 caused an unusual spike in refunds.

Pre-period baselines: DAU in eligible geos ≈ 200k; 14-day purchase conversion = 6%; ARPU = $2.20; refund rate = 3% of revenue. Marketing platform logs include user-level ad impressions and email sends.

Design a rigorous analysis plan and decision framework that addresses the messy data and marketing confounds:

1) Randomization & exposure: What unit (user, device, geo, or hybrid) and exposure rule would you choose to minimize contamination and noncompliance? How would you handle users who see ads but never get randomized, or who cross over variants across platforms?
2) Metrics: Define a primary success metric and at least 3 guardrails (e.g., refund rate, complaint rate, latency, churn). Specify how each is computed, including windows (e.g., 14-day from first exposure) and exclusion rules.
3) Validity checks: Describe specific diagnostics for SRM, missing instrumentation, novelty effects, and day-of-week seasonality. For each, state the statistical test or threshold you’ll use and what actions you’d take if it fails.
4) Bias mitigation: Propose a concrete approach to adjust for the concurrent marketing push (e.g., geo diff-in-diff with ad intensity as a covariate, CUPED with pre-period spend or engagement, inverse propensity weighting using ad impression propensity). Justify trade-offs among these methods.
5) Power & duration: With baseline 6% conversion, 50/50 split, α=0.05 two-sided, 80% power, and 14-day conversion window, compute the minimum detectable relative lift if you can expose ≈ 2.8M eligible users over the test (assume independence and a binomial variance). Is the test adequately powered? If not, propose changes.
6) Decision under messiness: Suppose after your adjustments the estimated lift in 14-day conversion is +3.5% (95% CI: −0.5%, +7.5%), ARPU is +1.2%, and refund rate increases by +1.1pp. Would you recommend launch, guardrail-triggered rollback, or extended test? State the exact thresholds that drive your decision and how you’d communicate the trade-offs to marketing and product.
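The power question can be worked through numerically. The sketch below is one way to do it, not an official solution: it uses the standard normal approximation for a two-proportion test and, as a simplifying assumption, applies the baseline variance p(1 − p) to both arms. The function name `mde_relative` and its parameters are hypothetical, introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mde_relative(baseline, n_total, alpha=0.05, power=0.80, split=0.5):
    """Approximate minimum detectable relative lift for a two-arm test.

    Assumes independent users, binomial variance, and the baseline rate
    in both arms (a common planning simplification).
    """
    n_control = n_total * split
    n_variant = n_total * (1 - split)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    # Standard error of the difference in conversion rates
    se = sqrt(baseline * (1 - baseline) * (1 / n_control + 1 / n_variant))
    mde_absolute = (z_alpha + z_beta) * se
    return mde_absolute / baseline


# Inputs taken from the question: 6% baseline, ~2.8M users, 50/50 split
mde = mde_relative(baseline=0.06, n_total=2_800_000)
print(f"Minimum detectable relative lift: {mde:.2%}")
```

Under these assumptions the minimum detectable relative lift comes out at roughly 1.3%, i.e. about 0.08 percentage points on a 6% baseline. Dropped iOS events, exclusions, or variance inflation from reweighting would all reduce the effective sample size and raise this figure.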

Quick Answer: This question evaluates a candidate's competency in experimental design, causal inference, metric definition, statistical diagnostics, and decision-making under data-quality and marketing confounds within the Analytics & Experimentation domain.
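Among the statistical diagnostics, the sample ratio mismatch (SRM) check for the reported 52:48 split is the most mechanical. A minimal sketch follows, assuming a chi-square goodness-of-fit test with one degree of freedom. The arm counts are hypothetical: they apply the 52:48 ratio to the ≈ 2.8M users mentioned in the power question, which the source does not state explicitly. The 0.001 p-value threshold is a common convention, not something the question specifies.

```python
from math import erfc, sqrt


def srm_check(n_control, n_variant, expected_share=0.5, threshold=0.001):
    """Chi-square goodness-of-fit test (1 df) for sample ratio mismatch."""
    total = n_control + n_variant
    expected_control = total * expected_share
    expected_variant = total * (1 - expected_share)
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_variant - expected_variant) ** 2 / expected_variant)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value, p_value < threshold


# Hypothetical counts: 52% and 48% of 2.8M users
chi2, p_value, srm_flagged = srm_check(1_456_000, 1_344_000)
print(f"chi2={chi2:.1f}, p={p_value:.3g}, SRM flagged: {srm_flagged}")
```

At this sample size a 52:48 split is far outside what chance would produce, so the check flags SRM. A reasonable next step would be to investigate assignment and logging (for example, whether the dropped iOS events also affected exposure logging) before trusting any effect estimate.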
