PracHub

Design an A/B launch amid marketing confounds

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's competency in experimental design, causal inference, metric definition, statistical diagnostics, and decision-making under data-quality and marketing confounds within the Analytics & Experimentation domain.

Company: Chime

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: Medium

Interview Round: Technical Screen

Related Interview Questions

  • Decide launch with CPA-profit trade-offs by segment - Chime (Medium)
  • Design and Analyze A/B Test for Recommendation Widget - Chime (Hard)
  • Determine Key Metrics for Spend-Tracker Launch Decision - Chime (Medium)
  • Design an Effective A/B Test for Algorithm Launch - Chime (Medium)
Oct 13, 2025, 9:49 PM

You’re running a virtual launch (soft roll-out) of a new fitness tracker product to US+CA users from 2025-08-10 to 2025-08-24 with a 50/50 user-level split (Control vs Variant B). A coordinated marketing push (email + paid + influencers) overlaps the test window, causing contamination and uneven exposure. Three data-quality issues emerged: (a) the sample ratio is 52:48, not 50:50; (b) purchase events on iOS were dropped for the first 48 hours (2025-08-10 to 2025-08-11); (c) a bug on 2025-08-18 caused an unusual spike in refunds. Pre-period baselines: DAU in eligible geos ≈ 200k; 14-day purchase conversion = 6%; ARPU = $2.20; refund rate = 3% of revenue. Marketing platform logs include user-level ad impressions and email sends. Design a rigorous analysis plan and decision framework that addresses the messy data and marketing confounds:

  1. Randomization & exposure: What unit (user, device, geo, or hybrid) and exposure rule would you choose to minimize contamination and noncompliance? How would you handle users who see ads but never get randomized, or who cross over variants across platforms?
  2. Metrics: Define a primary success metric and at least 3 guardrails (e.g., refund rate, complaint rate, latency, churn). Specify how each is computed, including windows (e.g., 14-day from first exposure) and exclusion rules.
  3. Validity checks: Describe specific diagnostics for SRM, missing instrumentation, novelty effects, and day-of-week seasonality. For each, state the statistical test or threshold you’ll use and what actions you’d take if it fails.
  4. Bias mitigation: Propose a concrete approach to adjust for the concurrent marketing push (e.g., geo diff-in-diff with ad intensity as a covariate, CUPED with pre-period spend or engagement, inverse propensity weighting using ad impression propensity). Justify trade-offs among these methods.
  5. Power & duration: With baseline 6% conversion, 50/50 split, α=0.05 two-sided, 80% power, and 14-day conversion window, compute the minimum detectable relative lift if you can expose ≈ 2.8M eligible users over the test (assume independence and a binomial variance). Is the test adequately powered? If not, propose changes.
  6. Decision under messiness: Suppose after your adjustments the estimated lift in 14-day conversion is +3.5% (95% CI: −0.5%, +7.5%), ARPU is +1.2%, and refund rate increases by +1.1pp. Would you recommend launch, guardrail-triggered rollback, or extended test? State the exact thresholds that drive your decision and how you’d communicate the trade-offs to marketing and product.
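For the SRM diagnostic in item 3, one common choice is a chi-square goodness-of-fit test against the intended 50/50 allocation. The sketch below is illustrative only: the prompt gives the 52:48 ratio but not the arm counts, so the counts (52% and 48% of 2.8M) are an assumption, as are the function name `srm_check` and the p < 0.001 alert threshold, which is a widely used convention rather than something the prompt specifies.

```python
from math import erfc, sqrt


def srm_check(n_control, n_variant, expected_control_share=0.5, threshold=0.001):
    """Chi-square goodness-of-fit test (1 degree of freedom) for sample ratio mismatch.

    Returns (chi2 statistic, p-value, flag), where flag is True when the
    observed split is unlikely under the intended allocation.
    """
    total = n_control + n_variant
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_variant - expected_variant) ** 2 / expected_variant)
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value, p_value < threshold


# Assumed counts: 52% and 48% of 2.8M users (hypothetical, not given in the prompt).
chi2, p_value, srm_flag = srm_check(1_456_000, 1_344_000)
print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}, SRM flagged = {srm_flag}")

# The same 52:48 ratio on 1,000 users is within chance variation.
small_chi2, small_p, small_flag = srm_check(520, 480)
print(f"chi2 = {small_chi2:.2f}, p = {small_p:.3f}, SRM flagged = {small_flag}")
```

The contrast between the two calls shows why the ratio alone is not the diagnostic: at the scale assumed here, a 52:48 split is far outside chance and would normally block interpretation of the results until the assignment or logging cause is found.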
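The arithmetic behind item 5 can be sketched with the standard normal approximation for a two-proportion test, using the baseline rate for the variance in both arms (a simplification consistent with the prompt's "binomial variance" assumption). The function name `mde_for_conversion` is hypothetical, and the 2.8M figure is taken at face value as unique users; it equals 200k DAU × 14 days, which would count user-days rather than distinct users, so the real sample may be smaller.

```python
from math import sqrt
from statistics import NormalDist


def mde_for_conversion(p_base, n_total, alpha=0.05, power=0.80, control_share=0.5):
    """Minimum detectable effect for a two-sided, two-proportion z-test.

    Uses the baseline variance p(1 - p) in both arms, so this is an
    approximation. Returns (absolute MDE, relative MDE).
    """
    n_control = n_total * control_share
    n_variant = n_total - n_control
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    se = sqrt(p_base * (1 - p_base) * (1 / n_control + 1 / n_variant))
    mde_abs = (z_alpha + z_beta) * se
    return mde_abs, mde_abs / p_base


mde_abs, mde_rel = mde_for_conversion(p_base=0.06, n_total=2_800_000)
print(f"absolute MDE = {mde_abs * 100:.3f} pp, relative MDE = {mde_rel * 100:.2f}%")

# Sensitivity: the observed 52:48 split barely changes the result.
_, mde_rel_srm = mde_for_conversion(0.06, 2_800_000, control_share=0.52)
print(f"relative MDE at 52:48 = {mde_rel_srm * 100:.2f}%")
```

Under these assumptions the relative MDE comes out at roughly 1.3% (about 0.08 percentage points on a 6% base). That is smaller than the +3.5% lift quoted in item 6, so the unadjusted design looks adequately powered on paper; the dropped iOS events, exclusions, and any covariate adjustment would change the effective sample size and should be rerun through the same calculation.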
