What difficulty level is this interview question?

This is a hard difficulty Analytics & Experimentation question, commonly asked during Technical Screen rounds at Uber.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Uber during technical interviews.

Design a robust email A/B test | Uber Interview Question

Quick Overview

This question evaluates a data scientist's competency in experimental design, including randomization and stratification, statistical power and sample-size estimation, primary and guardrail metric definition, sequential monitoring, and data-quality diagnostics for large-scale email A/B tests.

A/B Test Design: New Email Subject Line for Weekly Campaign

You manage a weekly email campaign to 10 million users. Baseline unique click-through rate (CTR) is 3.0% and unsubscribe rate is 0.08% per send. Marketing proposes a new subject line expected to increase CTR by +6% relative (≈ +0.18 percentage points absolute lift, from 3.00% to 3.18%).

Assume a single weekly send with independent Bernoulli outcomes at the user–send level. Unless you control for it, users who do not open may receive a resend, and users may also be eligible for other campaigns in the same week.

Design the experiment end-to-end:

Randomization
- What is the randomization unit and what stratification blocks would you use (e.g., locale, device, engagement tier)?
- How do you prevent contamination from resends and cross-campaign overlap in the same week?
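
One possible way to make the randomization question concrete is a permuted assignment within each stratum, with the user as the randomization unit. The sketch below is illustrative only: the function name `stratified_assign`, the seed, and the (locale, engagement tier) strata are assumptions, not part of the original prompt.

```python
import random
from collections import defaultdict

def stratified_assign(users, seed=2024):
    """Assign users to arms, balancing arms inside every stratum.

    users: iterable of (user_id, stratum) pairs; stratum is any sortable key,
    e.g. a (locale, engagement_tier) tuple. Returns {user_id: arm}.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    by_stratum = defaultdict(list)
    for user_id, stratum in users:
        by_stratum[stratum].append(user_id)

    assignment = {}
    for stratum in sorted(by_stratum):
        ids = sorted(by_stratum[stratum])  # stable order before shuffling
        rng.shuffle(ids)
        # Alternate arms after the shuffle so counts differ by at most one.
        for position, user_id in enumerate(ids):
            assignment[user_id] = "treatment" if position % 2 == 0 else "control"
    return assignment

# Hypothetical population: 1,000 users across four strata.
users = [
    (f"u{i}", ("en-US" if i % 3 else "pt-BR", "high" if i % 2 else "low"))
    for i in range(1000)
]
assignment = stratified_assign(users)
```

Storing the assignment once and looking it up for every later touch (including resends to non-openers) keeps a user in the same arm for the whole week. Cross-campaign overlap would still need a separate rule, such as excluding test users from other sends or holding that exposure equal across arms.
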
Power
- Compute per-arm sample size for α = 0.05 (two-sided) and 80% power for detecting a +0.18 pp absolute lift on CTR from a 3.0% baseline.
- State assumptions and show the formula you would use.
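
A common formula for this kind of calculation is the two-proportion z-test sample size with unpooled variances: n per arm = (z_{1-α/2} + z_{1-β})² · [p₁(1−p₁) + p₂(1−p₂)] / (p₂ − p₁)². The sketch below evaluates it with the numbers from the prompt, assuming equal allocation, independent users, and a normal approximation; the function name `per_arm_sample_size` is illustrative.

```python
import math
from statistics import NormalDist

def per_arm_sample_size(p_control, p_treatment, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-proportion z-test (unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    variance_sum = (p_control * (1 - p_control)
                    + p_treatment * (1 - p_treatment))
    delta = p_treatment - p_control                # absolute lift to detect
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / delta ** 2)

# Baseline CTR 3.00%, target 3.18% (+0.18 pp absolute).
n_per_arm = per_arm_sample_size(0.030, 0.0318)
print(f"Required users per arm: {n_per_arm:,}")
```

Under these assumptions the requirement comes out at roughly 145,000 users per arm, a small fraction of the 10 million recipients, which leaves room for a holdout or for powering guardrail metrics with much lower base rates.
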
Metrics
- Choose a single primary success metric and at least two guardrail metrics (e.g., unsubscribe rate, spam complaints).
- Define each precisely (numerator/denominator, measurement window) and justify the choice over alternatives like open rate.
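
To illustrate what a precise numerator/denominator definition might look like, the sketch below computes a unique rate per arm as "users with at least one qualifying event inside the window" divided by "users sent in that arm". The `Recipient` record, the 7-day window, and the choice of sent users as the denominator are all assumptions made for the example, not values given in the prompt.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

WINDOW = timedelta(days=7)  # assumed measurement window

@dataclass
class Recipient:  # hypothetical per-user record for one send
    arm: str
    sent_at: datetime
    first_click_at: Optional[datetime] = None
    unsubscribed_at: Optional[datetime] = None

def in_window(sent_at, event_at):
    """True if the event happened between the send and the window's end."""
    return event_at is not None and sent_at <= event_at <= sent_at + WINDOW

def unique_rate(recipients, arm, event_field):
    """Users in `arm` with the event in the window / users sent in `arm`."""
    cohort = [r for r in recipients if r.arm == arm]
    if not cohort:
        return float("nan")
    hits = sum(in_window(r.sent_at, getattr(r, event_field)) for r in cohort)
    return hits / len(cohort)

sent = datetime(2024, 1, 1, 9, 0)
sample = [
    Recipient("treatment", sent, first_click_at=sent + timedelta(hours=2)),
    Recipient("treatment", sent, first_click_at=sent + timedelta(days=10)),  # too late
    Recipient("treatment", sent, unsubscribed_at=sent + timedelta(days=1)),
    Recipient("treatment", sent),
    Recipient("control", sent),
]
ctr = unique_rate(sample, "treatment", "first_click_at")
unsub = unique_rate(sample, "treatment", "unsubscribed_at")
```

Using sent users rather than openers as the denominator keeps the comparison between randomized groups intact, since the subject line itself changes who opens.
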
Sequential Monitoring
- Leadership wants daily peeks and the ability to stop early for harm.
- Propose a valid plan (e.g., alpha-spending or group-sequential boundaries) that controls Type I error. Specify the monitoring schedule and stopping/continuation rules.
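
One family of valid plans uses a Lan-DeMets alpha-spending function. The sketch below computes the cumulative two-sided alpha released at each of seven daily looks under the O'Brien-Fleming-type spending function α(t) = 2[1 − Φ(z_{1-α/2} / √t)], where t is the information fraction. Treating each day as one seventh of the total information is a simplifying assumption; in practice t would be the observed share of the planned sample.

```python
import math
from statistics import NormalDist

def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t in (0, 1]."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / math.sqrt(info_fraction)))

# Assumed schedule: one look per day for seven days, equal information per day.
looks = [day / 7 for day in range(1, 8)]
cumulative = [obf_alpha_spent(t) for t in looks]
incremental = [cumulative[0]] + [
    later - earlier for earlier, later in zip(cumulative, cumulative[1:])
]

for day, (cum, inc) in enumerate(zip(cumulative, incremental), start=1):
    print(f"day {day}: cumulative alpha {cum:.6f}, spent at this look {inc:.6f}")
```

This spends almost no alpha on the earliest looks and reaches the full 0.05 only at the final one. Turning the spent alpha into per-look z boundaries requires numerical integration over the joint distribution of the interim statistics, which is usually left to dedicated group-sequential software. A separate one-sided harm boundary on the guardrails could be built the same way.
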
Mid-Experiment Checks (at ~48 hours)
- What diagnostics would you run to detect randomization failure, instrumentation delays, or traffic mix shifts (e.g., weekend effects)?
- How would you correct issues without biasing estimates?
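
A standard first diagnostic for randomization failure is a sample ratio mismatch (SRM) check: a chi-square goodness-of-fit test of the observed arm counts against the planned split. The sketch below assumes two arms, so the statistic has one degree of freedom and its p-value can be taken from the standard normal distribution. The function name and example counts are hypothetical.

```python
import math
from statistics import NormalDist

def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """P-value of a 1-df chi-square test of arm counts vs the planned split."""
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = ((n_treatment - expected_treatment) ** 2 / expected_treatment
            + (n_control - expected_control) ** 2 / expected_control)
    # A chi-square variable with 1 df is the square of a standard normal.
    return 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))

balanced = srm_p_value(5_000_000, 5_000_000)
skewed = srm_p_value(5_000_000, 5_012_000)  # hypothetical delivery imbalance
print(f"balanced p = {balanced:.4f}, skewed p = {skewed:.6f}")
```

A very small p-value (teams often use a strict threshold such as 0.001) suggests the assignment or logging pipeline is broken, in which case the effect estimate should not be trusted until the cause is found. Running the same check per stratum and per send day can help separate randomization failure from logging delays or traffic mix shifts.
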
Results Handling
- If an interim look shows negative CTR lift but higher opens, enumerate at least three plausible causes and the next decision (continue, stop-for-harm, or redesign).
- Explain how you would handle intention-to-treat vs per-protocol and what you would report to stakeholders.
