Answer all parts with formulas, numeric results, and assumptions:
A) Sample size: Baseline conversion p0=0.045, target MDE=+7% relative (p1=0.045*1.07), two-sided alpha=0.05, power=0.90. Compute per-variant sample size for a standard two-proportion z-test. Show the z-scores used and the pooled variance assumption.
B) Duration: With 1.2M daily visitors, 60/40 traffic split (A/B), and 80% eligibility, how many calendar days are required to reach the sample size from (A)? State any adjustments for repeat visitors and overlap with other experiments.
C) Variance reduction: If a pre-experiment covariate has R^2=0.20 with the outcome, quantify the effective MDE or sample-size reduction when using CUPED. Explain when CUPED increases bias (e.g., covariate shift).
D) Sequential testing: You plan daily peeks for 21 days. Propose an alpha-spending or group-sequential design (e.g., Pocock or O’Brien-Fleming). Specify spending function and the final critical z. Explain pros/cons vs always-valid sequential methods (SPRT/e-values).
E) Interference and clustering: When randomizing by user causes cross-unit spillovers, propose a cluster design (e.g., geo or traffic-bucket). Compute design effect for ICC=0.02 with average cluster size m=5 and m=50. How does this change the sample size?
F) SRM check: On day 3 you observe 110,000 users in A and 90,000 in B (expected 60/40 from eligible 200,000). Perform a chi-square goodness-of-fit test and report the p-value. What actions do you take if SRM is significant?
G) Causal inference: The team ran an observational study with a strong pre-period trend. Sketch a DAG, choose an identification strategy (DID, IV, or RDD), list required assumptions (e.g., exclusion restriction for IV; continuity for RDD), and propose concrete robustness checks (placebo tests, pre-trend tests, sensitivity to unobserved confounding).

This question evaluates a data scientist's competency in experimental design, sample-size and power calculations, variance-reduction methods (e.g., CUPED), sequential testing and alpha spending, clustering and interference effects, SRM checks, and causal identification strategies such as DID, IV, and RDD within the Statistics & Math domain.

What difficulty level is this interview question?

This is a hard-difficulty Statistics & Math question, commonly asked during Onsite rounds at Pinterest.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Pinterest during technical interviews.

Design rigorous A/B test and causal analysis | Pinterest Interview Question

Experiment Design and Causal Inference: Multi-part Problem

Context: You are designing a high-traffic web A/B test on a binary conversion metric. Answer each part with formulas, numeric results, and clearly stated assumptions.

A) Sample size

Baseline conversion p0 = 0.045
Target MDE = +7% relative, so p1 = 0.045 × 1.07 = 0.04815 (absolute difference 0.00315)
Two-sided alpha = 0.05, power = 0.90
Compute the per-variant sample size for a standard two-proportion z-test using the pooled variance planning assumption. Show the z-scores used and the variance terms.
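
One way to set up the calculation in part A is sketched below. It assumes equal allocation between variants and the common planning formula that uses the pooled variance under the null and the unpooled variance under the alternative; the function name `two_proportion_n` is illustrative, not part of the question.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_proportion_n(p0, p1, alpha=0.05, power=0.90):
    """Per-variant n for a two-sided two-proportion z-test (equal allocation assumed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value, about 1.960
    z_beta = NormalDist().inv_cdf(power)           # power quantile, about 1.282
    p_bar = (p0 + p1) / 2                          # pooled rate under equal allocation
    pooled_var = 2 * p_bar * (1 - p_bar)           # variance term under the null
    unpooled_var = p0 * (1 - p0) + p1 * (1 - p1)   # variance term under the alternative
    delta = p1 - p0                                # absolute MDE
    n = (z_alpha * sqrt(pooled_var) + z_beta * sqrt(unpooled_var)) ** 2 / delta ** 2
    return ceil(n), z_alpha, z_beta


n, z_a, z_b = two_proportion_n(0.045, 0.045 * 1.07)
print(n, round(z_a, 3), round(z_b, 3))
```

Under these assumptions the result comes out at roughly 94,000 users per variant. The simpler all-pooled form, n = 2 p̄(1 − p̄)(z_alpha + z_beta)^2 / delta^2, gives nearly the same number because p0 and p1 are close.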

B) Duration

Daily visitors = 1.2M
Traffic split = 60/40 (A/B)
Eligibility = 80%
Using the sample size from (A), compute calendar days needed. State any adjustments for repeat visitors and overlap with other experiments.
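
The duration arithmetic in part B can be sketched as follows. The per-variant requirement of about 94,000 is an assumed input carried over from part A, and `unique_fraction` is a hypothetical knob for discounting repeat visitors; neither is given in the question. The sketch takes the conservative view that the smaller (40%) arm must reach the full per-variant target.

```python
from math import ceil


def days_to_reach(n_per_variant, daily_visitors=1_200_000, eligibility=0.80,
                  split=(0.60, 0.40), unique_fraction=1.0):
    """Calendar days until the smallest arm has n_per_variant unique users."""
    # unique_fraction < 1 discounts repeat visitors (hypothetical adjustment).
    daily_eligible = daily_visitors * eligibility * unique_fraction
    smallest_arm_daily = daily_eligible * min(split)  # the 40% arm is the bottleneck
    days = ceil(n_per_variant / smallest_arm_daily)
    return days, daily_eligible, smallest_arm_daily


N_PER_VARIANT = 94_000  # assumed, approximately the part A result
days, daily_eligible, small_arm = days_to_reach(N_PER_VARIANT)
print(days, daily_eligible, small_arm)
```

With 960,000 eligible visitors per day and 384,000 of them in the smaller arm, the statistical requirement is met within the first day. In practice the run length is then driven by other considerations, such as covering at least one or two full weekly cycles and any traffic held back by overlapping experiments.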

C) Variance reduction (CUPED)

A pre-experiment covariate has R^2 = 0.20 with the outcome.
Quantify the effective MDE reduction (or equivalently, sample-size reduction) with CUPED. Explain when CUPED can increase bias (e.g., covariate shift).
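
For part C, the standard CUPED result is that the adjusted metric has variance (1 − R^2) times the original. A minimal sketch of what that implies for sample size and MDE, with the helper name `cuped_gain` chosen only for illustration:

```python
from math import sqrt


def cuped_gain(r_squared):
    """Variance and MDE multipliers after CUPED with a covariate of given R^2."""
    var_ratio = 1 - r_squared   # adjusted variance / original variance
    mde_ratio = sqrt(var_ratio)  # MDE scales with the standard error at fixed n
    return var_ratio, mde_ratio


var_ratio, mde_ratio = cuped_gain(0.20)
print(f"sample size multiplier: {var_ratio:.2f}")
print(f"MDE multiplier at fixed n: {mde_ratio:.4f}")
print(f"relative MDE at fixed n: {7 * mde_ratio:.2f}% instead of 7%")
```

So R^2 = 0.20 cuts the required sample size by 20% at a fixed MDE, or shrinks the detectable effect by about 10.6% at a fixed sample size. These gains assume the covariate is measured strictly before assignment and is unaffected by treatment.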

D) Sequential testing

You plan daily peeks for 21 days.
Propose an alpha-spending or group-sequential design (e.g., Pocock or O’Brien–Fleming). Specify the spending function and the final critical z. Briefly compare to always-valid sequential methods (SPRT/e-values).
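
As a sketch for part D, the block below tabulates cumulative alpha under the Lan-DeMets O'Brien-Fleming-type spending function, alpha(t) = 2 − 2Φ(z_{alpha/2} / sqrt(t)), and under the Pocock-type function, alpha(t) = alpha · ln(1 + (e − 1)t), for 21 equally spaced daily looks. Equal information per day is an assumption. Turning the spent alpha into a critical z at each look requires recursive numerical integration over the joint distribution of the interim statistics, which this sketch does not attempt.

```python
from math import e, erfc, log, sqrt
from statistics import NormalDist


def obf_spending(t, alpha=0.05):
    """O'Brien-Fleming-type cumulative two-sided alpha at information fraction t."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return erfc(z / sqrt(2 * t))  # same as 2 * (1 - Phi(z / sqrt(t))), stable in the tail


def pocock_spending(t, alpha=0.05):
    """Pocock-type cumulative two-sided alpha at information fraction t."""
    return alpha * log(1 + (e - 1) * t)


LOOKS = 21  # one look per day, assumed equally informative
schedule = []
previous = 0.0
for k in range(1, LOOKS + 1):
    t = k / LOOKS
    cumulative = obf_spending(t)
    schedule.append((k, t, cumulative, cumulative - previous))
    previous = cumulative

for k, t, cumulative, increment in schedule[::5] + [schedule[-1]]:
    print(f"day {k:2d}  t={t:.3f}  cumulative={cumulative:.3e}  increment={increment:.3e}")
```

The O'Brien-Fleming-type function spends almost nothing in the first days and nearly all of the 0.05 near the end, so the final critical z ends up only somewhat above 1.96. The Pocock-type function spends alpha much more evenly, which makes early stopping easier at the cost of a noticeably stricter final threshold.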

E) Interference and clustering

Cross-unit spillovers exist when randomizing by user.
Propose a clustered design (e.g., geo or traffic-bucket). Compute the design effect for ICC = 0.02 with average cluster size m = 5 and m = 50, and show how it changes the sample size.
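
The design effect in part E follows the usual formula DEFF = 1 + (m − 1) · ICC, which assumes roughly equal cluster sizes. A short sketch, where the independent-sampling requirement of about 94,000 per variant is an assumed input from part A:

```python
from math import ceil


def design_effect(icc, m):
    """Variance inflation from cluster randomization with average cluster size m."""
    return 1 + (m - 1) * icc


def clustered_requirements(n_independent, icc, m):
    """Inflated per-variant user count and the number of clusters it implies."""
    deff = design_effect(icc, m)
    n_clustered = ceil(n_independent * deff)
    clusters = ceil(n_clustered / m)
    return deff, n_clustered, clusters


N_INDEPENDENT = 94_000  # assumed, approximately the part A result
deff_5, n_5, clusters_5 = clustered_requirements(N_INDEPENDENT, icc=0.02, m=5)
deff_50, n_50, clusters_50 = clustered_requirements(N_INDEPENDENT, icc=0.02, m=50)
print(deff_5, n_5, clusters_5)
print(deff_50, n_50, clusters_50)
```

With ICC = 0.02 the design effect is 1.08 for m = 5 and 1.98 for m = 50, so the per-variant user count grows by 8% and 98% respectively. Larger clusters contain spillovers better but nearly double the required sample here.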

F) SRM check

Day 3 observed: A = 110,000 users, B = 90,000 users.
Expected from eligible 200,000 with 60/40 split: A = 120,000, B = 80,000.
Perform a chi-square goodness-of-fit test and report the p-value. What actions do you take if SRM is significant?
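
The SRM test in part F is a one-degree-of-freedom chi-square goodness-of-fit test. The sketch below uses only the standard library: for 1 degree of freedom the upper-tail p-value equals erfc(sqrt(x / 2)), and because that underflows for a statistic this large, a log10 value from the asymptotic tail approximation erfc(u) ≈ exp(−u^2) / (u · sqrt(pi)) is reported as well. The helper names are illustrative.

```python
from math import erfc, log, pi, sqrt


def srm_chi_square(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic against the planned split."""
    total = sum(observed)
    expected = [total * r for r in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


def chi2_sf_df1(x):
    """Exact upper-tail p-value for chi-square with 1 degree of freedom."""
    return erfc(sqrt(x / 2))


def log10_chi2_sf_df1_tail(x):
    """Approximate log10 p-value for large x (asymptotic erfc expansion)."""
    u = sqrt(x / 2)
    return (-u * u - log(u * sqrt(pi))) / log(10)


chi2, expected = srm_chi_square([110_000, 90_000], [0.60, 0.40])
p_value = chi2_sf_df1(chi2)
log10_p = log10_chi2_sf_df1_tail(chi2)
print(f"chi2 = {chi2:.2f}, p = {p_value}, log10(p) approx {log10_p:.0f}")
```

The statistic is (10,000^2 / 120,000) + (10,000^2 / 80,000) ≈ 2083.3, and the p-value is far below any conventional SRM threshold, so the mismatch is not plausibly due to chance. The usual response is to stop interpreting treatment effects and investigate assignment, logging, and filtering before trusting any result.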

G) Causal inference (observational)

The team previously ran an observational study with a strong pre-period trend.
Sketch a DAG, choose an identification strategy (DID, IV, or RDD), list required assumptions, and propose concrete robustness checks (placebo tests, pre-trend tests, sensitivity analyses).
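
If DID is the chosen strategy for part G, the estimator and two of the robustness checks can be sketched as below. All numbers are made-up toy values for illustration only, not data from the study, and the function names are hypothetical. The key identifying assumption is parallel trends: absent treatment, the treated and control groups would have moved by the same amount.

```python
def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """2x2 difference-in-differences on group means."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)


def ols_slope(ys):
    """Least-squares slope of ys against period index 0..n-1."""
    n = len(ys)
    x_bar = (n - 1) / 2
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den


# Toy conversion rates per period (hypothetical): both groups trend upward
# before treatment, which is the "strong pre-period trend" in the prompt.
treated_pre = [0.040, 0.042, 0.044, 0.046]
control_pre = [0.038, 0.040, 0.042, 0.044]
treated_post = 0.052
control_post = 0.046

# Pre-trend test: the two pre-period slopes should be close to each other.
pre_trend_gap = ols_slope(treated_pre) - ols_slope(control_pre)

# Placebo test: pretend treatment started one period early; expect about zero.
placebo = did_estimate(treated_pre[-2], treated_pre[-1],
                       control_pre[-2], control_pre[-1])

# Actual DID estimate using the last pre-period as the baseline.
effect = did_estimate(treated_pre[-1], treated_post,
                      control_pre[-1], control_post)
print(pre_trend_gap, placebo, effect)
```

In this toy setup a naive before/after comparison on the treated group would attribute the whole upward trend to treatment, while DID removes the shared trend. A real analysis would add standard errors clustered at the unit level and a sensitivity analysis for violations of parallel trends.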
