
Design rigorous A/B test and causal analysis

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in experimental design, sample-size and power calculations, variance-reduction methods (e.g., CUPED), sequential testing and alpha spending, clustering and interference effects, SRM checks, and causal identification strategies such as DID, IV, and RDD within the Statistics & Math domain.


Company: Pinterest

Role: Data Scientist

Category: Statistics & Math

Difficulty: hard

Interview Round: Onsite

Oct 13, 2025, 9:49 PM

Experiment Design and Causal Inference: Multi-part Problem

Context: You are designing a high-traffic web A/B test on a binary conversion metric. Answer each part with formulas, numeric results, and clearly stated assumptions.

A) Sample size

  • Baseline conversion p0 = 0.045
  • Target MDE = +7% relative, so p1 = 0.045 × 1.07 = 0.04815 (absolute lift of 0.00315)
  • Two-sided alpha = 0.05, power = 0.90
  • Compute the per-variant sample size for a standard two-proportion z-test using the pooled variance planning assumption. Show the z-scores used and the variance terms.
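One way to work Part A numerically is sketched below. It assumes equal allocation and the common planning formula n = [z(1-alpha/2) * sqrt(2 * p_bar * (1 - p_bar)) + z(power) * sqrt(p0(1-p0) + p1(1-p1))]^2 / (p1 - p0)^2, where p_bar = (p0 + p1) / 2 is the pooled rate under the null. The function name and its defaults are illustrative, not taken from the question.

```python
from math import sqrt, ceil
from statistics import NormalDist


def two_proportion_sample_size(p0, p1, alpha=0.05, power=0.90):
    """Per-variant n for a two-sided two-proportion z-test, equal allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.960 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 1.282 for power = 0.90
    p_bar = (p0 + p1) / 2                          # pooled rate used under H0
    sd_null = sqrt(2 * p_bar * (1 - p_bar))        # pooled variance term (H0)
    sd_alt = sqrt(p0 * (1 - p0) + p1 * (1 - p1))   # unpooled variance term (H1)
    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p1 - p0)) ** 2
    return ceil(n)


p0 = 0.045
p1 = p0 * 1.07  # 0.04815
n_per_variant = two_proportion_sample_size(p0, p1)
print(n_per_variant)  # on the order of 94,000 users per variant
```

Because the two rates are so close, the pooled and unpooled variance terms are nearly identical here, so the simpler formula 2 * p_bar * (1 - p_bar) * (z_alpha + z_beta)^2 / delta^2 gives almost the same answer.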

B) Duration

  • Daily visitors = 1.2M
  • Traffic split = 60/40 (A/B)
  • Eligibility = 80%
  • Using the sample size from (A), compute calendar days needed. State any adjustments for repeat visitors and overlap with other experiments.
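A rough sketch of the Part B arithmetic follows. It assumes the per-variant requirement from Part A is about 94,000, that the smaller (40%) arm is the binding constraint, and that every eligible visitor is a new unique user unless `unique_fraction` says otherwise. The `unique_fraction` parameter is a hypothetical knob for repeat visitors, not a figure given in the question.

```python
from math import ceil


def days_to_reach(n_per_variant, daily_visitors=1_200_000, eligibility=0.80,
                  split=(0.60, 0.40), unique_fraction=1.0):
    """Calendar days until the smallest arm reaches n_per_variant users."""
    # New unique eligible users per day (unique_fraction discounts repeat visitors).
    eligible_per_day = daily_visitors * eligibility * unique_fraction
    # The arm with the smallest traffic share fills up last.
    smallest_arm_per_day = eligible_per_day * min(split)
    return ceil(n_per_variant / smallest_arm_per_day)


n_required = 94_000  # assumed: approximate per-variant result from Part A
print(days_to_reach(n_required))                       # 1
print(days_to_reach(n_required, unique_fraction=0.1))  # 3
```

With 960,000 eligible visitors per day, the 40% arm receives about 384,000 users daily, so the statistical requirement is met within the first day. In practice an answer would usually still argue for running at least one or two full weeks to cover weekly cycles, and would shrink the available traffic further if other experiments exclude overlapping users.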

C) Variance reduction (CUPED)

  • A pre-experiment covariate has R^2 = 0.20 with the outcome.
  • Quantify the effective MDE reduction (or equivalently, sample-size reduction) with CUPED. Explain when CUPED can increase bias (e.g., covariate shift).
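The CUPED arithmetic for Part C can be checked in a few lines. The sketch assumes the standard result that adjusting for a pre-period covariate multiplies the outcome variance by (1 - R^2), so the required sample size scales by the same factor and the detectable effect scales by its square root.

```python
from math import sqrt

r_squared = 0.20                    # covariate/outcome R^2 given in the question
variance_factor = 1 - r_squared     # variance (and required n) multiplier: 0.80
mde_factor = sqrt(variance_factor)  # MDE multiplier at fixed n: about 0.894

print(f"sample size needed: {variance_factor:.0%} of the unadjusted n")
print(f"MDE at fixed n:     {mde_factor:.1%} of the unadjusted MDE")
```

So R^2 = 0.20 buys a 20% smaller sample, or equivalently about a 10.6% smaller MDE at the same sample size. The bias caveat in the question applies when the covariate is not strictly pre-treatment or is measured differently across arms, since the adjustment then absorbs part of the treatment effect.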

D) Sequential testing

  • You plan daily peeks for 21 days.
  • Propose an alpha-spending or group-sequential design (e.g., Pocock or O’Brien–Fleming). Specify the spending function and the final critical z. Briefly compare to always-valid sequential methods (SPRT/e-values).

E) Interference and clustering

  • Cross-unit spillovers exist when randomizing by user.
  • Propose a clustered design (e.g., geo or traffic-bucket). Compute the design effect for ICC = 0.02 with average cluster size m = 5 and m = 50, and show how it changes the sample size.
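The design-effect calculation for Part E is short enough to verify directly. The sketch uses the usual formula DEFF = 1 + (m - 1) * ICC for equal-sized clusters and, as an assumption, applies it to a per-variant requirement of roughly 94,000 users carried over from Part A.

```python
from math import ceil


def design_effect(cluster_size, icc):
    """Variance inflation for cluster randomization with equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


ICC = 0.02
n_individual = 94_000  # assumed: approximate per-variant n from Part A

for m in (5, 50):
    deff = design_effect(m, ICC)
    n_users = ceil(n_individual * deff)  # users needed per variant
    n_clusters = ceil(n_users / m)       # clusters needed per variant
    print(f"m={m:2d}: DEFF={deff:.2f}, users={n_users:,}, clusters={n_clusters:,}")
```

With m = 5 the design effect is 1.08, an 8% increase in users; with m = 50 it is 1.98, nearly doubling the requirement. Unequal cluster sizes inflate the variance further, so these figures are a lower bound under that assumption.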

F) SRM check

  • Day 3 observed: A = 110,000 users, B = 90,000 users.
  • Expected from eligible 200,000 with 60/40 split: A = 120,000, B = 80,000.
  • Perform a chi-square goodness-of-fit test and report the p-value. What actions do you take if SRM is significant?
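The SRM test in Part F can be reproduced with the standard library alone. The sketch relies on the identity that, for one degree of freedom, the chi-square upper-tail probability equals erfc(sqrt(chi2 / 2)); variable names are illustrative.

```python
from math import erfc, sqrt

observed = {"A": 110_000, "B": 90_000}
shares = {"A": 0.60, "B": 0.40}  # intended traffic split

total = sum(observed.values())
expected = {arm: total * share for arm, share in shares.items()}

# Pearson goodness-of-fit statistic with (number of arms - 1) = 1 degree of freedom.
chi2 = sum((observed[arm] - expected[arm]) ** 2 / expected[arm] for arm in observed)

# Upper-tail probability of a chi-square(1) variable.
p_value = erfc(sqrt(chi2 / 2))

print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

The statistic is about 2083.3, so the p-value is far below any conventional SRM threshold (it underflows to 0.0 in double precision). A result like this points to a bug in assignment, logging, or eligibility filtering, and the experiment's outcome metrics should not be trusted until the cause is found.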

G) Causal inference (observational)

  • The team previously ran an observational study with a strong pre-period trend.
  • Sketch a DAG, choose an identification strategy (DID, IV, or RDD), list required assumptions, and propose concrete robustness checks (placebo tests, pre-trend tests, sensitivity analyses).
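If difference-in-differences is chosen for Part G, the estimator and a simple placebo pre-trend check can be sketched as below. All rates are hypothetical numbers invented for illustration, not results from any study, and the `did` helper is an assumed name. The sketch only shows the mechanics; it does not address inference or clustering of standard errors.

```python
# Hypothetical group-by-period conversion rates (illustrative only).
rates = {
    ("treated", "pre_1"): 0.038,
    ("treated", "pre_2"): 0.040,
    ("treated", "post"): 0.050,
    ("control", "pre_1"): 0.040,
    ("control", "pre_2"): 0.042,
    ("control", "post"): 0.046,
}


def did(rates, before, after):
    """Change in the treated group minus change in the control group."""
    treated_change = rates[("treated", after)] - rates[("treated", before)]
    control_change = rates[("control", after)] - rates[("control", before)]
    return treated_change - control_change


effect = did(rates, "pre_2", "post")    # DID estimate of the treatment effect
placebo = did(rates, "pre_1", "pre_2")  # should be near zero under parallel trends

print(f"DID estimate: {effect:+.4f}")
print(f"Placebo (pre-period) DID: {placebo:+.4f}")
```

In this toy data both groups trend upward by the same amount before treatment, so the placebo estimate is zero and the parallel-trends assumption looks plausible. A strong pre-period trend that differs between groups would show up as a non-zero placebo estimate, which would argue for adding group-specific trends or choosing a different identification strategy.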
