Evaluate a new product with experimentation

Last updated: Mar 29, 2026

Quick Overview

This question evaluates experimental design and causal inference skills, including defining an Overall Evaluation Criterion (OEC) and guardrail metrics, selecting an appropriate test design and ramp/power strategy, specifying quasi-experimental fallbacks, and performing metric diagnostics for a recommendation module in a commerce app.

Company: Stripe

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: Technical Screen

A new recommendation module may cause cross‑user interference and traffic seasonality. Design an evaluation plan. (1) Define an Overall Evaluation Criterion (OEC) for a commerce app and 3 guardrails (e.g., churn, latency p95, complaint rate) with precise formulas and units. (2) Choose a test design (user‑level RCT, geo‑cluster, or time‑based switchback) and justify against interference, non‑stationarity, and operational constraints. (3) Describe ramp strategy and pre‑registration: stopping rules, power target, variance reduction (CUPED/covariate adjustment), and small‑area risk controls. (4) If randomization is infeasible, propose a quasi‑experimental fallback (synthetic control or difference‑in‑differences) and list the assumptions and falsification tests you will run. (5) Suppose mid‑test the OEC flatlines while add‑to‑cart rises and conversion falls; provide a metric‑debugging checklist and the exact cuts you will request to localize the issue (e.g., by device, geography, new vs returning, latency buckets). Be specific and write the equations where relevant.
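
As a starting point for the equations requested in part (5), one possible funnel decomposition is sketched below. It assumes the OEC is revenue per user (RPU), which the question does not specify, and the symbols a (add-to-cart rate), c (cart-to-order rate), CVR (session conversion), and AOV (average order value) are introduced here for illustration only.

```latex
\begin{align*}
a &= \frac{\text{sessions with an add-to-cart}}{\text{sessions}}, \qquad
c = \frac{\text{orders}}{\text{sessions with an add-to-cart}}, \\
\text{CVR} &= \frac{\text{orders}}{\text{sessions}} = a \cdot c, \\
\text{RPU} &= \frac{\text{revenue}}{\text{users}}
  = \frac{\text{sessions}}{\text{users}} \cdot a \cdot c \cdot \text{AOV},
  \qquad \text{AOV} = \frac{\text{revenue}}{\text{orders}}, \\
\Delta \log \text{RPU} &= \Delta \log \frac{\text{sessions}}{\text{users}}
  + \Delta \log a + \Delta \log c + \Delta \log \text{AOV}.
\end{align*}
```

Under this reading, if a rises while CVR falls, then c must fall proportionally more than a rises, and a flat RPU implies that sessions per user or AOV is offsetting the drop. Computing each log term within a cut (device, geography, new vs. returning, latency bucket) shows which stage and which segment carries the change.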

Oct 13, 2025, 9:49 PM
Evaluation Plan for a New Recommendation Module in a Commerce App

Background

You are asked to evaluate a new recommendation module for a commerce app. The module may exhibit cross-user interference (users influence each other via shared popularity signals, inventory pressure, or model feedback loops) and outcomes can be affected by traffic seasonality and non-stationarity.

Tasks

  1. Define an Overall Evaluation Criterion (OEC) and three guardrail metrics with precise formulas, units, and measurement windows. Example guardrails include churn, latency p95, and complaint rate.
  2. Choose one test design (user-level RCT, geo-clustered RCT, or time-based switchback). Justify your choice with respect to interference, non-stationarity/seasonality, and operational constraints. State any design-specific controls you will use (e.g., model isolation, warmups).
  3. Describe the ramp strategy and pre-registration plan: stopping rules, power target and MDE, variance reduction (e.g., CUPED/covariate adjustment), and small-area risk controls.
  4. If randomization is infeasible, propose a quasi-experimental fallback (synthetic control or difference-in-differences). List the necessary assumptions and the falsification/placebo tests you will run.
  5. Mid-test, suppose the OEC flatlines while add-to-cart rises and conversion falls. Provide a metric-debugging checklist and the exact diagnostic cuts you will request (e.g., by device, geography, new vs. returning, latency buckets). Include relevant equations to localize the issue.
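
Task 3 mentions CUPED as a variance-reduction option. A minimal sketch of the adjustment is shown below, assuming a single pre-period covariate (here, simulated pre-experiment spend) and a pooled coefficient theta estimated across both arms. The function name `cuped_adjust`, the simulated data, and the effect size are all hypothetical and only illustrate the mechanics, not a recommended analysis pipeline.

```python
import numpy as np


def cuped_adjust(y, x):
    """Return (adjusted outcomes, theta) for y_adj = y - theta * (x - mean(x)).

    theta = cov(x, y) / var(x), estimated on the pooled sample.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    theta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean()), theta


# Simulated data (assumption: pre-period spend predicts in-experiment spend).
rng = np.random.default_rng(0)
n = 10_000
pre = rng.gamma(shape=2.0, scale=10.0, size=n)   # pre-experiment covariate
treat = rng.integers(0, 2, size=n)               # 0 = control, 1 = treatment
post = 0.8 * pre + rng.normal(0.0, 5.0, size=n) + 0.5 * treat

adj, theta = cuped_adjust(post, pre)

# Difference in means before and after adjustment.
raw_effect = post[treat == 1].mean() - post[treat == 0].mean()
cuped_effect = adj[treat == 1].mean() - adj[treat == 0].mean()

# Fraction of outcome variance remaining after adjustment (lower is better).
var_ratio = adj.var(ddof=1) / post.var(ddof=1)

print(f"theta={theta:.3f} raw={raw_effect:.3f} cuped={cuped_effect:.3f} "
      f"variance ratio={var_ratio:.3f}")
```

Because the covariate is measured before assignment, subtracting its centered value leaves the overall mean unchanged and removes only the variance it explains. With a geo-cluster or switchback design, the same adjustment would be applied at the level of the randomization unit (cluster or time block) rather than the user.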
