What difficulty level is this interview question?

This is a hard-difficulty Analytics & Experimentation question, commonly asked during Technical Screen rounds at Stripe.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Stripe during technical interviews.

Evaluate a new product with experimentation

Quick Overview

This question evaluates experimental design and causal inference skills, including defining an Overall Evaluation Criterion (OEC) and guardrail metrics, selecting an appropriate test design and ramp/power strategy, specifying quasi-experimental fallbacks, and performing metric diagnostics for a recommendation module in a commerce app.

Evaluation Plan for a New Recommendation Module in a Commerce App

Background

You are asked to evaluate a new recommendation module for a commerce app. The module may exhibit cross-user interference (users influence each other via shared popularity signals, inventory pressure, or model feedback loops) and outcomes can be affected by traffic seasonality and non-stationarity.

Tasks

Define an Overall Evaluation Criterion (OEC) and three guardrail metrics with precise formulas, units, and measurement windows. Example guardrails include churn, latency p95, and complaint rate.
Choose one test design (user-level RCT, geo-clustered RCT, or time-based switchback). Justify your choice with respect to interference, non-stationarity/seasonality, and operational constraints. State any design-specific controls you will use (e.g., model isolation, warmups).
Describe the ramp strategy and pre-registration plan: stopping rules, power target and MDE, variance reduction (e.g., CUPED/covariate adjustment), and small-area risk controls.
If randomization is infeasible, propose a quasi-experimental fallback (synthetic control or difference-in-differences). List the necessary assumptions and the falsification/placebo tests you will run.
Mid-test, suppose the OEC flatlines while add-to-cart rises and conversion falls. Provide a metric-debugging checklist and the exact diagnostic cuts you will request (e.g., by device, geography, new vs. returning, latency buckets). Include relevant equations to localize the issue.
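
As one illustration of the kind of equation the last task asks for, the sketch below assumes (hypothetically, since the question leaves the OEC open) that the OEC is revenue per user and that it factors as sessions per user × add-to-carts per session × orders per add-to-cart × revenue per order. Under that assumption the log-ratio of each factor between treatment and control sums exactly to the log-ratio of the OEC, so a flat OEC with rising add-to-cart can be traced to the factor that offsets it. The function names and all counts are made up for illustration.

```python
import math


def funnel_factors(users, sessions, add_to_carts, orders, revenue):
    """Split revenue per user into multiplicative funnel factors.

    The product of the four factors equals revenue / users.
    """
    return {
        "sessions_per_user": sessions / users,
        "atc_per_session": add_to_carts / sessions,
        "orders_per_atc": orders / add_to_carts,
        "revenue_per_order": revenue / orders,
    }


def log_contributions(control, treatment):
    """Log-ratio (treatment vs. control) of each funnel factor.

    Because the factors multiply to revenue per user, these
    log-ratios sum to log(RPU_treatment / RPU_control).
    """
    fc = funnel_factors(**control)
    ft = funnel_factors(**treatment)
    return {name: math.log(ft[name] / fc[name]) for name in fc}


# Hypothetical aggregate counts for one diagnostic cut (e.g. one device type).
control = dict(users=10_000, sessions=30_000, add_to_carts=6_000,
               orders=1_800, revenue=90_000.0)
treatment = dict(users=10_000, sessions=30_000, add_to_carts=7_200,
                 orders=1_800, revenue=90_000.0)

contrib = log_contributions(control, treatment)
total = sum(contrib.values())

for name, value in contrib.items():
    print(f"{name:>20}: {value:+.4f}")
print(f"{'log RPU ratio':>20}: {total:+.4f}")
```

Running the same decomposition within each requested cut (device, geography, new vs. returning, latency bucket) would show where the add-to-cart gain is being cancelled by a lower order rate per add-to-cart. This is a descriptive aid only; it does not replace the pre-registered analysis.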
