PracHub

Run a clean A/B test for autocomplete

Last updated: Mar 29, 2026

Quick Overview

This question evaluates experimental-design and causal-inference competency for online A/B testing of ML-ranked autocomplete, covering metric formulation, variance-reduction strategies, sample-size computation, and operational validity considerations.

  • hard
  • Etsy
  • Analytics & Experimentation
  • Data Scientist

Company: Etsy

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: Technical Screen

Plan an online controlled experiment to measure the impact of ML-ranked autocomplete on user search satisfaction. Define the unit of randomization (and why), bucketing, exposure rules for typeahead across devices/sessions to prevent contamination, and a traffic ramp plan. Choose primary and guardrail metrics, specify exact formulas (e.g., session-level query success rate, time-to-first-click, p99 latency, error rate), and include CUPED or variance-reduction details. Compute the required sample size to detect a 1.0 percentage-point absolute lift from a 60.0% baseline at alpha=0.05 and power=0.80, and justify sequential monitoring without inflating Type I error. Describe how you will handle novelty and carryover effects, bots, missing logs, seasonality, and heterogeneous treatment effects (locale/device), plus a falsification check and backtest plan.

Design an online controlled experiment for ML-ranked autocomplete and search satisfaction

You are planning an A/B test to measure the impact of an ML-ranked autocomplete (typeahead) system on user search satisfaction. Provide a complete experimental plan that covers the following:

Design and Assignment

  • Define the unit of randomization and explain why it is appropriate.
  • Describe bucketing (hashing, number of buckets, persistence) and identity resolution.
  • Specify exposure rules across devices/sessions to prevent contamination (e.g., cross-device consistency, cache key isolation, model snapshotting) and eligibility criteria.
  • Propose a traffic ramp plan (phasing, gating criteria, rollback conditions).
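One way to make bucketing deterministic and persistent is to hash a stable unit ID together with an experiment-specific salt. The sketch below is illustrative only: the function names, the salt format, the choice of SHA-256, and the 1,000-bucket default are assumptions, not something the question prescribes. Treatment takes the lowest-numbered buckets, so a unit exposed at a small ramp stage stays in treatment as the ramp widens.

```python
import hashlib


def assign_bucket(unit_id: str, experiment_salt: str, num_buckets: int = 1000) -> int:
    """Map a unit (e.g. a resolved user ID) to a stable bucket in [0, num_buckets)."""
    # Salting with the experiment name keeps assignments independent across experiments.
    key = f"{experiment_salt}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest[:15], 16) % num_buckets


def assign_arm(unit_id: str, experiment_salt: str, treatment_buckets: int,
               num_buckets: int = 1000) -> str:
    """Treatment gets buckets [0, treatment_buckets); everything else is control.

    Raising treatment_buckets during a ramp only adds units to treatment,
    so nobody flips from treatment back to control mid-experiment.
    """
    bucket = assign_bucket(unit_id, experiment_salt, num_buckets)
    return "treatment" if bucket < treatment_buckets else "control"


# Hypothetical usage: the same user at a 1% stage and a 50% stage of the ramp.
salt = "autocomplete-ml-ranker-v1"  # hypothetical experiment name
arm_at_1pct = assign_arm("user-12345", salt, treatment_buckets=10)
arm_at_50pct = assign_arm("user-12345", salt, treatment_buckets=500)
```

Because the arm depends only on the unit ID and the salt, every device or session that resolves to the same ID sees the same variant, which is the property the cross-device exposure rules rely on.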

Metrics (with exact formulas)

  • Choose one primary metric and at least three guardrail metrics relevant to typeahead and search.
  • Provide exact formulas for:
    • Session-level query success rate (primary)
    • Time to first click (TTFC)
    • p99 typeahead latency
    • Typeahead error rate
    • Any other guardrails you select (e.g., zero-results rate, abandonment, suggestion CTR)
  • Include how you will handle quantiles and clustering.
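As a starting point, the requested formulas could be written as follows. This is one possible formulation, not a reference answer: the symbols (S for the set of eligible sessions, q for a query, L for per-request typeahead latency) and the definition of a "successful" query (for example, one followed by a qualifying click) are assumptions that a candidate would need to state and defend.

```latex
\begin{align*}
\text{QSR} &= \frac{1}{|S|} \sum_{s \in S}
  \mathbb{1}\{\text{session } s \text{ contains at least one successful query}\} \\[4pt]
\text{TTFC}_q &= t^{\text{first click}}_q - t^{\text{submit}}_q
  \quad \text{(summarized by the median over queries with a click)} \\[4pt]
\text{p99 latency} &= \inf\{\, x : \hat{F}_L(x) \ge 0.99 \,\} \\[4pt]
\text{Error rate} &=
  \frac{\#\{\text{typeahead requests ending in an error or timeout}\}}
       {\#\{\text{typeahead requests}\}}
\end{align*}
```

Here \hat{F}_L is the empirical distribution of typeahead latency in an arm. Sessions and requests nest within the randomization unit, so standard errors for the ratio metrics are typically clustered at that unit (for example with the delta method), and confidence intervals for quantiles such as p99 are commonly obtained by bootstrapping over randomization units rather than over individual requests.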

Variance Reduction and Analysis

  • Describe CUPED or other variance-reduction strategies (variables, window, estimator) and how they apply at the chosen analysis unit.
  • Specify the statistical test, standard errors, and clustering.
  • Justify sequential monitoring without inflating Type I error.
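The CUPED adjustment mentioned above can be sketched in a few lines. The example assumes one row per analysis unit, with y as the in-experiment metric and x as the same metric from a pre-experiment window; the function name and the toy values are hypothetical. The coefficient theta is estimated on data pooled across both arms so that the adjustment itself cannot introduce a difference between them.

```python
from statistics import fmean, pvariance


def cuped_adjust(y: list[float], x: list[float]) -> list[float]:
    """Return CUPED-adjusted outcomes: y_i - theta * (x_i - mean(x)).

    theta = cov(x, y) / var(x) minimizes the variance of the adjusted metric.
    Subtracting the centered covariate leaves the overall mean of y unchanged.
    """
    x_bar, y_bar = fmean(x), fmean(y)
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    if var_x == 0:
        return list(y)  # a constant covariate carries no information
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    theta = cov_xy / var_x
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


# Toy data (made up for illustration): per-user success rates.
pre_period = [0.40, 0.80, 0.50, 0.90, 0.30]   # covariate x, before exposure
in_experiment = [0.50, 0.70, 0.60, 0.90, 0.40]  # outcome y, during the test
adjusted = cuped_adjust(in_experiment, pre_period)
variance_ratio = pvariance(adjusted) / pvariance(in_experiment)
```

The variance ratio equals 1 minus the squared correlation between x and y, so the stronger the pre-period metric predicts the in-experiment metric, the larger the reduction. The treatment effect is then estimated as the difference in means of the adjusted values between arms.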

Sample Size

  • Compute the required sample size to detect a 1.0 percentage-point absolute lift from a 60.0% baseline at alpha=0.05 and power=0.80. Show the formula and the numeric result. Discuss any design-effect and CUPED adjustments.
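For the sample-size item, a common starting point is the two-sided, two-proportion z-test formula with unpooled variances: n per arm = (z_{1-alpha/2} + z_{1-beta})^2 * [p1(1-p1) + p2(1-p2)] / delta^2. The sketch below applies it to the stated inputs (p1 = 0.60, delta = 0.01, alpha = 0.05, power = 0.80) and gives roughly 37,500 independent observations per arm. The helper for design-effect and CUPED adjustments is an assumption about how one might scale that baseline; the example values passed to it are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p_base: float, abs_lift: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-arm n for a two-sided two-proportion z-test (unpooled variance)."""
    p_treat = p_base + abs_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 0.80
    variance_sum = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / abs_lift ** 2)


def adjusted_sample_size(n: int, design_effect: float = 1.0,
                         cuped_rho: float = 0.0) -> int:
    """Scale n up for clustering and down for CUPED.

    design_effect > 1 inflates n when observations (sessions) are correlated
    within the randomization unit; CUPED multiplies variance by (1 - rho^2),
    where rho is the correlation between the covariate and the outcome.
    """
    return ceil(n * design_effect * (1 - cuped_rho ** 2))


n = sample_size_per_arm(p_base=0.60, abs_lift=0.01)
# Hypothetical adjustments: design effect of 1.5, covariate correlation of 0.5.
n_adjusted = adjusted_sample_size(n, design_effect=1.5, cuped_rho=0.5)
```

The baseline figure assumes independent observations. If users are randomized but sessions are analyzed, the design effect and the CUPED correlation should both be estimated from historical data rather than assumed.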

Operational Risks and Validity

  • Explain how you will handle novelty and carryover effects, bots and spam traffic, missing logs, seasonality, and heterogeneous treatment effects (locale, device).
  • Propose at least one falsification (placebo/negative control) check and an offline backtest plan.
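One simple falsification check is a placebo comparison: apply the experiment's assignment to a pre-exposure window, where the true effect must be zero, and confirm that the test does not reject. The sketch below uses a pooled two-proportion z-test for that purpose. It assumes one binary outcome per randomization unit (so no clustering correction is needed), and the counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a: int, n_a: int,
                          success_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: the two proportions are equal."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Under H0 both arms share one proportion, so the variance is pooled.
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Placebo check on a pre-experiment window (hypothetical counts):
# units are labeled by their future arm, but nobody has seen the treatment yet.
z_placebo, p_placebo = two_proportion_z_test(
    success_a=6010, n_a=10000,   # future control units
    success_b=6025, n_b=10000,   # future treatment units
)
```

A small p-value on the placebo window would point to a problem with assignment, logging, or identity resolution rather than to a real treatment effect, and would be worth resolving before reading the main results.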
