PracHub

Design a network-aware Wi‑Fi badge experiment

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's proficiency in experimental design, causal inference under interference (including SUTVA risks), metric specification, power and sample‑size computation, analysis planning, and translating treatment effects into financial decision rules.


Company: Airbnb

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: Medium

Interview Round: Technical Screen



Related Interview Questions

  • Design and Analyze Airbnb Locker Experiment - Airbnb (medium)
  • Design an A/B test with causal inference - Airbnb (hard)
  • Design robust primary and guardrail metrics - Airbnb (hard)
  • Analyze A/B test with rigorous diagnostics - Airbnb (hard)
  • Estimate impact of global launch without holdout - Airbnb (hard)
Oct 13, 2025, 9:49 PM

You work on a two‑sided travel search marketplace, and the product team wants to add a “High Wi‑Fi” badge/filter to the search bar to help remote workers. Design an experiment and recommend whether to launch based on its results. Be specific and address the following:

  1. Randomization under interference: choose and justify one design (user‑level A/B, market‑day switchback, or geo/cluster randomization). Explicitly discuss potential SUTVA violations (demand spillovers, supply re‑ranking, word‑of‑mouth) and how your design mitigates them.
  2. Metrics: pre‑specify primary success metrics (e.g., GMV/visitor, bookings/1k searches, conversion%, AOV) and guardrails (bounce%, latency p95, partner cancellations, non‑Wi‑Fi listing CTR). Define exactly how each is computed and at what aggregation level.
  3. Intermediate metrics if topline is flat: badge/filter usage rate, CTR to Wi‑Fi listings, dwell time, queries per session, supply coverage with Wi‑Fi badge, search refinement rate.
  4. Power, sample size, and runtime: compute MDE and duration using these constraints and baselines: daily active searchers=2,000,000; baseline search→booking conversion=3.0%; AOV=$120; variable margin=12%; treatment share=50%; desired power=0.80; alpha=0.05; cluster choice implies ICC=0.02 with average market‑day cluster size=20,000 visitors; stop only at full weeks. Product expects a 0.20–0.40 percentage‑point lift in conversion—show whether this is detectable and how many weeks are needed under your chosen design (include design effect if clustered).
  5. Analysis plan: variance reduction (e.g., CUPED/stratification), outlier handling, heterogeneity (remote‑heavy markets, mobile vs desktop, business vs leisure), pre‑commit stopping rule, and how you will check randomization balance and interference diagnostics.
  6. Decision rule to NPV: translate estimated effects into incremental margin dollars. Assume a one‑time engineering cost of $400k and an ongoing partner churn risk if CTR to non‑Wi‑Fi listings drops >5%. State precise launch/hold/kill thresholds.
  7. If the A/B is impractical (e.g., heavy interference), propose a quasi‑experiment (staggered geo roll‑out with difference‑in‑differences and negative‑control outcomes). Specify the exact calendar, identification assumptions, and robustness checks.

Deliver: a written design, formulas used, and the numeric recommendation to launch or not under plausible effect sizes.
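The power arithmetic in item 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: it uses the standard two‑proportion normal approximation with the baseline variance in both arms, a two‑sided alpha, the usual design effect 1 + (m − 1)·ICC, and it treats each day's 2,000,000 searchers as distinct visitors. The function names are hypothetical.

```python
import math
from statistics import NormalDist

# Baselines and constraints taken from the question.
DAILY_SEARCHERS = 2_000_000
BASELINE_CR = 0.03          # search -> booking conversion
TREATMENT_SHARE = 0.5
ALPHA = 0.05                # assumed two-sided
POWER = 0.80
ICC = 0.02
CLUSTER_SIZE = 20_000       # average visitors per market-day cluster


def z_total(alpha=ALPHA, power=POWER):
    """z_{1-alpha/2} + z_{power} for a two-sided test."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def design_effect(cluster_size=CLUSTER_SIZE, icc=ICC):
    """Variance inflation from cluster randomization: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def weekly_n_per_arm():
    """Visitors per arm per week (assumes distinct visitors each day)."""
    return DAILY_SEARCHERS * TREATMENT_SHARE * 7


def mde_abs(weeks, deff=1.0):
    """Absolute MDE in conversion after `weeks` full weeks."""
    n_eff = weekly_n_per_arm() * weeks / deff
    se = math.sqrt(2 * BASELINE_CR * (1 - BASELINE_CR) / n_eff)
    return z_total() * se


def weeks_needed(lift_abs, deff=1.0):
    """Full weeks required to detect an absolute lift `lift_abs`."""
    n_per_arm = 2 * z_total() ** 2 * BASELINE_CR * (1 - BASELINE_CR) / lift_abs ** 2
    return math.ceil(n_per_arm * deff / weekly_n_per_arm())


if __name__ == "__main__":
    deff = design_effect()
    print(f"design effect: {deff:.2f}")
    print(f"1-week MDE, user-level: {mde_abs(1) * 100:.3f} pp")
    print(f"1-week MDE, clustered:  {mde_abs(1, deff) * 100:.3f} pp")
    for lift in (0.002, 0.004):
        print(f"lift {lift * 100:.2f} pp -> {weeks_needed(lift, deff)} weeks (clustered)")
```

Under these assumptions the design effect is about 401, so a market‑day design would need roughly 7 full weeks to detect a 0.20 pp lift and 2 full weeks for 0.40 pp, whereas a user‑level test (ignoring interference) would detect either within the first week.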
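For item 6, one way to translate a conversion lift into margin dollars is sketched below. It is an illustration only: the 365‑day horizon, the absence of discounting, the assumption that the lift applies to all daily searchers after launch, and the specific launch/hold/kill logic in `decide` are assumptions introduced here, not thresholds given in the question. Only the AOV, margin, engineering cost, and the 5% CTR‑drop limit come from the prompt.

```python
# Inputs taken from the question.
DAILY_SEARCHERS = 2_000_000
AOV = 120.0
VARIABLE_MARGIN = 0.12
ENGINEERING_COST = 400_000.0
CTR_DROP_LIMIT = 0.05       # relative drop in non-Wi-Fi listing CTR

# Assumption for illustration: evaluate over one year, undiscounted.
HORIZON_DAYS = 365


def daily_margin(lift_abs):
    """Incremental margin dollars per day for an absolute conversion lift."""
    return DAILY_SEARCHERS * lift_abs * AOV * VARIABLE_MARGIN


def net_value(lift_abs, horizon_days=HORIZON_DAYS):
    """Undiscounted incremental margin over the horizon, net of the one-time cost."""
    return daily_margin(lift_abs) * horizon_days - ENGINEERING_COST


def decide(ci_low, ci_high, non_wifi_ctr_drop):
    """Hypothetical decision rule based on the lift's confidence interval."""
    if net_value(ci_high) <= 0:
        return "kill"       # even the optimistic bound does not pay back
    if non_wifi_ctr_drop > CTR_DROP_LIMIT:
        return "hold"       # guardrail breached: investigate churn risk first
    if net_value(ci_low) > 0:
        return "launch"     # even the pessimistic bound pays back
    return "hold"           # interval straddles break-even


if __name__ == "__main__":
    for lift in (0.002, 0.004):
        print(f"lift {lift * 100:.2f} pp: ${daily_margin(lift):,.0f}/day, "
              f"net ${net_value(lift):,.0f} over {HORIZON_DAYS} days")
    print(decide(0.001, 0.003, non_wifi_ctr_drop=0.02))
```

With these inputs each incremental booking carries $14.40 of margin, so a 0.20 pp lift is worth about $57,600 per day and would recover the $400k cost in roughly a week; the guardrail on non‑Wi‑Fi CTR is therefore likely to be the binding constraint in the decision.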

