This question evaluates a data scientist's proficiency in experimental design, causal inference under interference (including SUTVA risks), metric specification, power and sample‑size computation, analysis planning, and translating treatment effects into financial decision rules.

What difficulty level is this interview question?

This is a Medium difficulty Analytics & Experimentation question, commonly asked during Technical Screen rounds at Airbnb.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Airbnb during technical interviews.

Design a network-aware Wi‑Fi badge experiment

You work on a two‑sided travel search marketplace and product wants to add a “High Wi‑Fi” badge/filter in the search bar to help remote workers. Recommend whether to launch based on an experiment you design. Be specific and address the following:

1) Randomization under interference: choose and justify one design (user‑level A/B, market‑day switchback, or geo/cluster randomization). Explicitly discuss potential SUTVA violations (demand spillovers, supply re‑ranking, word‑of‑mouth) and how your design mitigates them.
2) Metrics: pre‑specify primary success metrics (e.g., GMV/visitor, bookings/1k searches, conversion%, AOV) and guardrails (bounce%, latency p95, partner cancellations, non‑Wi‑Fi listing CTR). Define exactly how each is computed and at what aggregation level.
3) Intermediate metrics if topline is flat: badge/filter usage rate, CTR to Wi‑Fi listings, dwell time, queries per session, supply coverage with Wi‑Fi badge, search refinement rate.
4) Power, sample size, and runtime: compute MDE and duration using these constraints and baselines: daily active searchers=2,000,000; baseline search→booking conversion=3.0%; AOV=$120; variable margin=12%; treatment share=50%; desired power=0.80; alpha=0.05; cluster choice implies ICC=0.02 with average market‑day cluster size=20,000 visitors; stop only at full weeks. Product expects a 0.20–0.40 percentage‑point lift in conversion; show whether this is detectable and how many weeks are needed under your chosen design (include design effect if clustered).
5) Analysis plan: variance reduction (e.g., CUPED/stratification), outlier handling, heterogeneity (remote‑heavy markets, mobile vs desktop, business vs leisure), pre‑commit stopping rule, and how you will check randomization balance and interference diagnostics.
6) Decision rule to NPV: translate estimated effects into incremental margin dollars. Assume a one‑time engineering cost of $400k and an ongoing partner churn risk if CTR to non‑Wi‑Fi listings drops >5%. State precise launch/hold/kill thresholds.
7) If the A/B is impractical (e.g., heavy interference), propose a quasi‑experiment (staggered geo roll‑out with difference‑in‑differences and negative‑control outcomes). Specify the exact calendar, identification assumptions, and robustness checks.

Deliver: a written design, formulas used, and the numeric recommendation to launch or not under plausible effect sizes.
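
The power, sample‑size, and runtime arithmetic, and the translation of a conversion lift into margin dollars, can be sketched with the baselines given in the question. This is a minimal sketch and not a model answer. It assumes a two‑sided two‑proportion z‑test, the Kish design effect 1 + (m − 1) × ICC for the clustered design, that each day's 2,000,000 searchers are distinct and independent visitors, and a 365‑day horizon for the margin figure. None of these assumptions is stated in the question, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

# Inputs taken from the question text.
DAILY_SEARCHERS = 2_000_000
TREATMENT_SHARE = 0.50
BASELINE_CR = 0.03            # search -> booking conversion
AOV = 120.0                   # average order value, dollars
VARIABLE_MARGIN = 0.12
ALPHA, POWER = 0.05, 0.80
ICC, CLUSTER_SIZE = 0.02, 20_000
ENGINEERING_COST = 400_000.0


def design_effect(cluster_size=CLUSTER_SIZE, icc=ICC):
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def n_per_arm(lift_abs, p0=BASELINE_CR, alpha=ALPHA, power=POWER):
    """Visitors per arm for a two-sided two-proportion z-test (no clustering)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    p1 = p0 + lift_abs
    return z ** 2 * (p0 * (1 - p0) + p1 * (1 - p1)) / lift_abs ** 2


def weeks_needed(lift_abs, clustered=True):
    """Full weeks required to detect an absolute lift in conversion."""
    n = n_per_arm(lift_abs) * (design_effect() if clustered else 1.0)
    # ASSUMPTION: every day brings new, independent visitors. With the
    # 50/50 split in the question both arms receive this daily traffic.
    days = n / (DAILY_SEARCHERS * TREATMENT_SHARE)
    return max(1, math.ceil(days / 7))  # stop only at full weeks


def annual_incremental_margin(lift_abs, days=365):
    """Incremental margin dollars if the lift holds at full launch.

    ASSUMPTION: the lift applies to all daily searchers and persists
    for `days`; the horizon is illustrative, not given in the question.
    """
    return DAILY_SEARCHERS * lift_abs * AOV * VARIABLE_MARGIN * days


if __name__ == "__main__":
    print(f"design effect: {design_effect():.2f}")
    for lift in (0.002, 0.004):  # 0.20 and 0.40 percentage points
        print(
            f"lift={lift:.3f}: user-level weeks={weeks_needed(lift, False)}, "
            f"clustered weeks={weeks_needed(lift)}, "
            f"annual margin=${annual_incremental_margin(lift):,.0f} "
            f"vs cost ${ENGINEERING_COST:,.0f}"
        )
```

Under these assumptions the design effect is about 401, so a market‑day clustered design needs roughly 7 full weeks to detect a 0.20 percentage‑point lift and roughly 2 weeks for 0.40, while a user‑level split would hit the one‑week floor. The margin function only covers the benefit side; the partner churn risk tied to the non‑Wi‑Fi CTR guardrail would still have to be priced separately.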
