Scenario
You work on a search product and have built a new search ranking/retrieval algorithm (Variant B). The current algorithm is Variant A. You need to design an online experiment to decide whether to launch B.
Task
Design an A/B test plan that covers:
- **Goal & hypotheses**
  - What is the primary product goal (e.g., improved relevance, engagement, or long-term retention)?
  - State clear hypotheses (e.g., "B improves relevance without harming latency").
- **Experiment design**
  - Choose the **experimental unit** (user, device, session, query) and justify it.
  - Randomization approach (simple vs. stratified), and key stratification variables (e.g., locale, platform, query category).
  - Handling **interference/contamination** (e.g., cross-device users, cached results, shared accounts).
  - Duration and ramp plan (e.g., 1% → 10% → 50%), plus stopping rules.
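If the experimental unit is the user, assignment is commonly done by hashing a stable user ID with an experiment-specific salt, which keeps a user's variant consistent across sessions and devices and makes ramping straightforward. A minimal sketch (function and parameter names are illustrative, not from the source):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, treatment_pct: float = 0.5) -> str:
    """Deterministically bucket a user into control (A) or treatment (B).

    Hashing (experiment, user_id) keeps assignment stable across sessions
    and devices tied to the same account, and salting by experiment name
    keeps bucketing independent across concurrent experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 16**8  # uniform in [0, 1)
    return "B" if bucket < treatment_pct else "A"

# Ramping treatment_pct up (0.01 -> 0.10 -> 0.50) keeps earlier treatment
# users in treatment, because a given user's hash value never changes.
```

One design consequence worth noting in the plan: because the hash is deterministic, a user exposed to B at 1% stays in B at 10% and 50%, which avoids re-randomizing users mid-experiment during the ramp.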
- **Metrics**
  - Propose a **primary metric** (one) and justify it.
  - Propose **diagnostic metrics** to understand *why* results change.
  - Propose **guardrail metrics** to prevent regressions.
  - Consider tradeoffs such as:
    - Short-term engagement vs. long-term user value
    - Relevance improvements vs. latency / cost
    - Click metrics vs. **good clicks** (dwell time, reformulation)
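One way to operationalize the "good clicks" idea is a session-level rate that counts only clicks with sufficient dwell time and no immediate query reformulation. A sketch with hypothetical data classes; the 30-second dwell threshold is an assumed heuristic, not a standard, and should be tuned to the product:

```python
from dataclasses import dataclass, field

@dataclass
class Click:
    dwell_seconds: float
    followed_by_reformulation: bool

@dataclass
class Session:
    clicks: list = field(default_factory=list)

def good_click_rate(sessions: list) -> float:
    """Share of sessions with at least one 'good' click: dwell >= 30s
    and not immediately followed by a query reformulation.
    """
    def is_good(c: Click) -> bool:
        return c.dwell_seconds >= 30 and not c.followed_by_reformulation

    good = sum(1 for s in sessions if any(is_good(c) for c in s.clicks))
    return good / len(sessions) if sessions else 0.0
```

A metric like this trades raw click volume for a signal closer to satisfied intent, which is exactly the click-metrics-vs.-good-clicks tension above.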
- **Power / sample size**
  - What inputs do you need to compute sample size (baseline rate, variance, MDE, alpha, power)?
  - How would you handle multiple comparisons if testing many metrics or segments?
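Given those inputs, a two-sided sample-size calculation for a binary metric (e.g., session success rate) needs only the standard library; the Bonferroni adjustment shown is one simple, conservative answer to the multiple-comparisons question. A sketch under those assumptions:

```python
from statistics import NormalDist

def sample_size_per_arm(p_baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.80,
                        n_comparisons: int = 1) -> int:
    """Users per arm to detect an absolute lift of `mde_abs` on a
    binary metric with a two-sided test.

    Bonferroni correction: divide alpha across the number of
    primary comparisons (metrics or segments) being tested.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - (alpha / n_comparisons) / 2)
    z_beta = z.inv_cdf(power)
    p_treat = p_baseline + mde_abs
    # Sum of Bernoulli variances under baseline and treated rates
    var = p_baseline * (1 - p_baseline) + p_treat * (1 - p_treat)
    n = var * (z_alpha + z_beta) ** 2 / mde_abs ** 2
    return int(n) + 1
```

Plugging in a 30% baseline and a +1pp absolute MDE gives a per-arm requirement in the low tens of thousands of users; halving the MDE roughly quadruples the requirement, which is why the MDE choice dominates the duration plan.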
- **Analysis plan**
  - How will you compute treatment effects (difference in means/proportions; user-level aggregation)?
  - How will you check for **sample ratio mismatch (SRM)** and data quality issues?
  - What key segments would you examine (new vs. returning, head vs. tail queries), and how do you avoid p-hacking?
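An SRM check is typically a chi-square goodness-of-fit test on the observed assignment counts against the configured split. A minimal sketch using only the standard library (the p < 0.001 alert threshold is a common convention, not a rule):

```python
from math import erfc, sqrt

def srm_check(n_control: int, n_treatment: int,
              expected_treatment_ratio: float = 0.5) -> float:
    """Chi-square test (1 degree of freedom) for sample ratio mismatch.

    Returns the p-value of the observed split against the configured
    split. A very small p-value (e.g., p < 0.001) suggests an
    assignment or logging bug; the experiment's results should not
    be trusted until the mismatch is explained.
    """
    total = n_control + n_treatment
    exp_t = total * expected_treatment_ratio
    exp_c = total - exp_t
    chi2 = ((n_treatment - exp_t) ** 2 / exp_t
            + (n_control - exp_c) ** 2 / exp_c)
    # Survival function of the chi-square distribution with 1 df
    return erfc(sqrt(chi2 / 2))
```

For example, 50,000 control vs. 50,500 treatment users under a 50/50 split yields a p-value well above 0.001 (no alarm), while 50,000 vs. 52,000 triggers the alert and should block any readout of treatment effects.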
- **Risks & pitfalls**
  - How do you address novelty effects, learning-to-rank feedback loops, or delayed outcomes?
  - What would make you decide *not* to trust the experiment result?
Output
Provide a structured experiment proposal (bulleted plan) including the final metric set and launch decision criteria.