What difficulty level is this interview question?

This is a medium-difficulty Analytics & Experimentation question, commonly asked during Technical Screen rounds at Lyft.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Lyft during technical interviews.

Investigate a 7% Monthly Active Riders Drop and a 20% Wait-Time Increase

This question evaluates a data scientist's ability to diagnose simultaneous metric regressions in a marketplace product using structured root-cause analysis. It tests decomposition of engagement and reliability metrics, supply-demand reasoning, and the ability to separate correlation from causation — skills central to analytics and experimentation roles.

You are a data scientist on the rider growth team at a ride-hailing company. During a routine business review, two separate metric movements are flagged for the most recent month. You are asked to walk the interviewer through how you would investigate each one and arrive at a root cause.

Movement A — Engagement drop. Monthly Active Riders (MAR), defined as the count of distinct riders who completed at least one paid ride in a calendar month, fell 7% month over month.

Movement B — Reliability regression. Average ride wait time (the elapsed time from a rider requesting a ride to driver pickup) increased 20% month over month across the marketplace.

For each movement, describe your end-to-end investigation: how you would confirm the metric is real, decompose it, form and test hypotheses, and decide on a conclusion and next steps.

Constraints & Assumptions

The company operates in many metros; metrics are available daily and can be sliced by metro, platform (iOS/Android), rider tenure cohort, acquisition channel, and product (standard vs shared vs premium).
"Month over month" compares the most recent full calendar month to the one before it.
Assume both metrics are computed from the same event pipeline; you can query raw event logs.
A 7% MAR drop and a 20% wait-time increase are both large enough that they are very unlikely to be pure noise at company scale, but you should still quantify expected month-to-month variation.
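
One way to quantify expected month-to-month variation is to compare the latest month-over-month change against the spread of earlier changes. The sketch below is illustrative only: the MAR history is made up, the helper names are hypothetical, and a plain z-score ignores seasonality, so a same-month year-over-year comparison would be a sensible companion check.

```python
from statistics import mean, stdev

def mom_changes(monthly_values):
    """Return fractional month-over-month changes for a series of monthly values."""
    return [(curr - prev) / prev
            for prev, curr in zip(monthly_values, monthly_values[1:])]

def z_score_of_latest(monthly_values):
    """Score the latest MoM change against the history of earlier MoM changes."""
    changes = mom_changes(monthly_values)
    history, latest = changes[:-1], changes[-1]
    return (latest - mean(history)) / stdev(history)

# Hypothetical MAR history in millions; the final month is a 7% drop.
mar = [10.0, 10.2, 10.1, 10.3, 10.4, 10.3, 10.5, 10.6, 10.5, 10.7, 10.8, 10.044]

latest_change = mom_changes(mar)[-1]
z = z_score_of_latest(mar)
print(f"latest MoM change: {latest_change:.1%}, z-score: {z:.1f}")
```

A z-score far outside roughly plus or minus 2 to 3 suggests the move is not ordinary fluctuation, though it says nothing yet about whether the cause is real or a data artifact.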

Clarifying Questions to Ask

What is the normal month-over-month variance of these metrics, and are these moves outside the historical control band?
Is the drop concentrated in specific metros, platforms, products, or rider tenure cohorts, or is it uniform across the board?
Did any product release, pricing change, ETA/matching-algorithm change, or marketing budget shift ship near the start of the affected month?
Were there logging, app-version, or metric-definition changes in the window?
Are there known external events (weather, holidays, a competitor promotion, a city regulation) overlapping the period?
For wait time specifically: did total ride demand, driver supply (hours online), or the matching radius/dispatch logic change?
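
For an averaged metric such as wait time, the concentration question has two parts: did individual metros get slower (a rate effect), or did rides shift toward metros that were already slow (a mix effect)? The sketch below splits the change in the ride-weighted average into those two pieces. The metro names and numbers are invented, the function name is hypothetical, and it assumes the same set of metros appears in both months.

```python
def mix_rate_decomposition(before, after):
    """Split the change in ride-weighted average wait into mix and rate effects.

    before/after: dict of metro -> (rides, avg_wait_minutes).
    Returns (total_change, mix_effect, rate_effect) in minutes.
    """
    rides0 = sum(r for r, _ in before.values())
    rides1 = sum(r for r, _ in after.values())
    mix = rate = 0.0
    for metro in before:
        r0, w0 = before[metro]
        r1, w1 = after[metro]
        s0, s1 = r0 / rides0, r1 / rides1  # share of rides in each month
        mix += (s1 - s0) * w0   # share moved, wait held at the old level
        rate += s1 * (w1 - w0)  # wait moved, share held at the new level
    avg0 = sum(r * w for r, w in before.values()) / rides0
    avg1 = sum(r * w for r, w in after.values()) / rides1
    return avg1 - avg0, mix, rate

# Hypothetical data: no metro gets slower, but rides shift to the slower metro.
before = {"dense_metro": (800, 4.0), "suburban_metro": (200, 9.0)}
after = {"dense_metro": (600, 4.0), "suburban_metro": (400, 9.0)}

total, mix, rate = mix_rate_decomposition(before, after)
print(f"total: {total:+.2f} min, mix: {mix:+.2f} min, rate: {rate:+.2f} min")
```

In this made-up example the average rises from 5.0 to 6.0 minutes, a 20% increase, even though wait time is unchanged within each metro. The mix and rate terms sum exactly to the total change, which makes it easy to check that nothing is left unexplained at this level.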

What a Strong Answer Covers

Metric validation first: explicitly rules out data-quality and definition artifacts before chasing real causes.
Structured decomposition: breaks MAR into new/retained/resurrected/churned and into segments; breaks wait time into supply, demand, and matching-efficiency components rather than treating it as a monolith.
Hypothesis prioritization: generates a ranked list of candidate causes and uses the shape of the movement (which segments, which days, abrupt vs gradual) to prioritize, instead of testing hypotheses at random.
Quantified attribution: estimates how much of the 7% / 20% each candidate explains, and checks whether the pieces sum to the whole.
Confounding awareness: separates correlation from cause (e.g., wait time rising because demand surged vs because supply fell), and watches for Simpson's paradox when aggregating across metros.
Actionable conclusion: ends with a most-likely root cause, the residual unexplained portion, a confidence level, and a concrete recommendation or follow-up experiment.
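
The new/retained/resurrected/churned breakdown of MAR can be sketched with set operations on rider IDs. This is a minimal illustration with invented IDs and a hypothetical function name; in practice the sets would come from the ride event logs. The identity it relies on is that the change in MAR equals new plus resurrected minus churned.

```python
def decompose_mar(prev_active, curr_active, active_before_prev):
    """Classify riders by comparing two consecutive months.

    prev_active: rider IDs active in the previous month.
    curr_active: rider IDs active in the current month.
    active_before_prev: rider IDs active in any month before the previous one.
    """
    retained = curr_active & prev_active
    churned = prev_active - curr_active
    not_retained = curr_active - prev_active
    resurrected = not_retained & active_before_prev  # lapsed riders who returned
    new = not_retained - active_before_prev          # first-ever active month
    return {
        "retained": len(retained),
        "churned": len(churned),
        "resurrected": len(resurrected),
        "new": len(new),
    }

# Hypothetical rider IDs.
active_before_prev = {"x", "y", "a"}
prev_active = {"a", "b", "c", "d", "e"}
curr_active = {"a", "b", "x", "n"}

parts = decompose_mar(prev_active, curr_active, active_before_prev)
mar_change = len(curr_active) - len(prev_active)
print(parts, "MAR change:", mar_change)
```

Comparing each component against its own recent history shows whether the drop comes from weaker acquisition (fewer new riders), weaker reactivation, or higher churn, which point to quite different root causes.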

Follow-up Questions

Suppose the 7% MAR drop is entirely explained by a single large metro. How would that change your conclusion and your recommendation versus a uniform drop?
The two metrics may be linked: could a 20% wait-time increase cause part of the MAR decline? How would you test for and quantify that causal path?
If a pricing change shipped mid-month in only half the metros, how would you use that natural rollout to estimate its causal effect on both metrics?
After you ship a fix, how would you confirm the metric recovers and that the recovery is attributable to your fix rather than reversion to the mean?
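
For the partial pricing rollout, one common approach is a difference-in-differences estimate: compare the before/after change in metros that received the change with the before/after change in metros that did not. The sketch below uses invented per-metro average wait times and a hypothetical function name. It assumes the two groups would have followed parallel trends without the change, which should be checked against pre-period data because the rollout was probably not randomized.

```python
from statistics import mean

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences on group means.

    Each argument is a list of per-metro metric values (for example, average
    wait in minutes) for the period before or after the change shipped.
    """
    treated_delta = mean(treated_post) - mean(treated_pre)
    control_delta = mean(control_post) - mean(control_pre)
    return treated_delta - control_delta

# Hypothetical per-metro average wait times in minutes.
treated_pre = [5.0, 6.0, 4.0]
treated_post = [6.5, 7.5, 5.5]
control_pre = [5.0, 5.5, 4.5]
control_post = [5.5, 6.0, 5.0]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"estimated effect of the pricing change on wait time: {effect:+.2f} min")
```

Here the treated metros rose by 1.5 minutes and the control metros by 0.5 minutes, so the estimate attributes 1.0 minute to the pricing change and the remaining 0.5 minutes to whatever affected all metros. The same calculation can be repeated with per-metro active riders to estimate the effect on MAR.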
