Diagnose a metric drop in search time
Company: Google
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: Medium
Interview Round: Onsite
Over the last three calendar months, the metric 'search time per user per session' dropped by 35%. A teammate proposes modeling two distributions: T1 = time to the first successful result and T2 = time until the user gives up. Critique this proposal and design a robust analysis to find the root cause.
- Precisely define the metric and population; handle multi‑tab sessions, background inactivity, and timeouts. Specify inclusion/exclusion rules.
- Explain biases from splitting into T1 and T2: selection bias (excluding no‑success sessions), right‑censoring, competing risks (success vs abandonment), and left‑truncation. How would you detect and correct them?
- Choose methods (e.g., survival/hazard models with censoring; mixture models) and show how to estimate and compare hazards across cohorts (device, locale, query type).
- Rule out non‑product causes: instrumentation changes, seasonality, traffic mix shifts, bot filtering, release flags. List concrete checks and guardrail metrics.
- Build a time‑series decomposition and change‑point analysis; specify covariates and counterfactual baselines.
- Propose a minimal experiment or holdout (e.g., rollback of ranking feature) with success criteria and expected directional outcomes.
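The censoring and selection-bias points above can be made concrete with a minimal Kaplan–Meier estimator. This is a plain-Python sketch (function and variable names are illustrative, not from the question): sessions where the user never reached a successful result are kept as right-censored observations instead of being dropped, which is exactly the correction the T1/T2 split fails to make. In practice one would reach for a library such as lifelines, but the mechanics are simple enough to show directly.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve with right-censoring.

    durations: per-session time (e.g., seconds) until the event or
               until censoring.
    observed:  1 if the event (first successful result) occurred,
               0 if the session was right-censored (abandonment,
               timeout, end of logging window).
    Returns a list of (time, survival_probability) steps.
    """
    pairs = sorted(zip(durations, observed))
    n_at_risk = len(pairs)
    surv = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        d = c = 0  # events and censorings at time t
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                d += 1
            else:
                c += 1
            i += 1
        if d:
            # Only events change the estimate; censorings only
            # shrink the risk set for later times.
            surv *= 1.0 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= d + c
    return curve
```

Comparing this curve across cohorts (device, locale, query type) and across the before/after windows shows whether the drop comes from faster success (T1 shifting left) or from earlier abandonment, two causes a naive success-only mean cannot distinguish.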
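For the change-point bullet, a single-split detector (one step of binary segmentation, minimizing within-segment squared error) is enough to localize when the daily metric shifted; this is a self-contained sketch with hypothetical inputs, whereas a real analysis would use a multi-change-point method such as PELT (implemented in the ruptures library) with covariates for traffic mix.

```python
def _sse(xs):
    """Sum of squared deviations from the segment mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def best_changepoint(series):
    """Return the index k that best splits the series into two
    constant-mean segments (minimum total within-segment SSE)."""
    best_k, best_cost = None, float("inf")
    for k in range(1, len(series)):
        cost = _sse(series[:k]) + _sse(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k
```

Running this on the daily 'search time per user per session' series and cross-referencing the detected date against release flags, logging-schema changes, and bot-filter updates is the fastest way to separate instrumentation causes from product causes.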
Quick Answer: This question evaluates a data scientist's competency in five areas: precise metric and population definition; time-to-event modeling with censoring (survival/hazard analysis); time-series decomposition and change-point detection; instrumentation and traffic-mix forensics; and experimental design for root-cause isolation.