Measure outage impact; choose fix vs build
Company: Google
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
A major enterprise customer reports frequent Google Meet call drops. Build an end-to-end analysis plan. First, precisely define a "drop" (e.g., an unexpected disconnect requiring a rejoin within 60s), list the instrumentation you need (session start/stop, reconnect attempts, ICE state changes, RTT/jitter/packet loss, device/OS/app version, network type/ASN, region), and deduplicate correlated failures. Second, quantify business impact in three layers: meeting minutes lost, user productivity loss, and account-level retention risk. Estimate the marginal impact per 1pp increase in drop rate using matched controls or hierarchical models. Third, determine whether to prioritize fixing the bug or building a new feature by framing expected value = impact × likelihood × duration ÷ effort, including opportunity cost and guardrail metrics (crash rate, CPU, startup latency). Fourth, check whether this is a regression by analyzing by-version/by-region canaries or holdbacks. Finally, produce a decision memo with assumptions, sensitivity analysis, and the exact observations that would invalidate your recommendation.
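One way to make the drop definition and the deduplication step concrete is a small sketch over session events. Everything here is an assumption for illustration: the `Event` schema, the `kind` values ("join", "leave", "disconnect"), and the 10-second `INCIDENT_WINDOW_S` used to group drops in the same meeting are hypothetical and do not reflect actual Google Meet telemetry. Only the 60-second rejoin window comes from the example definition in the question.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

REJOIN_WINDOW_S = 60    # from the question's example definition of a drop
INCIDENT_WINDOW_S = 10  # assumption: drops this close together in one meeting share a cause


@dataclass(frozen=True)
class Event:
    """Hypothetical session event; field names are illustrative only."""
    user_id: str
    meeting_id: str
    ts: float   # seconds since an arbitrary epoch
    kind: str   # "join", "leave" (user-initiated), or "disconnect" (unexpected)


def find_drops(events: List[Event]) -> List[Event]:
    """Return disconnects followed by the same user's rejoin to the same
    meeting within REJOIN_WINDOW_S seconds."""
    by_session: Dict[Tuple[str, str], List[Event]] = {}
    for e in sorted(events, key=lambda e: e.ts):
        by_session.setdefault((e.user_id, e.meeting_id), []).append(e)

    drops = []
    for seq in by_session.values():
        # Compare each event with the next event in the same user/meeting stream.
        for prev, nxt in zip(seq, seq[1:]):
            if (prev.kind == "disconnect" and nxt.kind == "join"
                    and nxt.ts - prev.ts <= REJOIN_WINDOW_S):
                drops.append(prev)
    return sorted(drops, key=lambda e: e.ts)


def count_incidents(drops: List[Event]) -> int:
    """Deduplicate correlated failures: a drop within INCIDENT_WINDOW_S of the
    previous drop in the same meeting extends that incident instead of
    starting a new one."""
    incidents = 0
    last_ts: Dict[str, float] = {}
    for d in sorted(drops, key=lambda e: e.ts):
        prev = last_ts.get(d.meeting_id)
        if prev is None or d.ts - prev > INCIDENT_WINDOW_S:
            incidents += 1
        last_ts[d.meeting_id] = d.ts
    return incidents
```

Reporting both the per-user drop count and the deduplicated incident count is a deliberate choice: the first tracks user-felt impact, while the second avoids overcounting a single server-side or network failure that disconnects many participants at once.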
Quick Answer: This question evaluates instrumentation design, telemetry analysis, causal impact estimation, failure deduplication, and decision-making under uncertainty for a Data Scientist, testing competencies in analytics, experimentation, and product-metric prioritization.
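The expected-value framing in the question (impact × likelihood × duration ÷ effort) and the request for observations that would invalidate the recommendation can be sketched as a small scoring helper. This is a minimal sketch under stated assumptions: the `Option` fields, their units, and the `breakeven_likelihood` helper are hypothetical names introduced here, and all input values in the usage note are placeholders, not estimates from real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """Hypothetical candidate project; units are illustrative."""
    name: str
    impact: float      # value delivered per week while the effect lasts
    likelihood: float  # probability the work delivers that impact, in [0, 1]
    duration: float    # weeks the impact persists
    effort: float      # engineer-weeks required


def score(option: Option) -> float:
    """Expected value per unit effort: impact * likelihood * duration / effort."""
    if option.effort <= 0:
        raise ValueError("effort must be positive")
    return option.impact * option.likelihood * option.duration / option.effort


def breakeven_likelihood(a: Option, b: Option) -> float:
    """Likelihood for option `a` at which score(a) equals score(b),
    holding every other input fixed. If the believed likelihood for `a`
    falls below this value, the ranking flips in favor of `b`."""
    return score(b) * a.effort / (a.impact * a.duration)
```

As a usage illustration with placeholder inputs, comparing `Option("fix drop bug", impact=100, likelihood=0.8, duration=26, effort=4)` against `Option("new feature", impact=60, likelihood=0.5, duration=52, effort=12)` and then calling `breakeven_likelihood` on the pair gives the threshold at which the recommendation would reverse, which is one concrete form the memo's invalidating observations could take. Opportunity cost and guardrail metrics are not modeled here and would need to be handled separately.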