Evaluate a New Preloading Strategy for a Short‑Video App (New Users)
Context
On 2025‑08‑20, a new preloading strategy was rolled out to 30% of traffic. Among new users (accounts created within the last 7 days), product analytics observed:
- −6% change in average daily watch time
- Crash rate decreased by 0.2 percentage points
- Average initial video start latency improved by 80 ms
Design an end‑to‑end approach to evaluate the change rigorously and decide whether to ship, iterate, or roll back.
Tasks
(a) Define primary, secondary, and guardrail metrics. Justify each, propose useful segmentations (e.g., device, network, country, cohort age, entry surface), and specify exact formulas and units.
(b) Outline an A/B test plan: unit of randomization, bucketing, exposure rules, test length, and how to handle heavy‑tailed watch time (e.g., winsorization, log‑transform, robust estimators).
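A minimal sketch of the heavy‑tail handling mentioned in (b), using a synthetic lognormal sample as a stand‑in for real watch‑time data (all numbers here are illustrative assumptions, not observed values):

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical heavy-tailed daily watch time in minutes (lognormal stand-in)
watch = rng.lognormal(mean=2.0, sigma=1.0, size=10_000)

# One-sided winsorization: cap extreme sessions at the 99th percentile
cap = np.quantile(watch, 0.99)
winsorized = np.minimum(watch, cap)

# Log-transform alternative; log1p is safe for zero watch time
logged = np.log1p(watch)

print(f"raw mean={watch.mean():.2f}, winsorized mean={winsorized.mean():.2f}, cap={cap:.1f}")
```

Winsorization keeps the metric in its original units (minutes), which stakeholders can interpret directly; the log‑transform changes the estimand to something closer to a geometric mean, so the choice should be pre‑registered.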
(c) Estimate the per‑variant sample size to detect a +3% lift in mean daily watch time with α = 0.05 (two‑sided) and 80% power, assuming baseline mean = 14 min, SD = 18 min, independent users, and equal allocation. Show the formula and any additional assumptions if using nonparametric tests.
(d) Specify guardrails (e.g., crash rate, time‑to‑first‑frame, data usage) and stopping rules. Describe how to handle novelty effects, weekday/seasonality, and experiment mis‑randomization checks (e.g., A/A, covariate balance tests).
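One concrete mis‑randomization check from (d) is a sample‑ratio‑mismatch (SRM) test on bucket counts. A sketch using a two‑sided z‑test against the intended 50/50 split (the counts below are hypothetical):

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical bucket counts under an intended 50/50 allocation
n_control, n_treatment = 100_800, 99_200
n = n_control + n_treatment
p_hat = n_treatment / n

# Two-sided z-test of the observed share against the expected 0.5
z = (p_hat - 0.5) / sqrt(0.25 / n)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z={z:.2f}, p={p_value:.5f}")  # a very small p flags SRM
```

In practice a strict threshold (e.g. p < 0.001) is used so that SRM alerts indicate genuine bucketing or logging bugs rather than noise; an SRM failure invalidates the experiment regardless of the metric readout.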
(e) If the feature was partially rolled out by region before the test, propose a difference‑in‑differences or CUPED/regression‑adjusted analysis. State key identifying assumptions and how you would validate them.
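The CUPED adjustment named in (e) can be sketched on simulated data: subtract θ·(X − mean(X)) from the outcome, where X is a pre‑experiment covariate (e.g. pre‑period watch time) and θ = cov(Y, X)/var(X). All distributions and effect sizes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
# Hypothetical pre-experiment watch time, correlated with the outcome
pre = rng.gamma(shape=2.0, scale=7.0, size=n)
treat = rng.integers(0, 2, size=n)
# Simulated post-period outcome with a +0.42 min true treatment effect
post = 0.8 * pre + 0.42 * treat + rng.normal(0.0, 6.0, size=n)

# CUPED: theta = cov(Y, X) / var(X); subtract the centered covariate term
theta = np.cov(post, pre)[0, 1] / pre.var(ddof=1)
adjusted = post - theta * (pre - pre.mean())

naive = post[treat == 1].mean() - post[treat == 0].mean()
cuped = adjusted[treat == 1].mean() - adjusted[treat == 0].mean()
var_reduction = 1 - adjusted.var(ddof=1) / post.var(ddof=1)
print(f"naive diff={naive:.3f}, CUPED diff={cuped:.3f}, "
      f"variance reduced by {var_reduction:.0%}")
```

Because the covariate is measured before exposure, the adjustment cannot introduce bias under proper randomization; it only removes pre‑existing variance, tightening confidence intervals for the same sample size.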