
Diagnose a watch-time drop and design experiments

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in experimental design, causal inference, metric selection and definition, segmentation, and statistical power and sample-size calculation, applied to product analytics for user engagement.

  • hard
  • TikTok
  • Analytics & Experimentation
  • Data Scientist


Company: TikTok

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: Technical Screen




Evaluate a New Preloading Strategy for a Short‑Video App (New Users)

Context

On 2025‑08‑20, a new preloading strategy was rolled out to 30% of traffic. Among new users (accounts created within the last 7 days), product analytics observed:

  • −6% change in average daily watch time
  • Crash rate decreased by 0.2 percentage points
  • Average initial video start latency improved by 80 ms

Design an end‑to‑end approach to properly evaluate and decide whether to ship, iterate, or roll back.

Tasks

(a) Define primary, secondary, and guardrail metrics. Justify each, propose useful segmentations (e.g., device, network, country, cohort age, entry surface), and specify exact formulas and units.
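As a sketch of how the primary metric could be pinned down with an exact formula and units (the event schema here is hypothetical, assuming watch events keyed by user and calendar day):

```python
def avg_daily_watch_time(sessions):
    """Average daily watch time, in minutes per user-day:
    mean over users of (total watch seconds / distinct active days) / 60.
    `sessions` maps user_id -> list of (date, watch_seconds) events."""
    per_user = []
    for rows in sessions.values():
        active_days = {day for day, _ in rows}
        total_seconds = sum(sec for _, sec in rows)
        per_user.append(total_seconds / len(active_days) / 60)
    return sum(per_user) / len(per_user)

events = {
    "u1": [("2025-08-21", 600), ("2025-08-21", 300), ("2025-08-22", 900)],
    "u2": [("2025-08-21", 1200)],
}
print(avg_daily_watch_time(events))  # (15 + 20) / 2 = 17.5 min/user-day
```

Averaging per user first (rather than pooling user-days) keeps the metric aligned with a user-level randomization unit.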

(b) Outline an A/B test plan: unit of randomization, bucketing, exposure rules, test length, and how to handle heavy‑tailed watch time (e.g., winsorization, log‑transform, robust estimators).
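One common way to tame the heavy right tail before a mean-based test is winsorization at a high percentile. A minimal sketch (the 99th-percentile cap and the lognormal simulation are illustrative choices, not values from the question):

```python
import random

def winsorize(values, upper_pct=0.99):
    """Cap each observation at the chosen upper percentile to limit
    the influence of extreme binge-watching outliers on the mean."""
    ordered = sorted(values)
    cap = ordered[min(int(upper_pct * len(ordered)), len(ordered) - 1)]
    return [min(v, cap) for v in values]

random.seed(0)
# Simulated heavy-tailed daily watch times in minutes (lognormal draws).
watch_time = [random.lognormvariate(2.3, 1.0) for _ in range(10_000)]
capped = winsorize(watch_time, upper_pct=0.99)
print(max(watch_time), max(capped))  # the cap sharply reduces the maximum
```

The winsorization percentile should be fixed before the experiment starts, since tuning it after seeing results invites bias.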

(c) Estimate the per‑variant sample size to detect a +3% lift in mean daily watch time with α = 0.05 (two‑sided) and 80% power, assuming baseline mean = 14 min, SD = 18 min, independent users, and equal allocation. Show the formula and any additional assumptions if using nonparametric tests.
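Under the stated assumptions, the usual two-sample z-approximation n = 2(z₁₋α/₂ + z₁₋β)² σ² / δ² with δ = 0.03 × 14 min = 0.42 min gives roughly 28,800 users per variant. A stdlib-only check:

```python
from statistics import NormalDist

def sample_size_per_arm(baseline_mean, sd, rel_lift, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-sample z-test with equal allocation:
    n = 2 * (z_{1-alpha/2} + z_{1-beta})^2 * sd^2 / delta^2."""
    delta = baseline_mean * rel_lift  # absolute lift, same units as the mean
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2

n = sample_size_per_arm(baseline_mean=14, sd=18, rel_lift=0.03)
print(round(n))  # roughly 28,800 users per variant
```

With SD larger than the mean (18 vs. 14 min), the distribution is clearly right-skewed, which is exactly why the heavy-tail handling in (b) matters for the realized variance.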

(d) Specify guardrails (e.g., crash rate, time‑to‑first‑frame, data usage) and stopping rules. Describe how to handle novelty effects, weekday/seasonality, and experiment mis‑randomization checks (e.g., A/A, covariate balance tests).
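A standard mis-randomization check is a sample-ratio-mismatch (SRM) test: a chi-square goodness-of-fit test of observed bucket counts against the planned split. A stdlib sketch (the counts below are made up for illustration):

```python
import math
from statistics import NormalDist

def srm_check(n_control, n_treatment, expected_ratio=0.5):
    """Chi-square goodness-of-fit test (1 df) of observed bucket counts
    against the planned split; a small p-value flags mis-randomization."""
    total = n_control + n_treatment
    exp_c = total * expected_ratio
    exp_t = total * (1 - expected_ratio)
    stat = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # chi2(1) tail probability via the normal distribution: P(X > s) = 2*(1 - Phi(sqrt(s)))
    p_value = 2 * (1 - NormalDist().cdf(math.sqrt(stat)))
    return stat, p_value

stat, p = srm_check(50_400, 49_600)  # a 0.8 pp imbalance on ~100k users
print(round(p, 4))
```

Even this small imbalance yields p ≈ 0.01 at this scale, which is why SRM checks are run continuously: tiny systematic skews in bucketing are detectable long before the effect estimate can be trusted.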

(e) If the feature was partially rolled out by region before the test, propose a difference‑in‑differences or CUPED/regression‑adjusted analysis. State key identifying assumptions and how you would validate them.
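A minimal sketch of the CUPED adjustment. The pre-exposure covariate here is simulated; for 7-day-old accounts there is little pre-period watch time, so in practice a pre-exposure covariate such as first-session latency, device class, or acquisition channel may have to stand in:

```python
import random

def cuped_adjust(y, x):
    """CUPED: Y_adj = Y - theta * (X - mean(X)), with theta = cov(X, Y) / var(X).
    X must predict Y but be measured before (and unaffected by) treatment."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - mx) ** 2 for xi in x) / n
    theta = cov / var_x
    return [yi - theta * (xi - mx) for xi, yi in zip(x, y)]

def variance(v):
    m = sum(v) / len(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)

random.seed(1)
x = [random.gauss(14, 6) for _ in range(5_000)]    # simulated pre-exposure covariate
y = [0.6 * xi + random.gauss(5, 4) for xi in x]    # outcome correlated with covariate
y_adj = cuped_adjust(y, x)
print(variance(y_adj) < variance(y))  # True: variance explained by x is removed
```

The adjustment leaves the mean (and hence the treatment-effect estimate) unchanged while shrinking variance, which is what buys the extra power; the difference-in-differences alternative additionally requires the parallel-trends assumption across regions, which should be checked on pre-rollout data.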

