On 2025-08-25, Hulu shipped a new homepage ranking model to 100% of iOS traffic. Between 2025-08-25 and 2025-08-31, average watch-time per session among new users on iOS fell by 2.7%, while Android was flat. A major content premiere occurred on 2025-08-29, and there was a 20-minute payments outage on 2025-08-27 affecting subscription starts. No holdout was configured. Within 48 hours, recommend whether to roll back, continue, or mitigate. Answer the following: - Define the primary decision metric(s) and guardrails (e.g., session starts, error rate, churn proxy). Set concrete decision thresholds. - Propose an identification strategy without a built-in holdout: (1) a difference-in-differences using Android as a control with a pre-period 2025-08-18–2025-08-24; (2) a geo-based synthetic control within iOS. Specify assumptions (parallel trends, no interference), diagnostics, and falsification tests. - Adjust for seasonality, the 2025-08-29 premiere, and the 2025-08-27 outage. What fixed effects or controls do you include? How do you handle heterogeneity by country, platform version, and new vs returning users? - Quantify impact with confidence intervals and provide a back-of-the-envelope weekly revenue effect (state assumptions for ad- and subscription-revenue paths). - Outline a fast remediation plan (e.g., revert, throttle, feature flags, guardrail monitors), and a forward plan for a proper experiment design in September (holdouts, pre-registration, CUPED, exposure logging).

This question evaluates a data scientist's skills in causal inference, experiment analytics, metric definition and guardrails, confounder adjustment, and rapid decision-making under time pressure.

How do I approach Analytics & Experimentation interview questions?

Analytics & Experimentation questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master analytics & experimentation interviews.

What difficulty level is this interview question?

This is a hard difficulty Analytics & Experimentation question, commonly asked during HR Screen rounds at Disney.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Disney during technical interviews.

Diagnose and decide on watch-time drop | Disney Interview Question

Hulu iOS Homepage Ranking Model Rollout: Causal Read and Decision Within 48 Hours

Context

On 2025-08-25, a new homepage ranking model was shipped to 100% of iOS traffic.
Observation (2025-08-25 to 2025-08-31): average watch-time per session among iOS new users declined by 2.7%; Android was flat.
Confounders:
- 2025-08-29: major content premiere.
- 2025-08-27: 20-minute payments outage affecting subscription starts.
No built-in holdout.

You have 48 hours to recommend: roll back, continue, or mitigate.

Tasks

Define the primary decision metric(s) and guardrails (e.g., session starts, error rate, churn proxy) and set concrete decision thresholds.
Propose an identification strategy without a holdout:
- (a) Difference-in-differences using Android as control with pre-period 2025-08-18–2025-08-24.
- (b) Geo-based synthetic control within iOS.
- State assumptions (parallel trends, no interference), diagnostics, and falsification tests.
Adjust for seasonality, the 2025-08-29 premiere, and the 2025-08-27 outage. Specify fixed effects/controls. Address heterogeneity by country, platform version, and new vs returning users.
Quantify impact with confidence intervals and provide a back-of-the-envelope weekly revenue effect (state assumptions for ad- and subscription-revenue paths).
Outline a fast remediation plan (e.g., revert, throttle, feature flags, guardrail monitors), and a forward plan for a proper September experiment (holdouts, pre-registration, CUPED, exposure logging).

Context

On 2025-08-25, a new homepage ranking model was shipped to 100% of iOS traffic.

Observation (2025-08-25 to 2025-08-31): average watch-time per session among iOS new users declined by 2.7%; Android was flat.

Confounders:

2025-08-29: major content premiere.
2025-08-27: 20-minute payments outage affecting subscription starts.

No built-in holdout.

You have 48 hours to recommend: roll back, continue, or mitigate.

Tasks

Define the primary decision metric(s) and guardrails (e.g., session starts, error rate, churn proxy) and set concrete decision thresholds.

Propose an identification strategy without a holdout:

(a) Difference-in-differences using Android as control with pre-period 2025-08-18–2025-08-24.
(b) Geo-based synthetic control within iOS.
State assumptions (parallel trends, no interference), diagnostics, and falsification tests.

Adjust for seasonality, the 2025-08-29 premiere, and the 2025-08-27 outage. Specify fixed effects/controls. Address heterogeneity by country, platform version, and new vs returning users.

Quantify impact with confidence intervals and provide a back-of-the-envelope weekly revenue effect (state assumptions for ad- and subscription-revenue paths).

Outline a fast remediation plan (e.g., revert, throttle, feature flags, guardrail monitors), and a forward plan for a proper September experiment (holdouts, pre-registration, CUPED, exposure logging).

Diagnose and decide on watch-time drop

Quick Overview

Diagnose and decide on watch-time drop

Hulu iOS Homepage Ranking Model Rollout: Causal Read and Decision Within 48 Hours

Context

Tasks

Write your answer

Diagnose and decide on watch-time drop

Quick Overview

Diagnose and decide on watch-time drop

Hulu iOS Homepage Ranking Model Rollout: Causal Read and Decision Within 48 Hours

Context

Tasks

Write your answer