This question evaluates causal inference and experimental-analysis skills. It asks you to estimate the average treatment effect (ATE) of personalization on minutes streamed, report a 95% confidence interval, and reason about the use of pre-treatment covariates. It sits in the Analytics & Experimentation domain for a Data Scientist role and is commonly asked to assess both the practical application of randomized-experiment analysis and statistical inference and the conceptual understanding of causal assumptions and covariate adjustment.
You are given a user-level dataset from an online experiment that randomized users to personalization (treatment) vs. no personalization (control).
Assume one row per user with the following columns:
- user_id (string/int)
- treat (0/1): randomized assignment to personalization
- minutes_streamed (float): total minutes streamed during the 7-day post-assignment window
- country, device_type, tenure_days, prior_7d_minutes, is_premium, etc.
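Task:
Estimate the ATE of personalization on minutes_streamed, report a 95% confidence interval, and explain whether and how you would use the pre-treatment covariates.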
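One possible starting point for the estimate is a plain difference in means with a normal-approximation interval. The sketch below assumes the minutes_streamed values have already been split by treat into two lists; the function name diff_in_means_ate and the toy numbers are hypothetical and only illustrate the call. It uses the unequal-variance (Welch/Neyman) standard error, which is a reasonable default under randomization but not the only valid choice.

```python
import math
from statistics import NormalDist, mean, variance


def diff_in_means_ate(treated, control, alpha=0.05):
    """Difference-in-means ATE estimate with a normal-approximation CI.

    `treated` and `control` hold minutes_streamed for users with
    treat == 1 and treat == 0. Unequal variances are allowed.
    """
    n1, n0 = len(treated), len(control)
    ate = mean(treated) - mean(control)
    # Sample variances (n - 1 denominator) divided by group sizes
    # add up to the squared standard error of the difference.
    se = math.sqrt(variance(treated) / n1 + variance(control) / n0)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return ate, se, (ate - z * se, ate + z * se)


# Toy, made-up numbers purely to show the call; not real experiment data.
treated = [120.0, 95.5, 130.0, 88.0, 142.5, 101.0]
control = [100.0, 90.0, 110.5, 85.0, 97.0, 93.5]
ate, se, (lo, hi) = diff_in_means_ate(treated, control)
print(f"ATE = {ate:.2f}, SE = {se:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

With only a handful of users per arm, a t-based interval would be more appropriate than the normal quantile; for a typical online experiment with many users the two are nearly identical.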
Assumptions:
Regard treat as the only post-treatment variable; all other covariates are pre-treatment.
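Because the remaining covariates are pre-treatment, adjusting for them cannot bias a randomized comparison in large samples and may tighten the interval. A minimal sketch of one such adjustment, a CUPED-style correction using a single covariate such as prior_7d_minutes, is shown here. The function name cuped_ate and the toy data are hypothetical, and the standard error treats the estimated slope as fixed, which is a large-sample approximation.

```python
import math
from statistics import NormalDist, mean, variance


def cuped_ate(y, x, treat, alpha=0.05):
    """CUPED-style adjusted ATE using one pre-treatment covariate.

    y: outcomes (minutes_streamed), x: pre-treatment covariate
    (e.g. prior_7d_minutes), treat: 0/1 assignment; equal lengths.
    """
    x_bar, y_bar = mean(x), mean(y)
    # theta is the slope from regressing y on x, pooled over both arms,
    # as in the usual CUPED recipe.
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    theta = cov_xy / var_x
    # Remove the part of the outcome that the covariate predicts.
    y_adj = [yi - theta * (xi - x_bar) for yi, xi in zip(y, x)]

    t = [v for v, d in zip(y_adj, treat) if d == 1]
    c = [v for v, d in zip(y_adj, treat) if d == 0]
    ate = mean(t) - mean(c)
    # Approximate SE: ignores the sampling noise in theta.
    se = math.sqrt(variance(t) / len(t) + variance(c) / len(c))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ate, se, (ate - z * se, ate + z * se)


# Toy, made-up data purely to show the call; not real experiment data.
x = [50, 80, 120, 60, 150, 90, 110, 70]      # stand-in for prior_7d_minutes
treat = [1, 0, 1, 0, 1, 0, 1, 0]
y = [62, 79, 130, 61, 158, 91, 119, 70]      # stand-in for minutes_streamed
adj_ate, adj_se, (adj_lo, adj_hi) = cuped_ate(y, x, treat)
print(f"adjusted ATE = {adj_ate:.2f}, SE = {adj_se:.2f}, "
      f"95% CI = ({adj_lo:.2f}, {adj_hi:.2f})")
```

Only pre-treatment columns belong in x; adjusting for anything measured after assignment could reintroduce bias. A regression of minutes_streamed on treat, centered covariates, and their interactions with robust standard errors is a common alternative when several covariates are used at once.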