This question evaluates causal inference and observational analytics skills—specifically defining an estimand, designing primary, diagnostic, and guardrail metrics, assessing confounding and biases, and communicating uncertainty—within the Analytics & Experimentation domain for a Data Scientist role.
A product team believes a new feature (or a variable you can influence, e.g., enabling notifications, new feed ranking, new UI) changes user time spent in the app.
You have observational + rollout data at the user-day level:
- user_id (string/int)
- date (date, in UTC)
- time_spent_min (float; total minutes spent that day)
- exposed (0/1; whether the user had the feature on that day)
- rollout_group (string; e.g., region / platform / bucket used for rollout)
- country, platform, account_age_days, prior_7d_time_spent, prior_7d_sessions, etc.
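For concreteness, here is a minimal sketch of how this user-day panel might look as a pandas DataFrame. The column names and types follow the list above; the choice of pandas and the toy values are assumptions for illustration only.

```python
# Illustrative user-day panel; values are made up, only the column
# names/types follow the schema described above.
import pandas as pd

panel = pd.DataFrame(
    {
        "user_id": [101, 101, 202],
        "date": pd.to_datetime(["2024-05-01", "2024-05-02", "2024-05-01"], utc=True),
        "time_spent_min": [34.5, 41.0, 12.2],
        "exposed": [0, 1, 0],
        "rollout_group": ["us_ios_b1", "us_ios_b1", "de_android_b3"],
        "country": ["US", "US", "DE"],
        "platform": ["ios", "ios", "android"],
        "account_age_days": [120, 121, 45],
        "prior_7d_time_spent": [210.0, 230.5, 80.0],
        "prior_7d_sessions": [14, 15, 6],
    }
)
print(panel.dtypes)
```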
Assume exposure was not purely random (e.g., phased rollout, targeting, or user self-selection), so confounding is a concern.
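To make the confounding concern concrete, here is a hedged sketch of one common diagnostic: standardized mean differences of pre-exposure covariates between exposed and unexposed user-days. It assumes the `panel` DataFrame from the sketch above; the 0.1 threshold mentioned in the comment is a conventional rule of thumb, not a requirement of the question.

```python
# Balance diagnostic: standardized mean difference (SMD) of baseline
# covariates between exposed and unexposed user-days. |SMD| well above
# ~0.1 suggests the groups differ at baseline, so a naive comparison of
# time_spent_min would mix the feature's effect with selection.
import numpy as np

def standardized_mean_diff(df, covariate, treatment_col="exposed"):
    treated = df.loc[df[treatment_col] == 1, covariate]
    control = df.loc[df[treatment_col] == 0, covariate]
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
    return (treated.mean() - control.mean()) / pooled_sd

for cov in ["prior_7d_time_spent", "prior_7d_sessions", "account_age_days"]:
    print(cov, round(standardized_mean_diff(panel, cov), 3))
```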
Your task: estimate the causal effect of exposed on time_spent_min.
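As one illustrative baseline (an assumption about approach, not the expected answer), a covariate-adjusted OLS of time_spent_min on exposed with user-clustered standard errors could anchor the discussion; it assumes the `panel` DataFrame from the sketch above and adjusts only for observed covariates, so unobserved confounding and self-selection still need to be addressed separately.

```python
# One possible adjusted estimate: OLS of daily time spent on exposure,
# controlling for observed pre-exposure covariates; standard errors are
# clustered by user because each user contributes many user-days.
# This removes bias only from the covariates included in the model.
import statsmodels.formula.api as smf

model = smf.ols(
    "time_spent_min ~ exposed + prior_7d_time_spent + prior_7d_sessions"
    " + account_age_days + C(country) + C(platform)",
    data=panel,
).fit(cov_type="cluster", cov_kwds={"groups": panel["user_id"]})

print(model.params["exposed"])           # adjusted difference in minutes per day
print(model.conf_int().loc["exposed"])   # interval estimate to communicate uncertainty
```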