This question evaluates experimental design, statistical inference, causal estimands, subgroup analysis, data quality checks, and stakeholder-facing communication in the Analytics & Experimentation domain, targeting an applied, intermediate-to-senior data scientist.
You are given an offline, take-home-style project before an onsite interview. You must analyze an A/B test and present your findings in slides.
Assume you receive a user-level dataset experiment_events with:
user_id (string)
variant (string; 'control' or 'treatment')
assignment_ts (timestamp, UTC)
country (string)
platform (string; 'ios', 'android', 'web')
exposed (boolean; whether the user actually saw the new experience)
orders_7d (int; orders within 7 days after assignment)
revenue_7d (float; revenue within 7 days after assignment)
support_tickets_7d (int)
is_new_user (boolean)
You also have pre-period covariates in user_pre_period:
user_id
orders_28d_pre (int)
revenue_28d_pre (float)
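As a minimal sketch of how the two tables fit together, the snippet below left-joins user_pre_period onto experiment_events by user_id and runs two basic data quality checks (duplicate users, unexpected variant labels). The function name join_pre_period, the has_pre_period flag, and the choice to default missing pre-period values to zero are assumptions for illustration, not part of the prompt; the sample rows are made up.

```python
from collections import Counter

VALID_VARIANTS = {"control", "treatment"}


def join_pre_period(experiment_rows, pre_rows):
    """Left-join pre-period covariates onto experiment rows by user_id.

    Raises ValueError if a user appears more than once in the experiment
    table or if a variant label is not 'control' or 'treatment'.
    """
    counts = Counter(r["user_id"] for r in experiment_rows)
    dupes = sorted(u for u, n in counts.items() if n > 1)
    if dupes:
        raise ValueError(f"duplicate user_id in experiment_events: {dupes}")

    bad = sorted({r["variant"] for r in experiment_rows} - VALID_VARIANTS)
    if bad:
        raise ValueError(f"unexpected variant labels: {bad}")

    pre_by_user = {r["user_id"]: r for r in pre_rows}
    joined = []
    for r in experiment_rows:
        pre = pre_by_user.get(r["user_id"])
        joined.append({
            **r,
            # Assumption: users with no pre-period row had no prior activity.
            "orders_28d_pre": pre["orders_28d_pre"] if pre else 0,
            "revenue_28d_pre": pre["revenue_28d_pre"] if pre else 0.0,
            "has_pre_period": pre is not None,
        })
    return joined


# Made-up rows, only to show the shape of the inputs and output.
example_events = [
    {"user_id": "u1", "variant": "control", "is_new_user": False},
    {"user_id": "u2", "variant": "treatment", "is_new_user": True},
]
example_pre = [
    {"user_id": "u1", "orders_28d_pre": 3, "revenue_28d_pre": 41.5},
]
joined = join_pre_period(example_events, example_pre)
```

Keeping an explicit has_pre_period flag, rather than silently imputing zeros, lets a later analysis distinguish users with no history from users with genuinely zero prior orders.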
Tasks:
exposed = false for some assigned users) and which estimand you’d report (ITT vs TOT).
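To make the ITT vs TOT distinction concrete, here is a hedged sketch in plain Python. ITT compares users by assigned variant regardless of exposure. The TOT figure here is the standard Wald (instrumental-variable) ratio: the ITT effect divided by the difference in exposure rates between arms. That ratio is only interpretable as an effect on exposed users under additional assumptions, chiefly that assignment affects outcomes only through exposure. The function name itt_and_tot and the sample rows are illustrative assumptions, not part of the prompt.

```python
from statistics import mean


def itt_and_tot(rows, metric="revenue_7d"):
    """Return (itt, exposure_gap, tot) for a user-level experiment table.

    itt          : difference in mean metric, treatment minus control,
                   grouping by assigned variant only.
    exposure_gap : exposure rate in treatment minus exposure rate in control.
    tot          : itt / exposure_gap (Wald estimator), or None when the
                   exposure gap is not positive.
    """
    treat = [r for r in rows if r["variant"] == "treatment"]
    ctrl = [r for r in rows if r["variant"] == "control"]

    itt = mean(r[metric] for r in treat) - mean(r[metric] for r in ctrl)

    exposure_gap = (
        mean(1.0 if r["exposed"] else 0.0 for r in treat)
        - mean(1.0 if r["exposed"] else 0.0 for r in ctrl)
    )
    tot = itt / exposure_gap if exposure_gap > 0 else None
    return itt, exposure_gap, tot


# Made-up rows: half of the treatment arm was never exposed.
example_rows = [
    {"variant": "treatment", "exposed": True, "revenue_7d": 12.0},
    {"variant": "treatment", "exposed": True, "revenue_7d": 8.0},
    {"variant": "treatment", "exposed": False, "revenue_7d": 4.0},
    {"variant": "treatment", "exposed": False, "revenue_7d": 4.0},
    {"variant": "control", "exposed": False, "revenue_7d": 5.0},
    {"variant": "control", "exposed": False, "revenue_7d": 5.0},
    {"variant": "control", "exposed": False, "revenue_7d": 3.0},
    {"variant": "control", "exposed": False, "revenue_7d": 3.0},
]
itt, exposure_gap, tot = itt_and_tot(example_rows)
```

Note that the sketch never compares exposed treatment users directly against control: exposure happens after assignment, so filtering on it would break randomization. Scaling the ITT by the exposure gap keeps the comparison anchored to the randomized groups.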