How to estimate feature impact on usage time
Company: Roblox
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: easy
Interview Round: Technical Screen
## Problem
A product team believes a new feature (or a variable you can influence, e.g., enabling notifications, new feed ranking, new UI) changes **user time spent** in the app.
You have observational and rollout data at the user-day level (one row per user per day):
- `user_id` (string/int)
- `date` (date, in UTC)
- `time_spent_min` (float; total minutes spent that day)
- `exposed` (0/1; whether the user had the feature on that day)
- `rollout_group` (string; e.g., region / platform / bucket used for rollout)
- Optional covariates: `country`, `platform`, `account_age_days`, `prior_7d_time_spent`, `prior_7d_sessions`, etc.
Assume exposure was **not purely random** (e.g., phased rollout, targeting, or user self-selection), so confounding is a concern.
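One way to see why confounding matters is a standard decomposition of the naive comparison. The potential-outcomes notation here is an assumption for illustration, not part of the problem statement: $Y(1)$ and $Y(0)$ are a user-day's `time_spent_min` with and without the feature, and $D$ is `exposed`.

$$
E[Y \mid D=1] - E[Y \mid D=0]
= \underbrace{E[Y(1) - Y(0) \mid D=1]}_{\text{effect on the exposed (ATT)}}
+ \underbrace{E[Y(0) \mid D=1] - E[Y(0) \mid D=0]}_{\text{selection bias}}
$$

If exposed users would have spent more time even without the feature (for example, because heavy users opt in first), the second term is positive and the naive difference overstates the effect.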
## Tasks
1. Define the causal question precisely (estimand) and propose **primary, diagnostic, and guardrail metrics**.
2. Propose an analysis approach to estimate the causal effect of `exposed` on `time_spent_min`.
   - If you choose **Difference-in-Differences (DiD)**, specify the control group, the model, and how you would validate assumptions.
3. List major confounders/biases you’d worry about and how you would mitigate them.
4. Describe robustness checks and how you would communicate results (including uncertainty) to stakeholders.
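As a starting point for Task 2, the simplest DiD setup can be sketched as a 2x2 comparison of cell means. This is a minimal illustration, not a full solution: the rows are made-up values, the `treated`/`control` and `pre`/`post` labels are hypothetical encodings of `rollout_group` and `date`, and treatment is assigned by rollout group (an intention-to-treat comparison) so that per-day self-selection into `exposed` is not used directly.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical user-day rows: (user_id, rollout_group, period, time_spent_min).
# "treated" marks rollout groups that received the feature; "post" marks days
# on or after the launch date. All values are made up for illustration.
rows = [
    ("u1", "treated", "pre", 30.0), ("u1", "treated", "post", 38.0),
    ("u2", "treated", "pre", 20.0), ("u2", "treated", "post", 26.0),
    ("u3", "control", "pre", 25.0), ("u3", "control", "post", 27.0),
    ("u4", "control", "pre", 15.0), ("u4", "control", "post", 17.0),
]


def did_estimate(rows):
    """Return the 2x2 difference-in-differences of mean time_spent_min."""
    cells = defaultdict(list)
    for _user, group, period, minutes in rows:
        cells[(group, period)].append(minutes)
    # Change over time within each group.
    treated_change = mean(cells[("treated", "post")]) - mean(cells[("treated", "pre")])
    control_change = mean(cells[("control", "post")]) - mean(cells[("control", "pre")])
    # The control group's change stands in for the treated group's
    # counterfactual trend (the parallel-trends assumption).
    return treated_change - control_change


effect = did_estimate(rows)
print(f"DiD estimate: {effect:.1f} minutes per user-day")
# prints "DiD estimate: 5.0 minutes per user-day"
```

In practice the same estimate would typically come from a regression of `time_spent_min` on group, period, and their interaction, with standard errors clustered by user (or by rollout group), and the parallel-trends assumption would be checked against pre-launch data before trusting the number.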
Quick Answer: This question evaluates causal inference and observational analytics skills within the Analytics & Experimentation domain for a Data Scientist role: defining an estimand; designing primary, diagnostic, and guardrail metrics; assessing confounding and biases; and communicating uncertainty.