
Design causal study for reminder impact

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in observational causal inference and program evaluation, covering concepts such as staggered rollouts, treatment definition, difference-in-differences diagnostics, matching/weighting, negative controls, interference handling, and power/MDE planning.


Company: Amazon

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: HR Screen



Posted: Oct 13, 2025, 9:49 PM

Observational Causal Study: Reminder Program With Staggered Market × Channel Launch

Context

You are evaluating the causal impact of medication-subscription reminders on two outcomes:

  • User experience (CSAT on a 1–5 scale)
  • 4-week retention

The reminder product launched at different times across markets and channels (push/email/SMS). You cannot randomize who receives reminders. Users can opt out of certain channels (e.g., push). Adoption may spill over within households. Design an observational causal analysis that leverages staggered rollouts.

Tasks

Answer precisely:

  1. Identification strategy and model
    • State your primary identification strategy (e.g., staggered-adoption DID with event study).
    • Write the regression(s) you would estimate, including fixed effects, time trends/seasonality, and how you will cluster standard errors.
  2. Treatment definition and risk sets
    • Define treatment and risk sets to avoid immortal-time bias.
    • Explain how you handle not-yet-treated users.
  3. Diagnostics and modern DID
    • Specify pre-trend diagnostics.
    • Explain how you would detect and mitigate treatment-effect heterogeneity bias in TWFE.
    • Choose a modern DID estimator (e.g., Sun–Abraham or Callaway–Sant’Anna) and justify.
  4. Matching/weighting backup plan
    • Propose a backup plan using matching or weighting (e.g., PSM or overlap weighting).
    • List covariates needed, caliper/ratio, and balance metrics/thresholds.
  5. Negative controls and falsification
    • Propose two negative controls (one outcome, one exposure) and one falsification test.
  6. Threats and remedies
    • Address household spillovers/interference, missing CSAT, and channel selection (users can opt out of push).
  7. Power/MDE outline
    • Provide a minimal power/MDE calculation outline with assumptions on baseline variance, intra-household correlation, and expected adoption rate.
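As a starting point for Task 1, here is a minimal sketch of a staggered-adoption event-study regression on a simulated user-week panel. All data is generated for illustration, and the column names (`csat`, `cohort_week`, `rel_bin`, etc.) are assumptions, not part of the original prompt:

```python
# Staggered-adoption event study via TWFE OLS on a simulated user-week panel.
# Assumptions: markets adopt in different weeks, one market is never treated,
# event time is binned at +/-4, rel_time = -1 is the omitted reference period,
# and SEs are clustered at the market (launch-unit) level.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_users, n_weeks = 200, 12
df = pd.DataFrame(
    [(u, w) for u in range(n_users) for w in range(n_weeks)],
    columns=["user_id", "week"],
)
df["market"] = df["user_id"] % 5
adopt = {0: 4, 1: 6, 2: 8, 3: 10, 4: np.nan}   # market 4 never treated
df["cohort_week"] = df["market"].map(adopt)
df["rel_time"] = df["week"] - df["cohort_week"]
df["csat"] = (
    3.5 + 0.3 * (df["rel_time"] >= 0) + rng.normal(0, 0.5, len(df))
).clip(1, 5)

# Bin event time; never-treated rows sit in the omitted reference bin (-1),
# so they contribute only to the fixed effects.
df["rel_bin"] = df["rel_time"].clip(-4, 4)
df.loc[df["cohort_week"].isna(), "rel_bin"] = -1
df["rel_bin"] = df["rel_bin"].astype(int)

model = smf.ols(
    "csat ~ C(rel_bin, Treatment(reference=-1)) + C(user_id) + C(week)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["market"]})
# Lead coefficients (rel_bin < -1) are the pre-trend check; lags (>= 0)
# trace the dynamic effect. With this few clusters, a wild-cluster
# bootstrap would be safer than analytic cluster-robust SEs.
```

In practice (and per Task 3), you would likely replace the raw TWFE dummies with a heterogeneity-robust estimator such as Callaway–Sant'Anna (e.g., R's `did` package), since TWFE can place negative weights on cohort-time effects under staggered adoption.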
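For the Task 4 fallback, one way to operationalize the balance check: fit a propensity model, apply overlap weights, and compare standardized mean differences (SMDs) before and after weighting. The covariates and data below are simulated placeholders, not the real feature set:

```python
# Overlap-weighting balance diagnostic on simulated confounded data.
# Assumptions: treatment (receiving reminders) depends on tenure and order
# frequency; covariate names are illustrative only.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5_000
df = pd.DataFrame({
    "tenure_months": rng.gamma(2.0, 12.0, n),
    "orders_90d": rng.poisson(3, n),
    "prior_csat": rng.normal(3.8, 0.6, n),
})
logit = -2 + 0.02 * df["tenure_months"] + 0.15 * df["orders_90d"]
df["treated"] = rng.random(n) < 1 / (1 + np.exp(-logit))

X = df[["tenure_months", "orders_90d", "prior_csat"]]
# Large C ~ unpenalized fit, so overlap weights exactly balance these means.
ps = LogisticRegression(C=1e6, max_iter=1000).fit(X, df["treated"]).predict_proba(X)[:, 1]
# Overlap weights: treated units get 1 - e(x), controls get e(x).
df["w"] = np.where(df["treated"], 1 - ps, ps)

def smd(col, weights=None):
    """Standardized mean difference between treated and control."""
    w = np.ones(len(df)) if weights is None else weights
    t, c = df["treated"], ~df["treated"]
    mt = np.average(df.loc[t, col], weights=w[t])
    mc = np.average(df.loc[c, col], weights=w[c])
    pooled_sd = np.sqrt((df.loc[t, col].var() + df.loc[c, col].var()) / 2)
    return (mt - mc) / pooled_sd

for col in X.columns:
    print(f"{col}: SMD before={smd(col):+.3f}, "
          f"after={smd(col, df['w'].to_numpy()):+.3f}")
# A common acceptance threshold: |SMD| < 0.10 on every covariate after weighting.
```

Overlap weighting is attractive here because it avoids the caliper/ratio tuning of PSM and downweights users with near-0 or near-1 propensity, where channel opt-out makes comparisons least credible.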
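For Task 7, a back-of-envelope MDE outline: a two-sample comparison of mean CSAT, with a design effect absorbing intra-household correlation and the adoption rate diluting the intent-to-treat contrast. Every numeric input below is an assumed placeholder you would replace with historical estimates:

```python
# Minimal power/MDE outline with a clustering design effect.
# Assumed inputs: baseline CSAT SD, intra-household correlation (ICC),
# average household size, adoption rate, and users per arm.
from statistics import NormalDist

alpha, power = 0.05, 0.80
sd = 1.1            # assumed baseline SD of CSAT (1-5 scale)
icc = 0.15          # assumed intra-household correlation
m = 2.2             # assumed average household size
adoption = 0.4      # expected share of eligible users actually reminded
n_users = 50_000    # users per arm before dilution

deff = 1 + (m - 1) * icc                  # design effect for household clustering
n_eff = n_users / deff                    # effective sample size per arm
z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
mde_itt = z * sd * (2 / n_eff) ** 0.5     # MDE on the intent-to-treat contrast
mde_att = mde_itt / adoption              # implied effect among actually-reminded users
print(f"effective n/arm: {n_eff:,.0f}")
print(f"MDE (ITT): {mde_itt:.4f} CSAT points; treated-user scale: {mde_att:.4f}")
```

The 4-week retention outcome would follow the same template with a binomial variance `p(1-p)` in place of `sd**2`.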

