
Apply Double ML with text-address features

Last updated: Mar 29, 2026

Quick Overview

This question tests causal inference with Double Machine Learning (DML): estimating an average treatment effect from observational data when some covariates are derived from address text. It covers feature representation and validation, nuisance estimation, overlap diagnostics, sensitivity analysis, subgroup effect reporting, and geographic privacy and fairness concerns. Interviewers use it to gauge both conceptual understanding (orthogonalization, sample-splitting, identification assumptions) and practical skill (selecting and validating text and geospatial features, checking overlap/positivity, running sensitivity analyses, and controlling for multiple comparisons).



Company: Amazon

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: HR Screen

Estimate the ATE of receiving a first reminder on CSAT using Double Machine Learning (DML), incorporating text from user addresses. Specify: 1) Outcome Y, treatment T, and feature set X, including how you represent addresses (e.g., geocoding + neighborhood attributes, and an address text embedding); 2) The orthogonalized moment you will use and how you implement sample-splitting and K-fold cross-fitting; 3) Choices of nuisance learners for E[Y|X] and E[T|X] (e.g., gradient boosting for Y, calibrated logistic for T), hyperparameters, and how you prevent leakage from post-treatment variables; 4) Diagnostics for overlap/positivity and how you would trim or reweight; 5) How you would test sensitivity to unobserved confounding (e.g., Oster δ or partial R²), and report subgroup effects (device, channel) while controlling FDR; 6) How you would validate text features (ablation tests, SHAP consistency across folds) and mitigate geographic privacy/fairness risks (e.g., excluding protected proxies, coarse geohashes).


Oct 13, 2025, 9:49 PM

Estimate the ATE of a First Reminder on CSAT via Double Machine Learning (DML)

Context

You have observational data on customer satisfaction (CSAT) surveys. Some customers received a first reminder to complete the survey; others did not. You want the Average Treatment Effect (ATE) of receiving the first reminder on CSAT, using Double Machine Learning (DML). Addresses are available as text and can be used to construct geospatial features and embeddings.
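One way to turn address text into numeric features is a character n-gram hashing step followed by a low-rank projection. The sketch below is illustrative only: the example addresses are made up, the scikit-learn pipeline is a stand-in for whatever embedding model is actually available, and the n-gram range and dimensions are arbitrary choices rather than recommendations.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical addresses; real inputs would come from the survey records.
addresses = [
    "12 Main St Apt 4, Springfield, IL 62701",
    "98 Oak Avenue, Springfield, IL 62704",
    "5 Elm Road Unit B, Portland, OR 97201",
    "77 Pine Street, Portland, OR 97205",
    "301 Lake Drive, Austin, TX 78701",
    "44 River Lane, Austin, TX 78702",
]

# Character n-grams tolerate abbreviations and typos ("St" vs "Street");
# hashing keeps the feature space fixed without storing a vocabulary.
embedder = make_pipeline(
    HashingVectorizer(
        analyzer="char_wb",
        ngram_range=(2, 4),
        n_features=2**12,
        alternate_sign=False,
    ),
    # Project the sparse hashed counts down to a small dense embedding.
    TruncatedSVD(n_components=3, random_state=0),
)

address_emb = embedder.fit_transform(addresses)  # shape: (n_addresses, 3)
```

The projection here is unsupervised, so it uses neither T nor Y. Geocoded neighborhood attributes would be joined as additional columns alongside `address_emb`.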

Assume: (a) treatment assignment is not randomized but depends on observed pre-treatment covariates; (b) CSAT is measured on a numeric scale (e.g., 1–5) for completed surveys; (c) all features used are measured or defined prior to the reminder decision.
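In potential-outcomes notation (the symbols Y(0) and Y(1) are introduced here for illustration and do not appear in the question), the target and the two identification conditions implied by assumption (a) can be written as:

```latex
\begin{align*}
\text{ATE} &= \mathbb{E}\left[\,Y(1) - Y(0)\,\right] \\
\text{Unconfoundedness:}\quad & \bigl(Y(0),\, Y(1)\bigr) \perp\!\!\!\perp T \mid X \\
\text{Overlap:}\quad & 0 < \Pr(T = 1 \mid X) < 1
\end{align*}
```

Assumption (c) supports unconfoundedness by keeping post-treatment variables out of X; overlap is what the positivity diagnostics in the requirements are meant to check.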

Requirements

  1. Define outcome Y, treatment T, and feature set X, including how address text is represented (e.g., geocoding + neighborhood attributes, an address text embedding).
  2. Write the orthogonalized moment you will use for DML and describe sample-splitting and K-fold cross-fitting.
  3. Choose nuisance learners for E[Y|X] and E[T|X], with hyperparameters, and explain how you prevent leakage from post-treatment variables.
  4. Provide diagnostics for overlap/positivity and how you would trim or reweight.
  5. Describe sensitivity analyses for unobserved confounding (e.g., Oster δ or partial R²) and how you will report subgroup effects (device, channel) while controlling FDR.
  6. Explain how you will validate text features (ablation tests, SHAP stability across folds) and mitigate geographic privacy/fairness risks (e.g., excluding protected proxies, coarse geohashes).
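Requirements 2 and 3 can be sketched in a few lines. The example below is a minimal illustration, not a reference solution: it uses the partialling-out score ψ = (Y − l(X) − θ(T − m(X)))(T − m(X)) with l(X) = E[Y|X] and m(X) = E[T|X], fits both nuisances with K-fold cross-fitting, and runs on synthetic data. The function name `dml_plr`, the simulated data, and the default learner settings are all assumptions made for illustration. Under heterogeneous effects this score targets a variance-weighted average effect; a doubly robust (AIPW) score would target the ATE directly.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold


def dml_plr(X, T, Y, n_splits=5, seed=0):
    """Cross-fitted partialling-out estimate of theta with a plug-in SE."""
    n = len(Y)
    l_hat = np.zeros(n)  # out-of-fold predictions of E[Y | X]
    m_hat = np.zeros(n)  # out-of-fold predictions of E[T | X]
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train, test in folds.split(X):
        # Nuisances are fit on the training folds only, then used to
        # predict on the held-out fold (cross-fitting).
        outcome_model = GradientBoostingRegressor(random_state=seed)
        l_hat[test] = outcome_model.fit(X[train], Y[train]).predict(X[test])
        propensity_model = LogisticRegression(max_iter=1000)
        propensity_model.fit(X[train], T[train])
        m_hat[test] = propensity_model.predict_proba(X[test])[:, 1]

    y_res = Y - l_hat
    t_res = T - m_hat
    # Solve the empirical moment condition mean(psi) = 0 for theta.
    theta = np.sum(t_res * y_res) / np.sum(t_res**2)
    # Plug-in standard error from the score and its derivative.
    psi = (y_res - theta * t_res) * t_res
    se = np.sqrt(np.mean(psi**2) / n) / np.mean(t_res**2)
    return theta, se, m_hat


# Synthetic data (assumption): a constant treatment effect of 0.5.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
propensity = 1.0 / (1.0 + np.exp(-0.5 * X[:, 0]))
T = rng.binomial(1, propensity)
Y = 0.5 * T + X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(size=n)

theta, se, m_hat = dml_plr(X, T, Y)
print(f"theta = {theta:.3f}, se = {se:.3f}")
print(f"propensity range = [{m_hat.min():.3f}, {m_hat.max():.3f}]")
```

The returned `m_hat` gives the out-of-fold propensity scores, so the same call feeds the overlap check in requirement 4: values near 0 or 1 indicate regions where trimming or reweighting would be considered.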

