PracHub

Select the better $5 promo-targeting model

Last updated: Mar 29, 2026

Quick Overview

This question evaluates proficiency in policy evaluation, off-policy estimation, probability calibration, leakage detection, handling of delayed labels, and production monitoring for budget-constrained coupon targeting. It belongs to the Machine Learning / Data Science domain.

  • hard
  • Uber
  • Machine Learning
  • Data Scientist


Company: Uber

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen



Oct 13, 2025, 9:49 PM

Coupon Targeting Under a Daily Budget: Policy, OPE, Calibration, and Monitoring

Context

  • You have two user-scoring models for a $5 coupon: M0 (current) and M1 (new). Each outputs a score p_i that should be interpreted as P(redeem | send, user i).
  • You may send at most K promos per day and must ensure expected coupon spend ≤ B dollars/day. The coupon costs $5 only when redeemed.
  • Historical data comes from a randomized experiment (logging policy) with columns: {user_id, features X, assigned_treatment W ∈ {1=coupon,0=control}, outcome redeem Y ∈ {0,1}, gmv G, timestamp t}.

Tasks

(a) Define a success metric aligned to profit and explain why AUC/accuracy can be misleading for targeting under a budget.
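As a quick illustration of why a high-AUC redemption model can pick the wrong top-K set, here is a minimal sketch with synthetic numbers (all values invented for illustration). Following the context, the $5 is only paid on redemption, so the cost term below is 5·P(redeem | send); the flat −$5 in (a) is the conservative worst case.

```python
import numpy as np

# Synthetic users: redemption probability if sent, if not sent, and expected GMV.
p_treat = np.array([0.90, 0.60, 0.30, 0.20])   # P(redeem | send)
p_ctrl  = np.array([0.85, 0.10, 0.05, 0.02])   # P(redeem | no send)
gmv     = np.array([20.0, 40.0, 60.0, 80.0])   # expected order value if redeemed

uplift = p_treat - p_ctrl
# Expected incremental profit per sent user: uplift * GMV minus expected coupon cost.
profit = uplift * gmv - 5.0 * p_treat

rank_by_p      = np.argsort(-p_treat)   # what an AUC-style ranking rewards
rank_by_profit = np.argsort(-profit)    # what the business objective rewards
# User 0 has the highest P(redeem) but negative incremental profit
# (a "sure thing" who would have ordered anyway), so the two rankings disagree.
```

Ranking by P(redeem) puts user 0 first even though sending them a coupon loses money, which is exactly the failure mode AUC on redemption cannot see.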

(b) Using the randomized dataset, derive off-policy estimators to compare the two deterministic policies induced by M0 and M1 (each implements a daily top-K rule under budget B): inverse propensity scoring (IPS), self-normalized IPS (SNIPS), and doubly robust (DR). Write formulas, state assumptions for unbiasedness, and discuss variance trade-offs and cross-fitting.
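A minimal numpy sketch of the three estimators, assuming a known logging propensity e (e.g., 0.5 in a 50/50 experiment) and a deterministic target policy; `Y_profit` and `q_pi` are placeholder names for the observed reward and an outcome-model prediction.

```python
import numpy as np

def ope_estimates(W, Y_profit, pi, e, q_pi):
    """Off-policy value of a deterministic policy `pi` from randomized logs.

    W        : logged treatment (0/1), randomized with known propensity
    Y_profit : observed reward (e.g., gmv*redeem - 5*redeem for sent users)
    pi       : target policy's action per user (0/1), e.g., the top-K rule
    e        : P(W=1 | X) under the logging policy (known by design)
    q_pi     : outcome-model prediction of reward under action pi(X)
    """
    p_logged = np.where(W == 1, e, 1.0 - e)        # prob. of the logged action
    match = (W == pi).astype(float)                # did logging agree with pi?
    w = match / p_logged                           # importance weights

    ips   = np.mean(w * Y_profit)                  # unbiased, high variance
    snips = np.sum(w * Y_profit) / np.sum(w)       # self-normalized, lower variance
    dr    = np.mean(q_pi + w * (Y_profit - q_pi))  # doubly robust
    return ips, snips, dr
```

IPS is unbiased when the logged propensities are correct and every action pi takes has positive logging probability (overlap); SNIPS trades a small bias for variance reduction; DR stays consistent if either the propensities or the outcome model are correct, and cross-fitting `q_pi` on held-out folds avoids overfitting bias.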

(c) Describe how to calibrate probabilities (e.g., isotonic/Platt), set a daily threshold to respect budget B under drift, and directly optimize expected profit subject to guardrails (e.g., opt-out rate, complaint rate).
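A minimal sketch of both pieces: isotonic calibration via pool-adjacent-violators (applied to labels sorted by model score), and a daily budget rule that sends to the largest top-K prefix whose expected coupon spend fits under B. Function names are illustrative, not from any library.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: nondecreasing fit to labels sorted by score.
    The fitted value per block is the calibrated P(redeem) for that score range."""
    vals, cnts = [], []
    for v in y:
        vals.append(float(v)); cnts.append(1)
        # merge backwards while monotonicity is violated
        while len(vals) > 1 and vals[-2] > vals[-1]:
            tot = cnts[-2] + cnts[-1]
            merged = (vals[-2] * cnts[-2] + vals[-1] * cnts[-1]) / tot
            vals[-2:] = [merged]; cnts[-2:] = [tot]
    return np.concatenate([np.full(c, v) for v, c in zip(vals, cnts)])

def pick_top_k(p_cal, budget, coupon=5.0):
    """Largest K (sending highest-p users first) with expected spend <= budget.
    Recomputed daily, so the implied threshold adapts as the score mix drifts."""
    order = np.argsort(-p_cal)
    exp_cost = np.cumsum(coupon * p_cal[order])    # expected $ if top-k are sent
    k = int(np.searchsorted(exp_cost, budget, side="right"))
    return order[:k]
```

Because the rule is "expected spend ≤ B" rather than a fixed score cutoff, the effective threshold moves automatically when calibration or the user mix shifts; guardrails (opt-out, complaints) would be enforced as additional hard constraints on the eligible set before ranking.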

(d) List three concrete leakage risks (e.g., features reflecting prior coupon exposure, post-treatment variables, future-engagement proxies) and how to detect/prevent them.
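One concrete detection trick, sketched below: in a properly randomized experiment, no pre-treatment feature should predict the treatment assignment, so a per-feature AUC against the assignment flag far from 0.5 flags a post-assignment (leaky) feature. This is an illustrative check, not a complete leakage audit.

```python
import numpy as np

def assignment_auc(feature, treated):
    """Rank-based AUC of one feature for predicting treatment assignment.
    Under clean randomization this should be ~0.5; values far from 0.5
    suggest the feature was computed after (or from) the assignment."""
    pos = feature[treated == 1]
    neg = feature[treated == 0]
    # P(random treated unit outranks random control unit), ties count half
    gt = (pos[:, None] > neg[None, :]).mean()
    eq = (pos[:, None] == neg[None, :]).mean()
    return gt + 0.5 * eq
```

Pairing this probe with a feature-timestamp audit (every feature's computation time strictly before the assignment time) covers the prior-exposure and post-treatment cases mechanically.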

(e) Explain handling delayed redemption labels and per-user redemption caps in both training and evaluation to avoid bias.
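The two standard remedies can be sketched in a few lines, under the assumption of a fixed redemption horizon and an estimated delay distribution (the `delay_cdf` array below is hypothetical): either train only on fully matured cohorts, or down-weight/correct not-yet-mature cohorts by their estimated label completeness.

```python
import numpy as np

def mature_mask(send_day, now_day, horizon=14):
    """Keep only cohorts whose full redemption window has elapsed,
    so observed redeem=0 genuinely means non-redemption."""
    return send_day + horizon <= now_day

def completeness_weight(age_days, delay_cdf):
    """For not-yet-mature cohorts: estimated fraction of eventual
    redemptions already observable after `age_days`, from a
    (hypothetical) empirical delay CDF indexed by day."""
    return delay_cdf[np.minimum(age_days, len(delay_cdf) - 1)]
```

Per-user redemption caps need the symmetric treatment: a capped-out user's redeem=0 is censored, not a true negative, so those rows are excluded (or flagged) in both training and off-policy evaluation.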

(f) Outline a monitoring plan for non-stationarity and cold-start users, including shadow deployment and canarying.
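One common drift alarm for such a monitoring plan is the population stability index (PSI) between a reference score distribution and today's scores; a minimal numpy sketch, using the conventional rule of thumb that PSI > 0.2 warrants investigation:

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population stability index between a reference score sample and
    a fresh sample; 0 means identical distributions."""
    # quantile bin edges from the reference sample
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e = np.clip(e, 1e-6, None)   # avoid log(0) on empty bins
    a = np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))
```

In practice this would run daily on scores (and key features), alongside shadow-mode scoring of M1, a small canary cohort with guardrail metrics, and a fallback-to-M0 trigger; cold-start users get monitored as their own segment since their score distribution differs by construction.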

