Coupon Targeting Under a Daily Budget: Policy, OPE, Calibration, and Monitoring
Context
- You have two user-scoring models for a $5 coupon: M0 (current) and M1 (new). Each outputs a score p_i that should be interpreted as P(redeem | send, user i).
- You may send at most K promos per day and must ensure expected coupon spend ≤ B dollars/day. The $5 coupon cost is incurred only when a coupon is redeemed.
- Historical data comes from a randomized experiment (the logging policy) with columns {user_id, features X, assigned_treatment W ∈ {1=coupon, 0=control}, outcome redeem Y ∈ {0,1}, gmv G, timestamp t}.
Tasks
(a) Define a success metric aligned to profit and explain why AUC/accuracy can be misleading for targeting under a budget.
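One way to make the AUC point concrete: under a budget, only the top-K ordering matters, and the right ordering is by expected profit, not by raw score. A minimal sketch, assuming a hypothetical per-user contribution margin `margins[i]` per redemption (not in the original schema), with the $5 cost paid only on redemption:

```python
def expected_profit_per_send(p, margin, coupon_cost=5.0):
    """Expected profit of sending: P(redeem) * (margin - coupon cost),
    since both revenue and the $5 cost occur only on redemption."""
    return p * (margin - coupon_cost)

def top_k_by_expected_profit(scores, margins, k, coupon_cost=5.0):
    """Rank users by expected profit, not raw redemption score.
    A model with higher AUC on redemption can still pick a worse
    top-K when margins vary across users."""
    profit = [expected_profit_per_send(p, m, coupon_cost)
              for p, m in zip(scores, margins)]
    order = sorted(range(len(scores)), key=lambda i: profit[i], reverse=True)
    return order[:k]
```

With `scores=[0.9, 0.5]` and `margins=[6, 30]`, the raw-score ranking would send to user 0, but the expected-profit ranking prefers user 1.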
(b) Using the randomized dataset, derive off-policy estimators to compare the two deterministic policies induced by M0 and M1 (each implements a daily top-K rule under budget B): inverse propensity scoring (IPS), self-normalized IPS (SNIPS), and doubly robust (DR). Write formulas, state assumptions for unbiasedness, and discuss variance trade-offs and cross-fitting.
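The three estimators can be sketched as follows for a deterministic target policy π(x) ∈ {0,1}, logged propensity e = P(W=1|X) known by design, and reward r (e.g., realized profit). The `q0`/`q1` arguments to the DR estimator are outcome-model predictions for actions 0 and 1, which should be cross-fitted (out-of-fold) to keep the estimator's guarantees; a sketch under those assumptions, not a full pipeline:

```python
import numpy as np

def ips(pi, w, r, e):
    """IPS: weight each logged row by 1{W = pi(X)} / P(logged action)."""
    pi, w, r = np.asarray(pi), np.asarray(w), np.asarray(r, float)
    p_logged = np.where(w == 1, e, 1 - e)   # propensity of the logged action
    wts = (pi == w) / p_logged              # importance weights
    return float(np.mean(wts * r))

def snips(pi, w, r, e):
    """Self-normalized IPS: divide by the sum of weights instead of n.
    Slightly biased, usually much lower variance."""
    pi, w, r = np.asarray(pi), np.asarray(w), np.asarray(r, float)
    p_logged = np.where(w == 1, e, 1 - e)
    wts = (pi == w) / p_logged
    return float(np.sum(wts * r) / np.sum(wts))

def dr(pi, w, r, e, q0, q1):
    """Doubly robust: outcome-model baseline plus an IPS correction on
    the residual. Unbiased if either the propensity or the outcome model
    is correct."""
    pi, w, r = np.asarray(pi), np.asarray(w), np.asarray(r, float)
    q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
    p_logged = np.where(w == 1, e, 1 - e)
    wts = (pi == w) / p_logged
    q_pi = np.where(pi == 1, q1, q0)        # model value under target action
    q_w = np.where(w == 1, q1, q0)          # model value under logged action
    return float(np.mean(q_pi + wts * (r - q_w)))
```

With a randomized logging policy the propensities are known exactly, so IPS is unbiased; SNIPS and DR trade a little bias for variance reduction, which matters because a deterministic top-K policy only matches the logged action on a fraction of rows.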
(c) Describe how to calibrate probabilities (e.g., isotonic/Platt), set a daily threshold to respect budget B under drift, and directly optimize expected profit subject to guardrails (e.g., opt-out rate, complaint rate).
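Once scores are calibrated (so that p really is P(redeem | send)), the budget constraint becomes enforceable: expected spend of a send set S is $5 · Σ_{i∈S} p_i. A greedy selection sketch, assuming calibrated inputs (this is a heuristic scan, not an optimal knapsack solution):

```python
def select_sends(p_calibrated, k, budget, coupon_cost=5.0):
    """Scan users in descending calibrated-score order; take a user if
    the K-slot cap is not hit and adding their expected cost
    (coupon_cost * p) keeps expected spend within the daily budget."""
    order = sorted(range(len(p_calibrated)),
                   key=lambda i: p_calibrated[i], reverse=True)
    chosen, expected_spend = [], 0.0
    for i in order:
        if len(chosen) >= k:
            break
        cost_i = coupon_cost * p_calibrated[i]
        if expected_spend + cost_i > budget:
            continue  # a lower-score (cheaper) user later may still fit
        chosen.append(i)
        expected_spend += cost_i
    return chosen, expected_spend
```

Under drift, the threshold implied by this selection moves day to day, which is why recalibrating on recent mature data and re-deriving the cutoff daily is safer than freezing a score threshold.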
(d) List three concrete leakage risks (e.g., features reflecting prior coupon exposure, post-treatment variables, future-engagement proxies) and how to detect/prevent them.
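One cheap detection check: in a properly randomized log, no pre-treatment feature should predict the assigned treatment. A per-feature rank AUC against W near 0.5 is expected; values far from 0.5 flag features that encode assignment or post-treatment information. A minimal sketch:

```python
def auc_feature_vs_treatment(x, w):
    """Rank-based AUC of one feature for predicting the treatment flag.
    Under clean randomization this should be ~0.5 for every
    pre-treatment feature; large deviations suggest leakage."""
    pos = [xi for xi, wi in zip(x, w) if wi == 1]
    neg = [xi for xi, wi in zip(x, w) if wi == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The same idea scales up by training a classifier to predict W from all features and checking its AUC against 0.5.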
(e) Explain handling delayed redemption labels and per-user redemption caps in both training and evaluation to avoid bias.
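The core mechanics for delayed labels: only rows whose full attribution window has elapsed are eligible for training and evaluation, otherwise "not yet redeemed" gets mislabeled as "never redeemed" and recent cohorts look artificially bad. A sketch assuming each row carries a `timestamp` datetime (hypothetical row schema):

```python
from datetime import datetime, timedelta

def mature_rows(rows, attribution_days, now):
    """Keep only rows old enough that a redemption, if it were going to
    happen, would already have been observed. Prevents censoring bias
    from treating pending redemptions as negatives."""
    cutoff = now - timedelta(days=attribution_days)
    return [r for r in rows if r["timestamp"] <= cutoff]
```

Per-user caps get the analogous treatment: a user at their redemption cap cannot redeem regardless of the send, so such rows should be excluded (or modeled explicitly) in both training and policy-value estimation.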
(f) Outline a monitoring plan for non-stationarity and cold-start users, including shadow deployment and canarying.
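A common drift alarm for the monitoring plan is the Population Stability Index over pre-agreed score bins, comparing today's score distribution to the baseline used at calibration time. A minimal sketch; the 0.1/0.25 thresholds are the usual rules of thumb, not statistical tests:

```python
import math

def psi(expected_fracs, actual_fracs, eps=1e-6):
    """Population Stability Index between a baseline bin distribution
    and today's. Heuristic reading: < 0.1 stable, 0.1-0.25 watch,
    > 0.25 investigate (e.g., recalibrate or fall back to M0)."""
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected_fracs, actual_fracs))
```

In a shadow deployment, M1 is scored on live traffic without acting; canarying then routes a small randomized slice to M1, which also keeps propensities logged for future off-policy comparisons.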