You have historical campaign logs with randomized holdouts from last season. Design a treatment effect modeling approach to decide whom to contact by SMS or Email for the upcoming flu-shot campaign.

Data available
- Features: demographics, past visits, prior vaccinations, engagement (opens/clicks), distance to store, appointment history.
- Labels: y = vaccinated within 30 days; treatments T ∈ {control, SMS, Email}; randomized assignment with known probabilities; exposure indicators (delivered/opened).
- Costs: c_SMS = $0.02, c_Email = $0.001; budget allows contacting at most 40% of eligibles.

Tasks
1) Modeling - Choose and justify an approach: separate response models + two-model uplift, direct uplift (e.g., meta-learners: T-learner/S-learner/DR-learner), or multiclass treatment modeling. Address leakage (post-treatment features), class imbalance, and calibration.
2) Evaluation - Define offline evaluation: uplift/Qini curves, AUUC; compute incremental ROI considering channel costs; use policy evaluation with inverse propensity weighting (IPW) or doubly robust estimators.
3) Policy - Given a budget contacting up to 40% of eligibles, describe how to rank customers by predicted incremental effect and choose the channel per customer (e.g., argmax over channel-specific uplift minus cost). Explain guardrails (do-not-contact, fairness across age/state).
4) Online validation - Propose a gated rollout test comparing model-based targeting vs uniform random targeting. Define success metrics and stopping rules.
5) Diagnostics - Show how you would detect harmful persuasion (negative uplift) segments and handle them in targeting.

This question tests a data scientist's ability to model treatment effects and uplift for a multi-arm marketing intervention. It covers causal inference with randomized holdouts and propensity scoring, cost-sensitive channel selection, calibration and class-imbalance handling, policy evaluation, and diagnostic and operational safeguards within the Machine Learning domain. It is commonly asked because it probes the practical application of causal ML and decision-policy design: the candidate must select a model and evaluate it offline and online under budget and fairness constraints. It is primarily an applied task with important conceptual causal-inference elements.

What difficulty level is this interview question?

This is a hard difficulty Machine Learning question, commonly asked during Technical Screen rounds at CVS Health.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at CVS Health during technical interviews.

Build an uplift model for targeting | CVS Health Interview Question

Flu-shot Campaign: Treatment-Effect Modeling and Targeting Policy

You have historical campaign logs from last season that include randomized holdouts. You must design a treatment-effect modeling and targeting approach to decide whether to contact a customer by SMS or Email for the upcoming flu-shot campaign.

Data Available

Features (pre-treatment only for modeling): demographics, past visits, prior vaccinations, engagement history (prior opens/clicks), distance to store, appointment history.
Labels: y = 1 if vaccinated within 30 days; 0 otherwise.
Treatments: T ∈ {control, SMS, Email}, assigned at random with known propensities p_t.
Exposure indicators (post-assignment): delivery status, opened. Use for diagnostics/mediation only (avoid leakage in ITT models).
Costs: c_SMS = $0.02, c_Email = $0.001.
Operational constraint: may contact at most 40% of eligibles.

Tasks

Modeling
- Choose and justify an approach among: separate response models + two-model uplift, direct uplift/meta-learners (T-/S-/DR-learner), or multiclass treatment modeling.
- Address leakage (post-treatment features), class imbalance, and probability calibration.
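
As an illustration of the T-learner option and the leakage point, here is a minimal sketch, not a reference solution. It fits one outcome model per arm and takes the difference from control as the uplift. The base learner is deliberately trivial (a per-segment vaccination rate shrunk toward the arm mean) so the example stays dependency-free; a real answer would likely swap in gradient-boosted trees plus a held-out calibration step. The row layout, the field names (`t`, `y`, `delivered`, `opened`), and the helpers `pre_treatment_only`, `fit_arm_model`, and `fit_t_learner` are all hypothetical.

```python
from collections import defaultdict

# Fields observed after assignment: the outcome and the exposure indicators.
# They must never feed the features of an intent-to-treat model.
POST_TREATMENT = {"y", "delivered", "opened"}

def pre_treatment_only(row):
    """Leakage guard: drop post-treatment fields before deriving features."""
    return {k: v for k, v in row.items() if k not in POST_TREATMENT}

def fit_arm_model(rows, segment_of, prior_strength=20.0):
    """Outcome model for one arm: per-segment rate shrunk toward the arm mean."""
    base = sum(r["y"] for r in rows) / len(rows)
    stats = defaultdict(lambda: [0, 0])  # segment -> [vaccinated, total]
    for r in rows:
        s = stats[segment_of(pre_treatment_only(r))]
        s[0] += r["y"]
        s[1] += 1

    def predict(row):
        pos, cnt = stats.get(segment_of(pre_treatment_only(row)), (0, 0))
        denom = cnt + prior_strength
        # Unseen segment with no prior: fall back to the arm base rate.
        return base if denom == 0 else (pos + prior_strength * base) / denom

    return predict

def fit_t_learner(rows, segment_of, prior_strength=20.0):
    """T-learner: one outcome model per arm; uplift = arm prediction - control."""
    arms = sorted({r["t"] for r in rows})
    models = {
        a: fit_arm_model([r for r in rows if r["t"] == a], segment_of, prior_strength)
        for a in arms
    }

    def uplift(row):
        mu0 = models["control"](row)
        return {a: models[a](row) - mu0 for a in arms if a != "control"}

    return uplift
```

The shrinkage term is one simple way to cope with a rare positive class in small segments: sparse cells are pulled toward the arm's base rate instead of swinging to 0 or 1. It does not replace a proper calibration check.
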
Evaluation
- Define offline evaluation: uplift/Qini curves and AUUC; compute incremental ROI including channel costs.
- Use policy evaluation with inverse propensity weighting (IPW) or doubly-robust (DR) estimators.
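
For the IPW policy evaluation mentioned above, the following sketch estimates the per-customer net value of a candidate policy from the randomized logs. It relies on the known assignment propensities p_t stated in the problem. The dollar value of one incremental vaccination (`value_per_vax`) is not given in the problem and is an assumption, as are the log layout and the function name `ipw_policy_value`.

```python
def ipw_policy_value(logs, policy, propensity, cost, value_per_vax):
    """Horvitz-Thompson (IPW) estimate of a policy's mean net value per customer.

    logs:       list of dicts with keys "x" (features), "t" (logged arm), "y" (0/1)
    policy:     function mapping features to an arm name
    propensity: dict arm -> known randomization probability
    cost:       dict arm -> cost per contact (missing arms, e.g. control, cost 0)
    """
    total = 0.0
    for r in logs:
        a = policy(r["x"])
        # Only rows where the logged arm matches the policy's choice contribute,
        # reweighted by 1 / P(T = a) to stand in for the non-matching rows.
        if r["t"] == a:
            total += (value_per_vax * r["y"] - cost.get(a, 0.0)) / propensity[a]
    return total / len(logs)
```

Subtracting the same estimate for a "contact nobody" policy gives an incremental net value, which can be divided by total contact cost to report incremental ROI. A doubly robust variant would add an outcome-model term to reduce variance; that is not shown here.
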
Policy
- With the 40% contact budget, describe how to rank customers by predicted incremental effect and choose the channel per customer (e.g., argmax of channel-specific uplift minus cost scaled by value).
- Explain guardrails (do-not-contact lists, fairness across age/state, frequency caps).
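
The ranking and channel-choice step can be sketched as follows. For each customer it picks the channel with the highest net value (uplift times the value of a vaccination, minus the channel cost), skips do-not-contact customers and anyone whose best net value is not positive, then keeps the top 40% of eligibles. The costs come from the problem statement; `value_per_vax`, the customer dict layout, and the function name `choose_contacts` are assumptions. Fairness checks and frequency caps are not implemented here.

```python
COSTS = {"sms": 0.02, "email": 0.001}  # per-contact costs from the problem statement

def choose_contacts(customers, value_per_vax, budget_pct=40):
    """Return {customer_id: channel} for at most budget_pct% of eligibles.

    customers: list of dicts with "id", "uplift" ({"sms": float, "email": float}),
               and an optional "do_not_contact" flag.
    """
    limit = len(customers) * budget_pct // 100  # integer math avoids float rounding
    candidates = []
    for c in customers:
        if c.get("do_not_contact"):
            continue  # hard guardrail: never contact, whatever the score
        channel, net = max(
            ((a, value_per_vax * c["uplift"][a] - COSTS[a]) for a in COSTS),
            key=lambda pair: pair[1],
        )
        if net > 0:  # negative or zero net value: contacting is not worth it
            candidates.append((net, c["id"], channel))
    candidates.sort(key=lambda item: item[0], reverse=True)
    return {cid: channel for _, cid, channel in candidates[:limit]}
```

Because the selection stops at positive net value, the policy may use less than the 40% budget; the cap is an upper bound, not a target.
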
Online Validation
- Propose a gated rollout comparing model-based targeting vs uniform random targeting (both constrained to 40% contact rate).
- Define success metrics and stopping rules.
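
One possible stopping rule, sketched under assumptions: compare the 30-day vaccination rate of the model-targeted arm against the random-targeted arm with a two-proportion z-test at a fixed number of planned interim looks, splitting the error budget evenly across looks (a Bonferroni correction, which is conservative compared with group-sequential boundaries). The number of looks, the alpha level, and the names `two_proportion_z` and `interim_decision` are illustrative choices, not part of the question.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x_a, n_a, x_b, n_b):
    """Pooled z-statistic for the difference in rates x_a/n_a - x_b/n_b."""
    pooled = (x_a + x_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return 0.0 if se == 0 else (x_a / n_a - x_b / n_b) / se

def interim_decision(x_model, n_model, x_random, n_random,
                     planned_looks=4, alpha=0.05):
    """Return "ship", "rollback", or "continue" at one interim look."""
    z = two_proportion_z(x_model, n_model, x_random, n_random)
    # Two-sided alpha split evenly across the planned looks.
    z_crit = NormalDist().inv_cdf(1 - alpha / (2 * planned_looks))
    if z >= z_crit:
        return "ship"      # model-based targeting is significantly better
    if z <= -z_crit:
        return "rollback"  # model-based targeting is significantly worse
    return "continue"
```

Both arms contact 40% of their eligibles, so a difference in vaccination rate reflects who was contacted and by which channel, not how many.
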
Diagnostics
- Describe how to detect and mitigate harmful persuasion (negative uplift) segments, and how you would handle them in targeting.
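
A simple way to look for harmful-persuasion segments is to compute the raw treated-minus-control difference per pre-treatment segment from the randomized logs, with a normal-approximation confidence interval, and flag segments whose whole interval lies below zero. The sketch below does this for one channel at a time. The row layout, the segment field, and the names `segment_uplift` and `harmful_segments` are hypothetical. When many segments are scanned, the interval width should be adjusted for multiple comparisons, which this sketch does not do.

```python
from collections import defaultdict
from math import sqrt

def segment_uplift(rows, arm, seg_key, z=1.96):
    """Per-segment uplift of `arm` vs control as (diff, ci_low, ci_high)."""
    counts = defaultdict(lambda: {"control": [0, 0], arm: [0, 0]})
    for r in rows:
        if r["t"] in ("control", arm):
            cell = counts[r[seg_key]][r["t"]]  # [vaccinated, total]
            cell[0] += r["y"]
            cell[1] += 1
    report = {}
    for seg, c in counts.items():
        (y0, n0), (y1, n1) = c["control"], c[arm]
        if n0 == 0 or n1 == 0:
            continue  # cannot estimate uplift without both arms present
        p0, p1 = y0 / n0, y1 / n1
        se = sqrt(p0 * (1 - p0) / n0 + p1 * (1 - p1) / n1)
        diff = p1 - p0
        report[seg] = (diff, diff - z * se, diff + z * se)
    return report

def harmful_segments(report):
    """Segments whose entire confidence interval is below zero."""
    return sorted(seg for seg, (_, _, hi) in report.items() if hi < 0)
```

Flagged segments can then be added to a suppression list for that channel, so the targeting step never selects them regardless of the model's score.
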
