Predicting 7-Day Purchase After Email Send
Context
You are given a CSV where each row is a user–email send (or scheduled send/control), with columns:
- user_id, send_ts, treatment_flag (1=sent, 0=control/holdout)
- opened, clicked, purchased_within_7d (label)
- user_region, device_type, tenure_days
- prior_sessions_28d, prior_purchases_180d, avg_cart_value_180d, categories_viewed_28d
- email_personalization_score, deliverability_score
- plus other anonymized features
Goal: Build a model that, at send time, predicts whether a user will purchase within 7 days of the email send, then decide whom to send to in order to maximize incremental revenue.
Assume purchased_within_7d is defined as a purchase occurring within [send_ts, send_ts + 7d). For control rows (treatment_flag=0), the window is anchored on the scheduled send_ts.
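Under this definition the label can be derived with a window join; a minimal sketch in pandas, assuming a hypothetical `purchases` table with `user_id` and `purchase_ts` columns (the table and its column names are not part of the provided CSV):

```python
import pandas as pd

# Hypothetical inputs: a sends table (one row per user-email send) and a raw
# purchases table. `purchase_ts` is an assumed column name.
sends = pd.DataFrame({
    "user_id": [1, 1, 2],
    "send_ts": pd.to_datetime(["2025-06-01", "2025-06-20", "2025-06-05"]),
})
purchases = pd.DataFrame({
    "user_id": [1, 2],
    "purchase_ts": pd.to_datetime(["2025-06-03", "2025-07-01"]),
})

# Join every send to the user's purchases, then test the [send_ts, send_ts + 7d) window.
joined = sends.merge(purchases, on="user_id", how="left")
in_window = (joined["purchase_ts"] >= joined["send_ts"]) & (
    joined["purchase_ts"] < joined["send_ts"] + pd.Timedelta(days=7)
)

# Aggregate back to one label per (user_id, send_ts) send event.
label = (
    in_window.groupby([joined["user_id"], joined["send_ts"]])
    .any()
    .astype(int)
    .rename("purchased_within_7d")
    .reset_index()
)
sends = sends.merge(label, on=["user_id", "send_ts"], how="left")
```

For control rows the same join applies unchanged, since the window is anchored on the scheduled send_ts.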
Tasks
- EDA and Leakage Control
  - Identify potential target leakage and ensure all features are computable at send_ts. Avoid using post-send behaviors (e.g., opened, clicked) unless modeling a multi-stage chain with predicted intermediates.
  - Explore class imbalance, missingness, outliers, and high-cardinality categoricals. Propose concrete data quality checks.
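These checks can be collected in a small helper; a sketch assuming the column names from the task description (the specific invariants checked are illustrative, not exhaustive):

```python
import pandas as pd

def data_quality_report(df: pd.DataFrame, label: str = "purchased_within_7d") -> dict:
    """Summarize class imbalance, missingness, and categorical cardinality."""
    report = {
        "positive_rate": df[label].mean(),           # class imbalance
        "missing_frac": df.isna().mean().to_dict(),  # missingness per column
        "cardinality": {                             # high-cardinality categoricals
            c: df[c].nunique()
            for c in df.select_dtypes(include=["object", "category"]).columns
        },
    }
    # Hard invariants that should hold for send-time data.
    assert df[label].isin([0, 1]).all(), "label not binary"
    assert df["treatment_flag"].isin([0, 1]).all(), "treatment_flag not in {0, 1}"
    assert (df["tenure_days"] >= 0).all(), "negative tenure_days"
    return report
```

A leakage check worth adding on top: recompute each prior_* feature from raw event logs restricted to timestamps strictly before send_ts and compare against the provided columns.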
- Modeling
  - Train two models to predict purchased_within_7d at send time: a baseline logistic regression and a gradient-boosted tree.
  - Use time-based splits: train = 2025-06, valid = 2025-07, test = 2025-08.
  - Within the train month, use time-aware cross-validation to tune hyperparameters. Apply monotonic constraints and/or regularization where appropriate.
- Evaluation
  - Report ROC AUC, PR AUC, calibration (reliability curve, Brier score), and incremental lift at the top 10% of scored users versus control.
  - Provide 95% confidence intervals via bootstrap and assess stability by user_region and device_type.
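A sketch of the metric computation with percentile-bootstrap confidence intervals, using scikit-learn metrics (the number of resamples is illustrative):

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import average_precision_score, brier_score_loss, roc_auc_score

def evaluate(y_true, y_prob, n_boot: int = 1000, seed: int = 0):
    """Point metrics, 95% bootstrap CIs, and a reliability curve."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    metrics = {
        "roc_auc": roc_auc_score,
        "pr_auc": average_precision_score,
        "brier": brier_score_loss,
    }
    point = {name: fn(y_true, y_prob) for name, fn in metrics.items()}

    # Percentile bootstrap: resample rows with replacement, recompute metrics.
    rng = np.random.default_rng(seed)
    boot = {name: [] for name in metrics}
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        yt, yp = y_true[idx], y_prob[idx]
        if yt.min() == yt.max():
            continue  # single-class resample: AUCs undefined
        for name, fn in metrics.items():
            boot[name].append(fn(yt, yp))
    ci95 = {name: tuple(np.percentile(v, [2.5, 97.5])) for name, v in boot.items()}

    # Reliability curve: fraction of positives vs. mean predicted prob per bin.
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=10)
    return point, ci95, (frac_pos, mean_pred)
```

The top-decile incremental lift is computed separately: within the top 10% of scores, compare observed purchase rates between treated and control rows. Subgroup stability follows by running the same evaluation per user_region and device_type slice.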
- Deployment
  - Choose a decision rule that maximizes expected incremental revenue, given a per-send email cost of $0.003 and a treatment effect estimated from a calibration experiment.
  - Describe monitoring (data drift, performance drift, alerting), retraining cadence, and next steps (feature engineering, causal uplift modeling, de-biasing via IPS/DR estimators).
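The decision rule reduces to: send iff estimated uplift times expected order value exceeds the per-send cost. A minimal sketch, where the control-side probability and the average order value are assumed inputs coming from the calibration experiment:

```python
import numpy as np

EMAIL_COST = 0.003  # per-send cost from the task description

def should_send(p_treated, p_control, avg_order_value: float):
    """Send iff expected incremental revenue exceeds the per-send cost.

    p_treated: predicted P(purchase within 7d | email sent)
    p_control: predicted P(purchase within 7d | no email), which in practice
        would be estimated from the randomized holdout (an assumption here).
    """
    uplift = np.asarray(p_treated) - np.asarray(p_control)
    return uplift * avg_order_value > EMAIL_COST
```

For example, with a $50 average order value, a 0.001 probability uplift is worth $0.05 and clears the $0.003 cost, while an uplift of 1e-5 (worth $0.0005) does not. This is the hand-off point to proper uplift modeling: once a two-model or direct uplift estimator replaces p_treated minus p_control, the same threshold rule applies unchanged.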