Before global launch, you want to predict which users or products would benefit most from the 'More like this' button so you can stage rollout.
Design an end-to-end modeling approach using only pre-launch data from the interactions/products schema and any new logging you can add at launch:
- Labeling: propose a proxy label available pre-launch (e.g., user propensity to explore similar items via existing flows) and a post-launch true label (uplift in exploration rate or interaction_count per user). Explain how you will avoid target leakage, especially from features derived too close to the labeling window.
- Features: list concrete user, product, and interaction features you’ll engineer (e.g., recency/frequency of category interactions, dwell-time proxies via interaction_count sequences, country, seller diversity). Include cross-features (user×category affinity) and cold-start strategies.
- Models: choose two candidate model families (e.g., calibrated gradient-boosted trees and a sparse logistic regression). Explain when a deep model would be justified (sample size, feature types) and how you’d ensure calibration for decisioning.
- Evaluation: define offline splits (time-based, user-level holdout), metrics (AUC/PR for ranking; Qini/uplift AUC if modeling treatment uplift), and how you’ll run prospective shadow testing once telemetry exists.
- Decisioning: specify a thresholding or budgeted targeting policy and show how you’d simulate business impact under deployment constraints. Address fairness across countries and guard against negative spillovers (e.g., reduced variety). Include a plan for periodic retraining and drift monitoring.
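
To make the Evaluation and Decisioning items concrete, here is a minimal Python sketch of a Qini curve and a simple top-fraction budgeted policy. It assumes a randomized post-launch holdout with a binary `treated` flag, a per-user `outcome` (for example, an exploration indicator), and model `scores` for predicted uplift. The function names `qini_curve` and `budgeted_targets` are hypothetical and not part of the interactions/products schema.

```python
import numpy as np

def qini_curve(scores, treated, outcome):
    """Cumulative incremental outcomes when targeting users in descending score order.

    Assumes `treated` is a 0/1 flag from a randomized holdout (an assumption,
    since the schema does not define one).
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    t = np.asarray(treated, dtype=float)[order]
    y = np.asarray(outcome, dtype=float)[order]
    n_t = np.cumsum(t)              # treated users seen so far
    n_c = np.cumsum(1.0 - t)        # control users seen so far
    y_t = np.cumsum(y * t)          # treated outcomes so far
    y_c = np.cumsum(y * (1.0 - t))  # control outcomes so far
    # Rescale control outcomes to the treated group size; use 0 until a control user appears.
    scale = np.divide(n_t, n_c, out=np.zeros_like(n_t), where=n_c > 0)
    return y_t - y_c * scale

def budgeted_targets(scores, budget_fraction):
    """Indices of the top `budget_fraction` of users by predicted uplift."""
    scores = np.asarray(scores, dtype=float)
    k = int(np.floor(budget_fraction * len(scores)))
    return np.argsort(-scores, kind="stable")[:k]

# Toy data: four users, two treated and two control.
scores = [0.9, 0.8, 0.2, 0.1]
treated = [1, 0, 1, 0]
outcome = [1, 0, 0, 1]
curve = qini_curve(scores, treated, outcome)
chosen = budgeted_targets(scores, budget_fraction=0.5)
```

The last point of the curve equals the overall incremental outcome, so a curve that rises early and then flattens or falls suggests the model ranks the most responsive users first. Fairness checks across countries could reuse the same functions on per-country slices, subject to enough treated and control users in each slice.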