Expedia Hotel-Ranking Model: Evaluation, Metrics, Diagnostics, Rollout, and KPI Alignment
Context: You are building a learning-to-rank (LTR) model to order hotel search results for Expedia. The goal is to maximize client value (e.g., bookings, gross merchandise value [GMV], or margin) while ensuring rigorous offline evaluation, attribution fairness under position bias, and safe deployment.
Answer the following:
(a) Offline evaluation plan
- Prevent leakage through time-based splits and user/session-level grouping.
- Handle position bias using propensity-weighted (IPS/SNIPS/DR) metrics or counterfactual LTR.
- Report probability calibration of conversion predictions (reliability curves, ECE).
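To make the three items in (a) concrete, here is a minimal sketch of a session-grouped time split, a SNIPS estimate, and ECE. It is one possible shape for an answer, not a prescribed solution. The row fields `session_id` and `ts`, the function names, the propensity floor of 0.01, and the 10 equal-width bins are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple


def time_grouped_split(rows: List[Dict], cutoff: float) -> Tuple[List[Dict], List[Dict]]:
    """Split rows by time while keeping every session on one side.

    Assumes each row has hypothetical keys 'session_id' and 'ts'.
    A session is assigned by its earliest timestamp, so a session that
    straddles the cutoff goes wholly to train.
    """
    first_ts: Dict[str, float] = {}
    for r in rows:
        sid = r["session_id"]
        first_ts[sid] = min(first_ts.get(sid, r["ts"]), r["ts"])
    train = [r for r in rows if first_ts[r["session_id"]] < cutoff]
    test = [r for r in rows if first_ts[r["session_id"]] >= cutoff]
    return train, test


def snips_estimate(rewards: Sequence[float],
                   target_probs: Sequence[float],
                   logging_probs: Sequence[float],
                   min_propensity: float = 0.01) -> float:
    """Self-normalized IPS: sum(w * r) / sum(w), with w = target / logging.

    Logging propensities are floored at min_propensity (an assumed value)
    to limit the variance caused by very small propensities.
    """
    weights = [t / max(l, min_propensity) for t, l in zip(target_probs, logging_probs)]
    total = sum(weights)
    if total == 0:
        raise ValueError("all importance weights are zero")
    return sum(w * r for w, r in zip(weights, rewards)) / total


def expected_calibration_error(probs: Sequence[float],
                               labels: Sequence[int],
                               n_bins: int = 10) -> float:
    """ECE over equal-width bins: sum_b (n_b / n) * |mean(p) - mean(y)|."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_p = sum(p for p, _ in b) / len(b)
        mean_y = sum(y for _, y in b) / len(b)
        ece += len(b) / n * abs(mean_p - mean_y)
    return ece
```

The split above groups by session only; grouping by user as well would need the same treatment applied to a user key. The per-bin `mean_p` and `mean_y` pairs computed inside `expected_calibration_error` are also the points of a reliability curve.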
(b) Ranking metrics
- Specify which ranking metrics you will report (e.g., NDCG@10 with revenue/margin weights, ERR), including formulas and why they reflect client value.
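As a hedged illustration of the two metrics named in (b), the sketch below computes NDCG@k with a linear gain (so the gain can be revenue or margin per result) and ERR using the standard graded-relevance mapping R(g) = (2^g - 1) / 2^g_max. Using linear rather than exponential gain for NDCG, the default `max_grade=4`, and the function names are assumptions for the example.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """DCG@k = sum over ranks r = 1..k of gain_r / log2(r + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains_in_ranked_order: Sequence[float], k: int = 10) -> float:
    """NDCG@k with linear gains, e.g. revenue or margin of each shown hotel.

    The ideal DCG sorts the same gains in descending order.
    """
    ideal = dcg_at_k(sorted(gains_in_ranked_order, reverse=True), k)
    if ideal <= 0:
        return 0.0  # no positive gain anywhere in this query
    return dcg_at_k(gains_in_ranked_order, k) / ideal


def err_at_k(grades: Sequence[int], k: int = 10, max_grade: int = 4) -> float:
    """ERR@k = sum_r (1/r) * R_r * prod_{i<r} (1 - R_i).

    R_r = (2**grade_r - 1) / 2**max_grade is the assumed probability that
    the user is satisfied at rank r and stops scanning.
    """
    err = 0.0
    p_reach = 1.0  # probability the user reaches the current rank
    for rank, g in enumerate(grades[:k], start=1):
        r = (2 ** g - 1) / 2 ** max_grade
        err += p_reach * r / rank
        p_reach *= 1.0 - r
    return err
```

For example, `ndcg_at_k([0.0, 100.0])` scores a list that places the only revenue-bearing hotel second, and `err_at_k([0, 4])` scores the same situation with graded relevance. Both functions score a single query; a report would typically average them over queries.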
(c) Diagnostics for key drivers
- Outline diagnostics to identify key drivers without leaking target information (e.g., SHAP with proper background, permutation checks, stability across folds).
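One of the diagnostics named in (c), the permutation check, can be sketched as follows. This is a minimal illustration that assumes a held-out fold, rows stored as plain lists, a hypothetical `predict` callable, and a metric where higher is better; it is not tied to any particular library.

```python
import random
from typing import Callable, List, Sequence


def permutation_importance(predict: Callable[[List[float]], float],
                           X: List[List[float]],
                           y: Sequence[float],
                           metric: Callable[[Sequence[float], Sequence[float]], float],
                           n_repeats: int = 5,
                           seed: int = 0) -> List[float]:
    """Mean drop in `metric` when one feature column is shuffled.

    Run this on held-out data only, so the scores reflect generalization
    rather than memorized training targets. `metric(y_true, y_pred)` is
    assumed to be higher-is-better.
    """
    rng = random.Random(seed)
    baseline = metric(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            score = metric(y, [predict(row) for row in X_perm])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return importances
```

Repeating the call on each fold and comparing the resulting rankings is one way to check stability across folds. For a ranking model it may be more appropriate to shuffle within each query rather than across the whole fold; that variant is not shown here.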
(d) Rollout and monitoring
- Define shadow mode, canary, guardrail alarms, drift detection (data and concept), late-arriving data handling, and an automatic rollback policy with thresholds.
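The guardrail and drift parts of (d) can be sketched as simple threshold checks. In the example below, the `Guardrail` class, the metric names, and every threshold value are hypothetical placeholders; real thresholds would come from the client's tolerance and from historical metric variance. All guardrail metrics are assumed to be higher-is-better.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Guardrail:
    name: str                 # metric key, e.g. "conversion" (hypothetical)
    max_relative_drop: float  # e.g. 0.02 means alarm if canary is >2% below control


def breached_guardrails(control: Dict[str, float],
                        canary: Dict[str, float],
                        guardrails: Sequence[Guardrail]) -> List[str]:
    """Return names of guardrails whose canary value fell too far below control.

    A non-empty result would trigger the automatic rollback.
    """
    breached = []
    for g in guardrails:
        base = control[g.name]
        if base <= 0:
            continue  # relative change is undefined; handle separately
        rel_change = (canary[g.name] - base) / base
        if rel_change < -g.max_relative_drop:
            breached.append(g.name)
    return breached


def population_stability_index(expected_fracs: Sequence[float],
                               actual_fracs: Sequence[float],
                               eps: float = 1e-6) -> float:
    """PSI = sum_b (a_b - e_b) * ln(a_b / e_b) over histogram bins.

    Inputs are per-bin fractions of a feature in the reference window and
    in the live window; eps avoids log(0) for empty bins.
    """
    total = 0.0
    for e, a in zip(expected_fracs, actual_fracs):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total
```

A usage example with made-up numbers: `breached_guardrails({"conversion": 0.10}, {"conversion": 0.09}, [Guardrail("conversion", 0.02)])` flags `"conversion"` because the canary is 10% below control. PSI covers data drift only; concept drift needs labels, which is where late-arriving bookings and cancellations matter.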
(e) Surrogate objective and client KPI
- Describe how you would verify that optimizing a surrogate objective still improves the client KPI, and what you would do if the offline–online relationship breaks.
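One hedged way to approach the verification in (e) is to collect, for each past experiment, the offline metric delta and the online KPI delta, then check whether they move together. The sketch below computes a Spearman rank correlation and a sign-agreement rate in plain Python; the function names and the idea of pairing deltas per experiment are assumptions for illustration.

```python
import math
from typing import List, Sequence


def _average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with ties given the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one side has no variation
    return cov / math.sqrt(vx * vy)


def sign_agreement(offline_deltas: Sequence[float],
                   online_deltas: Sequence[float]) -> float:
    """Fraction of experiments where offline and online deltas share a sign."""
    pairs = list(zip(offline_deltas, online_deltas))
    agree = sum(1 for a, b in pairs if (a > 0) == (b > 0))
    return agree / len(pairs)
```

A low correlation or sign agreement over recent experiments would be one signal that the offline-online relationship has broken and that the surrogate, the propensity model, or the offline evaluation data needs revisiting before further launches.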