This question evaluates a data scientist's ability to build and evaluate binary classification models under temporal constraints and operational requirements in the Machine Learning domain. It covers leakage-safe temporal validation, feature engineering groups, class imbalance handling, threshold selection for business precision/recall targets, metric reasoning under prevalence and label-window shifts, and deployment drift monitoring. It is commonly asked to assess whether a candidate can design robust, leakage-free evaluation pipelines and translate business requirements into measurable model thresholds and monitoring plans, testing both conceptual understanding of evaluation and data shift and practical application in model validation and deployment.
You are building a binary classifier to predict whether a guest will complete an order within 7 days of their first session in an evaluation window. The index time is the guest's first session, and the label is whether an order occurs within 7 days of that session. Assume you have complete event logs and can construct features up to (but not after) the index time.
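The label and feature definitions above can be sketched in pandas. This is a minimal illustration, not a reference solution: the table layout and the column names `guest_id`, `session_ts`, and `order_ts` are assumptions, as are the helper names `build_labels` and `prior_session_count`. It also assumes an order placed exactly 7 days after the index time counts as a positive.

```python
import pandas as pd

LABEL_WINDOW = pd.Timedelta(days=7)  # label horizon from the problem statement


def build_labels(sessions: pd.DataFrame, orders: pd.DataFrame,
                 window_start: pd.Timestamp, window_end: pd.Timestamp) -> pd.DataFrame:
    """One row per guest: index_time (first session in the window) and a 0/1 label.

    Assumed (hypothetical) columns: sessions[guest_id, session_ts],
    orders[guest_id, order_ts].
    """
    in_window = sessions[(sessions["session_ts"] >= window_start)
                         & (sessions["session_ts"] < window_end)]
    index = (in_window.groupby("guest_id", as_index=False)["session_ts"].min()
             .rename(columns={"session_ts": "index_time"}))

    # Left join keeps guests with no orders; their order_ts is NaT,
    # and comparisons against NaT evaluate to False.
    joined = index.merge(orders, on="guest_id", how="left")
    joined["hit"] = ((joined["order_ts"] >= joined["index_time"])
                     & (joined["order_ts"] <= joined["index_time"] + LABEL_WINDOW))
    label = joined.groupby("guest_id", as_index=False)["hit"].max()

    out = index.merge(label, on="guest_id")
    out["label"] = out["hit"].astype(int)
    return out[["guest_id", "index_time", "label"]]


def prior_session_count(sessions: pd.DataFrame, labels: pd.DataFrame) -> pd.DataFrame:
    """Example leakage-safe feature: sessions strictly before the index time."""
    joined = labels.merge(sessions, on="guest_id", how="left")
    joined["prior"] = joined["session_ts"] < joined["index_time"]
    counts = joined.groupby("guest_id", as_index=False)["prior"].sum()
    return counts.rename(columns={"prior": "n_prior_sessions"})
```

In a sketch like this, labels are only complete for guests whose index time falls at least 7 days before the end of the available logs, so the evaluation window would typically need to end that far ahead of the data cutoff.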