Build and evaluate an order prediction model

Q: Build and evaluate an order prediction model

This question tests a data scientist's ability to build and evaluate a binary classifier under temporal constraints and operational requirements. It covers leakage-safe temporal validation, feature-engineering groups, class-imbalance handling, threshold selection against business precision/recall targets, metric reasoning under prevalence and label-window shifts, and drift monitoring after deployment. Interviewers use it to see whether a candidate can design a leakage-free evaluation pipeline and turn business requirements into measurable thresholds and monitoring plans, which calls for both a conceptual grasp of evaluation and data shift and hands-on experience with model validation and deployment.

Predict 7-Day Order Completion from First Session

You are building a binary classifier to predict whether a guest will complete an order within 7 days of their first session in an evaluation window. The index time is the guest's first session, and the label is whether an order occurs within 7 days of that session. Assume you have complete event logs and can construct features up to (but not after) the index time.

Tasks

Data and validation design
- Propose a leakage-safe temporal split using: train on 2025-06–2025-08 data, validate on 2025-08-24–2025-09-01. Ensure labels are fully matured for the 7-day horizon and that there is no time leakage.
- List feature groups you would build (recency/frequency, product interest, session quality, source/device) with examples.
- Describe how you would handle class imbalance (e.g., calibrated probabilities, focal loss or class weights) and which metrics you would optimize.
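One way to make the split leakage-safe is to leave a purge gap equal to the label horizon between the last training index time and the first validation index time, so that no training label window reaches into the validation period. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the training range is assumed to start on 2025-06-01, and the validation end date is treated as inclusive of 2025-09-01.

```python
from datetime import datetime, timedelta

HORIZON = timedelta(days=7)          # 7-day label window from the task
TRAIN_START = datetime(2025, 6, 1)   # assumed start of the 2025-06 training range
VAL_START = datetime(2025, 8, 24)
VAL_END = datetime(2025, 9, 2)       # exclusive bound, so 2025-09-01 is included


def assign_split(index_time: datetime) -> str:
    """Return 'train', 'val', or 'drop' for a guest's first-session time."""
    # A training row is kept only if its 7-day label window closes
    # before validation begins (index time earlier than 2025-08-17).
    if TRAIN_START <= index_time < VAL_START - HORIZON:
        return "train"
    if VAL_START <= index_time < VAL_END:
        return "val"
    # Purge gap (2025-08-17 through 2025-08-23) or outside both ranges.
    return "drop"


def label_matured(index_time: datetime, snapshot_time: datetime) -> bool:
    """A label is final only once the full horizon has elapsed at snapshot time."""
    return index_time + HORIZON <= snapshot_time
```

Under these assumptions, the validation labels are all mature once the data snapshot is taken on or after 2025-09-09, and the last week of August that the stated training range nominally includes is dropped rather than trained on.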
Thresholding for business goals
- The business requires precision ≥ 0.80 while maximizing recall. Describe how you would pick a probability threshold using PR curves and calibration.
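The threshold choice can be sketched as a scan over the precision/recall trade-off on held-out validation scores: take the lowest cutoff whose precision still meets the floor, since recall can only grow as the cutoff drops. The function below is a minimal pure-Python illustration, not a reference implementation; `pick_threshold` and its arguments are hypothetical names, and it assumes the scores are already calibrated probabilities from the validation period.

```python
def pick_threshold(y_true, scores, min_precision=0.80):
    """Return (cutoff, precision, recall) for the lowest cutoff meeting the
    precision floor, predicting positive when score >= cutoff.
    Returns None if no cutoff qualifies."""
    total_pos = sum(y_true)
    pairs = sorted(zip(scores, y_true), reverse=True)  # highest score first
    best = None
    tp = fp = 0
    i = 0
    while i < len(pairs):
        cutoff = pairs[i][0]
        # Consume every example tied at this score before evaluating the cutoff.
        while i < len(pairs) and pairs[i][0] == cutoff:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        if precision >= min_precision:
            # Recall never decreases as the cutoff drops, so the most recent
            # qualifying cutoff is the one with maximum recall.
            best = (cutoff, precision, recall)
    return best


# Toy validation data, invented purely for illustration.
y_val = [1, 1, 0, 1, 0, 0]
p_val = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(pick_threshold(y_val, p_val, min_precision=0.80))
```

Because the cutoff is tuned on a finite sample, it is usually worth checking the achieved precision on a later, untouched period before committing to it.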
Confusion-matrix math
- Assume 10,000 guests with 600 positives. At threshold t = 0.70, TP = 360, FP = 140, FN = 240, TN = 9,260. Compute precision, recall, specificity, and F1.
- If t increases to 0.85 and counts become TP = 300, FP = 60, FN = 300, TN = 9,340, recompute the same metrics and explain the monotonic changes.
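The arithmetic for both thresholds can be checked with a few lines of Python. This is a small worked sketch using the counts given in the task; the helper name `confusion_metrics` is just an illustrative choice.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, specificity, and F1 from raw confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    f1 = 2 * tp / (2 * tp + fp + fn)   # harmonic mean of precision and recall
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}


# Counts copied from the task statement.
at_070 = confusion_metrics(tp=360, fp=140, fn=240, tn=9260)
at_085 = confusion_metrics(tp=300, fp=60, fn=300, tn=9340)

for name, m in [("t=0.70", at_070), ("t=0.85", at_085)]:
    print(name, {k: round(v, 4) for k, v in m.items()})
# t=0.70 {'precision': 0.72, 'recall': 0.6, 'specificity': 0.9851, 'f1': 0.6545}
# t=0.85 {'precision': 0.8333, 'recall': 0.5, 'specificity': 0.9936, 'f1': 0.625}
```

Raising the threshold can only move guests from predicted positive to predicted negative, so TP and FP both shrink: recall cannot rise and specificity cannot fall. Precision rises here, although in general it is not guaranteed to be monotonic in the threshold.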
Prevalence shift
- Using TPR = 360/600 and FPR = 140/9,400 from t = 0.70, if prevalence rises from 6% to π = 10% but class-conditional score distributions are unchanged and you keep the same threshold, quantify precision via Precision = (TPR·π) / (TPR·π + FPR·(1 − π)) and state whether recall changes. Interpret the result.
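The prevalence formula in this task can be evaluated directly. The sketch below plugs in the TPR and FPR from t = 0.70; `precision_at_prevalence` is a hypothetical helper name, and the baseline call at π = 6% is included only as a sanity check that the formula recovers the original precision.

```python
def precision_at_prevalence(tpr: float, fpr: float, prevalence: float) -> float:
    """Precision implied by fixed TPR/FPR at a given positive-class prevalence."""
    true_pos_mass = tpr * prevalence
    false_pos_mass = fpr * (1 - prevalence)
    return true_pos_mass / (true_pos_mass + false_pos_mass)


TPR = 360 / 600      # recall at t = 0.70
FPR = 140 / 9400     # false positive rate at t = 0.70

baseline = precision_at_prevalence(TPR, FPR, 0.06)  # original 600 / 10,000
shifted = precision_at_prevalence(TPR, FPR, 0.10)

print(round(baseline, 4), round(shifted, 4))
# 0.72 0.8174
```

Recall stays at 0.60 because it depends only on the score distribution of the positive class, which is assumed unchanged. Precision rises from 0.72 to about 0.82 simply because a larger share of the guests above the threshold are now true positives.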
Label-window change
- Re-evaluate the same predictions against a 30-day label where 200 additional guests become positive between days 8–30; 50 of those were predicted positive and 150 predicted negative. Update counts and recompute precision and recall; explain why one can increase while the other decreases.
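The relabeling can be traced by moving counts between cells: newly positive guests who were predicted positive move from FP to TP, and those predicted negative move from TN to FN. The sketch below assumes the "same predictions" are the ones at t = 0.70, which the task does not state explicitly; `relabel_counts` is a hypothetical helper name.

```python
def relabel_counts(tp, fp, fn, tn, new_pos_pred_pos, new_pos_pred_neg):
    """Shift newly positive guests: predicted-positive FP -> TP,
    predicted-negative TN -> FN. Predictions themselves do not change."""
    return (tp + new_pos_pred_pos, fp - new_pos_pred_pos,
            fn + new_pos_pred_neg, tn - new_pos_pred_neg)


# Assumption: start from the t = 0.70 counts given earlier in the task.
tp, fp, fn, tn = relabel_counts(360, 140, 240, 9260,
                                new_pos_pred_pos=50, new_pos_pred_neg=150)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, fp, fn, tn, round(precision, 4), round(recall, 4))
# 410 90 390 9110 0.82 0.5125
```

Under that assumption precision rises from 0.72 to 0.82 because 50 former false positives turn out to be late converters, while recall falls from 0.60 to about 0.51 because the positive pool grows by 200 and the model catches only a quarter of the newcomers.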
Online deployment
- Outline a drift/guardrail plan (calibration monitoring, SRM-like traffic checks, canary with CUPED-adjusted lift) and how you’d adapt thresholds by segment without leaking information.
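For the calibration-monitoring part of such a plan, one common summary is a binned comparison of mean predicted probability against the observed order rate, computed on each batch of guests once their 7-day labels have matured. The sketch below is a minimal pure-Python illustration; the function names, the equal-width binning, and the choice of 10 bins are assumptions rather than anything prescribed by the task.

```python
def calibration_table(y_true, probs, n_bins=10):
    """Per-bin (bin_index, count, mean_predicted, observed_rate) over equal-width bins."""
    counts = [0] * n_bins
    prob_sums = [0.0] * n_bins
    pos_sums = [0] * n_bins
    for y, p in zip(y_true, probs):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        counts[k] += 1
        prob_sums[k] += p
        pos_sums[k] += y
    return [(k, counts[k], prob_sums[k] / counts[k], pos_sums[k] / counts[k])
            for k in range(n_bins) if counts[k] > 0]


def expected_calibration_error(y_true, probs, n_bins=10):
    """Count-weighted average gap between predicted and observed rates."""
    rows = calibration_table(y_true, probs, n_bins)
    total = sum(n for _, n, _, _ in rows)
    return sum(n / total * abs(mean_p - rate) for _, n, mean_p, rate in rows)


# Toy matured batch, invented purely for illustration.
y_batch = [0, 0, 1, 1]
p_batch = [0.15, 0.15, 0.85, 0.85]
print(round(expected_calibration_error(y_batch, p_batch), 4))
# 0.15
```

A guardrail could then alert when this gap, tracked per matured daily or weekly cohort, drifts beyond a tolerance chosen from historical variation. The same table computed per segment is one input for segment-level thresholds, provided it uses only matured labels from periods before the traffic being scored.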
