What difficulty level is this interview question?

This is a hard Machine Learning question, commonly asked during Take-home Project rounds at Stripe.

What role is this question designed for?

This question is commonly asked of Data Scientist candidates at Stripe during technical interviews.

Design a model for subscription adoption prediction

This question evaluates a candidate's ability to design and productionize a supervised machine-learning pipeline for predicting subscription adoption. It tests labeling strategy, leakage identification and prevention, feature engineering from transactional and merchant metadata, model selection and calibration, evaluation with time-based validation, and post-deployment monitoring. It is commonly asked of Data Scientist candidates because it assesses practical, production-ready design alongside conceptual understanding of time-based splits, class imbalance, performance metrics, and drift detection.

Predicting 60-Day Adoption of Subscription by Non-Subscription Merchants

Context

You need to predict which merchants who are not currently using the Subscription product will adopt it within the next 60 days. For the live run, only data available up to 2025-07-03 may be used to predict adoption by 2025-09-01.

Assume you have: transaction/event logs (charges, refunds, disputes, payouts), merchant metadata (signup date, vertical, country), and identifiers like customer_id and card_fingerprint. Assume an event that uniquely indicates Subscription adoption (e.g., first Subscription API event or first Subscription invoice) is available with a timestamp.
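The dates in the context imply a feature cutoff and a 60-day outcome window, and 2025-07-03 plus 60 days is exactly 2025-09-01. The following is a minimal sketch of how that window could turn into a label, not a prescribed answer. The function names, the inclusive treatment of the end date, and the rule that an adoption event on or before the cutoff makes a merchant ineligible are all assumptions made for illustration. Cold-start handling is left out.

```python
from datetime import date, timedelta

OUTCOME_DAYS = 60  # outcome window length from the problem statement


def outcome_window_is_observed(cutoff, data_end):
    """True if the full outcome window after `cutoff` lies within the data.

    Training cutoffs that fail this check would produce negatives that are
    merely unobserved, so they should not be used for labeling.
    """
    return cutoff + timedelta(days=OUTCOME_DAYS) <= data_end


def label_merchant(cutoff, first_subscription_event):
    """Return 1 or 0 for adoption within the window, or None if ineligible.

    cutoff: last day whose data may feed features (assumed inclusive).
    first_subscription_event: date of the merchant's first Subscription
        event, or None if no such event has been observed.
    """
    outcome_end = cutoff + timedelta(days=OUTCOME_DAYS)
    if first_subscription_event is None:
        return 0  # negative, provided the outcome window is fully observed
    if first_subscription_event <= cutoff:
        return None  # already using Subscription: exclude from the cohort
    # Assumption: the last day of the window counts as adoption.
    return 1 if first_subscription_event <= outcome_end else 0


live_cutoff = date(2025, 7, 3)
live_outcome_end = live_cutoff + timedelta(days=OUTCOME_DAYS)  # date(2025, 9, 1)
```

For historical training snapshots the same function can be applied at earlier cutoffs, as long as `outcome_window_is_observed` holds for each one.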

Task

Design a production-ready classification approach and answer concisely:

(a) Labeling: Precisely define positives/negatives and the observation and outcome windows; handle merchants already using Subscription and cold-start merchants.

(b) Leakage: List at least five concrete leakage risks specific to this data and how to prevent them via time-based feature windows and proper splits.

(c) Features: Propose 15–25 high-signal, computable features from transactions (recency/frequency/monetary, 28–35 day repeat patterns, customer concentration, card_fingerprint diversity, weekend share, chargeback/refund rates, growth rates) and from merchant metadata (age, vertical, geo).

(d) Modeling: Choose two models (e.g., regularized logistic vs. gradient-boosted trees); discuss class imbalance handling (weights vs. downsampling), calibration, and interpretability for a sales handoff.

(e) Evaluation: Specify time-based cross-validation, primary metrics (PR-AUC, precision@K, recall@K), and how you would select a threshold to deliver a list of 1,000 merchants with expected precision ≥ 0.60.

(f) Monitoring: Define post-deployment drift and performance checks (data drift on feature distributions, label drift, calibration drift) and how to retrain without contaminating future labels.
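For item (e), one possible way to reason about the 1,000-merchant list is to rank merchants by score, take the top K, and compare the mean score of that group with the precision target. This is a sketch under a strong assumption: the scores are calibrated probabilities, so their mean approximates expected precision. The function names and the toy data are hypothetical. On a labeled holdout fold, the realized precision at K gives a check on that estimate.

```python
def top_k_with_expected_precision(scores, k):
    """Pick the k highest-scoring merchants.

    Returns (indices, threshold, expected_precision). The threshold is the
    k-th highest score. The expected precision is the mean score of the
    selected merchants, which is meaningful only for calibrated scores.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = order[:k]
    threshold = scores[chosen[-1]]
    expected_precision = sum(scores[i] for i in chosen) / k
    return chosen, threshold, expected_precision


def precision_at_k(scores, labels, k):
    """Realized precision among the k highest-scoring rows of a labeled fold."""
    chosen, _, _ = top_k_with_expected_precision(scores, k)
    return sum(labels[i] for i in chosen) / k


# Toy holdout fold with made-up scores and outcomes (illustration only).
scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
labels = [1, 0, 1, 0, 0, 0]
chosen, threshold, expected = top_k_with_expected_precision(scores, k=3)
realized = precision_at_k(scores, labels, k=3)
```

In the task itself K would be 1,000 and the bar 0.60. If the expected precision of the top 1,000 falls below 0.60, the list cannot meet the target at that size, and the options are a shorter list or a better model. A large gap between expected and realized precision on time-based holdout folds would suggest the calibration needs attention.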
