Predict and act on contract renewal risk
Company: Google
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Technical Screen
You’re asked to predict contract renewal for enterprise customers of a video-conferencing product after a spike in call disconnects. Design a modeling approach that covers the following.
Label and horizon: define the label (renewal within the next contract period) and the prediction horizon.
Features: specify features such as disconnection rate per 1k minutes, percent of meetings affected, time-to-resolution, SLA breaches, support ticket volume, NPS/CSAT, active seats, usage intensity, industry, account tenure, and price/discounts.
Model choice: discuss when logistic regression is preferable to gradient-boosted trees or deep models (interpretability, small-N, linear signal, sparse features, latency constraints) and when complex models are justified.
Data hygiene: prevent leakage (e.g., features created post-renewal), address class imbalance, consider survival analysis vs binary classification for censored data, and optionally add monotonic constraints.
Evaluation: outline evaluation beyond AUROC (PR-AUC for rare churn, calibration curves/Brier score, cohort- and time-split backtests, stability under distribution shift), and compare rank-based vs calibrated thresholding.
Actions: translate scores into actions with cost–benefit thresholds, expected ROI, and capacity constraints.
Minimal feature set: provide a minimal viable feature set and justify it.
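One way to make the cost–benefit and capacity part concrete is the sketch below. It assumes a calibrated non-renewal probability per account, a single retention intervention with a fixed cost, and a fixed chance that the intervention saves an account that would otherwise churn. The `Account` class, the function names, and every number are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    p_churn: float         # calibrated probability of non-renewal (assumed available)
    contract_value: float  # contract value at risk, hypothetical currency units


def expected_net_benefit(acct, save_rate, cost):
    """P(churn) * P(saved | intervention) * value at risk, minus intervention cost."""
    return acct.p_churn * save_rate * acct.contract_value - cost


def breakeven_probability(contract_value, save_rate, cost):
    """Churn probability above which intervening has positive expected value."""
    return cost / (save_rate * contract_value)


def select_accounts(accounts, save_rate, cost, capacity):
    """Keep accounts with positive expected net benefit, best first, up to capacity."""
    scored = [(expected_net_benefit(a, save_rate, cost), a) for a in accounts]
    worthwhile = [pair for pair in scored if pair[0] > 0]
    worthwhile.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in worthwhile[:capacity]]


# Hypothetical accounts; none of these figures come from real data.
accounts = [
    Account("A", p_churn=0.40, contract_value=100_000.0),
    Account("B", p_churn=0.05, contract_value=50_000.0),
    Account("C", p_churn=0.30, contract_value=200_000.0),
    Account("D", p_churn=0.10, contract_value=20_000.0),
]

# Assumed: 25% of at-risk accounts are saved by outreach, outreach costs 2,000,
# and the customer-success team can handle two accounts this cycle.
chosen = select_accounts(accounts, save_rate=0.25, cost=2000.0, capacity=2)
print([a.name for a in chosen])  # ['C', 'A']
```

The break-even threshold depends on contract value, so under these assumptions a large account can justify outreach at a lower churn probability than a small one. This is the main difference from rank-based thresholding, which would only need the ordering of scores and not their calibration.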
Quick Answer: This question evaluates applied machine learning for enterprise churn prediction. It tests problem framing, feature engineering, model selection, data hygiene (including leakage prevention), evaluation metrics, and turning model scores into decisions.
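The two evaluation metrics the question names beyond AUROC can be written out directly. The following is a minimal sketch in plain Python, treating non-renewal as the positive class (an assumption, since it is the rare outcome) and using a tiny made-up holdout set. Average precision is used here as a common step-wise summary of the precision-recall curve, and ties in scores are broken arbitrarily by the sort.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def average_precision(y_true, scores):
    """Mean of precision@k taken at the rank k of each positive example."""
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical holdout set: 1 = did not renew, 0 = renewed.
y_true = [1, 0, 1, 0, 0]
p_churn = [0.9, 0.8, 0.7, 0.3, 0.1]

prevalence = sum(y_true) / len(y_true)  # expected average precision of a random ranker
print(round(brier_score(y_true, p_churn), 3))        # 0.168
print(round(average_precision(y_true, p_churn), 3))  # 0.833
print(prevalence)                                    # 0.4
```

Average precision depends only on the ordering of scores, while the Brier score also penalizes probabilities that are badly calibrated, so the two answer different questions. In a time-split backtest, both would be computed on contracts whose renewal date falls after the training cutoff.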