PracHub

Explain and tune decision trees robustly

Last updated: Mar 29, 2026

Quick Overview

The question evaluates a candidate's understanding of CART decision tree mechanics, split criteria and surrogate splits for missing values, hyperparameter tuning and pruning, overfitting diagnostics, preprocessing for large feature sets and high‑cardinality categoricals, and criteria for choosing ensemble methods within the Machine Learning domain.

Company: Point72

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Take-home Project

You built a decision tree in an internship. Answer the following, with crisp formulas and procedures:

1) Explain how a CART decision tree selects splits for classification vs. regression (impurity/variance criteria), including exact formulas for Gini, entropy, and MSE, and how surrogate splits work when features have missing values.

2) Give a defensible procedure to choose max_depth and min_samples_split: define a cross‑validation plan, early‑stopping/pruning (cost‑complexity α path), and the metric you would optimize under severe class imbalance (justify PR‑AUC vs. ROC‑AUC vs. F1). Include how you would pick α from the CCP path without leakage.

3) Overfitting checks: specify at least three diagnostics (e.g., cross‑validated gap vs. training, learning curves, permutation importance stability, calibration curves). What patterns flag overfitting for trees specifically?

4) With ~500k rows, ~300 features including high‑cardinality categoricals and sparse indicators, propose a preprocessing + modeling plan using a single decision tree: encoding choice, handling rare categories, monotonic constraints (if any), feature binning, and computational cost. Provide concrete hyperparameter ranges and expected training time order‑of‑magnitude.

5) If you could revisit the project, when would a random forest or a gradient‑boosted tree (e.g., XGBoost/LightGBM) outperform a single tree on this data? Name at least three data/target conditions and the trade‑offs (variance, interpretability, latency, OOB vs. CV, calibration). How would you compare models fairly (data splits, nested CV, fixed preprocessing, and identical evaluation protocol)?

Related Interview Questions

  • Design Features for Residual Volatility - Point72 (medium)
  • Explain Transformer Encoder and Decoder Behavior - Point72 (medium)
  • Compute Gaussian Probability and Regression Coefficients - Point72 (medium)
  • Design a News-Filtering Prompt - Point72 (medium)
  • How would you explain PCA and SHAP? - Point72 (hard)
Oct 13, 2025, 9:49 PM

Decision Trees: Splitting, Tuning, Overfitting, and When to Use Ensembles

Context: You built a CART-style decision tree for a take‑home ML project. Answer concisely with formulas, procedures, and practical guidance.

1) CART Splits: Classification vs. Regression, and Surrogate Splits for Missing Values

Explain how a CART tree selects splits under:

  • Classification: impurity criteria (Gini, entropy)
  • Regression: variance/MSE

Include exact formulas for Gini, entropy, MSE, the split selection rule, and how surrogate splits work when features have missing values.
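
As a reference point for the formulas requested here, the following is a minimal pure-Python sketch, not a production implementation. It assumes the standard definitions: Gini = 1 − Σ p_k², entropy = −Σ p_k log2 p_k, node MSE = mean squared deviation from the node mean, and a split score equal to the parent impurity minus the size-weighted child impurities. The function names (`gini`, `entropy`, `mse`, `impurity_decrease`, `best_threshold`) are illustrative, and surrogate splits are not covered.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum_k p_k * log2(p_k)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mse(values):
    """Node MSE for regression: mean squared deviation from the node mean."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    return (impurity(parent)
            - len(left) / n * impurity(left)
            - len(right) / n * impurity(right))

def best_threshold(x, y, impurity):
    """Scan midpoints between distinct sorted values of one feature.

    Returns (threshold, gain) for the split x <= threshold with the largest
    impurity decrease, or (None, 0.0) if no split improves on the parent.
    """
    pairs = sorted(zip(x, y))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yy for xx, yy in pairs if xx <= t]
        right = [yy for xx, yy in pairs if xx > t]
        gain = impurity_decrease(list(y), left, right, impurity)
        if gain > best[1]:
            best = (t, gain)
    return best
```

This scan is quadratic in the number of rows for clarity; practical implementations sort once and update class counts incrementally as the threshold moves.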

2) Choosing max_depth and min_samples_split

Provide a defensible procedure to select these hyperparameters:

  • Cross‑validation plan (fold type and repetitions)
  • Early‑stopping/pruning using cost‑complexity pruning (α path)
  • Metric to optimize under severe class imbalance, and why (PR‑AUC vs. ROC‑AUC vs. F1)
  • How to pick α from the CCP path without leakage
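
One possible way to structure the tuning procedure is sketched below with scikit-learn. It is a hedged illustration on synthetic data, not the expected answer: the dataset, the grid values, and the choice to thin the α path to eight candidates are all assumptions. The test split is set aside first, candidate α values come from the pruning path of the development data only, and the final α, max_depth, and min_samples_split are chosen by stratified CV on PR‑AUC (scikit-learn's `average_precision` scorer).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import average_precision_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (assumption: ~5% positives).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           weights=[0.95, 0.05], random_state=0)

# Hold out the test set before anything is tuned.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

base = DecisionTreeClassifier(random_state=0, class_weight="balanced")

# Candidate alphas from the CCP path of the development data only.
path = base.cost_complexity_pruning_path(X_dev, y_dev)
alphas = np.unique(np.clip(path.ccp_alphas, 0.0, None))  # guard tiny negatives
idx = np.unique(np.linspace(0, len(alphas) - 1, num=min(8, len(alphas))).astype(int))
alphas = alphas[idx]  # thin the path to keep the grid small

param_grid = {
    "ccp_alpha": list(alphas),
    "max_depth": [4, 8, None],
    "min_samples_split": [2, 50, 200],
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(base, param_grid, scoring="average_precision", cv=cv)
search.fit(X_dev, y_dev)  # selection uses development folds only

# The test set is scored once, after all choices are frozen.
test_ap = average_precision_score(y_test, search.predict_proba(X_test)[:, 1])
print(search.best_params_, round(test_ap, 3))
```

The α grid here is derived from the full development set and only used as a list of candidates; a stricter variant would recompute the path inside each training fold, or wrap the whole search in an outer CV loop to estimate generalization.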

3) Overfitting Checks

List at least three diagnostics and what patterns flag overfitting in trees. Examples: train–CV gap, learning curves, permutation importance stability, calibration curves.
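
The train–CV gap diagnostic can be sketched as follows. This is an illustrative example on synthetic data (the dataset, depth grid, and label-noise level are assumptions), using scikit-learn's `validation_curve` to compare training and cross-validated PR‑AUC as max_depth grows.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, validation_curve
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with label noise so that a deep tree can memorize it.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           weights=[0.9, 0.1], flip_y=0.05, random_state=0)

depths = [2, 4, 8, 16, 32]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

train_scores, cv_scores = validation_curve(
    DecisionTreeClassifier(random_state=0), X, y,
    param_name="max_depth", param_range=depths,
    cv=cv, scoring="average_precision",
)

train_mean = train_scores.mean(axis=1)  # one value per depth
cv_mean = cv_scores.mean(axis=1)
gap = train_mean - cv_mean

for d, tr, va, g in zip(depths, train_mean, cv_mean, gap):
    print(f"max_depth={d:>2}  train={tr:.3f}  cv={va:.3f}  gap={g:.3f}")
```

A training score that approaches 1.0 while the CV score stalls or falls is the pattern to look for; the depth where the gap starts widening is a reasonable upper bound for the search grid.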

4) Preprocessing + Modeling Plan (Single Tree) for ~500k rows, ~300 features, including high‑cardinality categoricals and sparse indicators

Specify:

  • Encoding choice and handling rare categories
  • Handling missing values
  • Monotonic constraints (if any)
  • Feature binning
  • Computational cost and concrete hyperparameter ranges; expected training time order‑of‑magnitude
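
For the rare-category step, a minimal pure-Python sketch is shown below. The helper names (`fit_rare_map`, `apply_rare_map`), the `__rare__` token, and the count threshold are hypothetical choices for illustration. The point it demonstrates is that the set of kept categories is learned from training data only, so rare and unseen levels in validation or test data map to the same bucket.

```python
from collections import Counter

def fit_rare_map(train_values, min_count):
    """Learn, from training data only, which categories are frequent enough to keep."""
    counts = Counter(train_values)
    return {v for v, c in counts.items() if c >= min_count}

def apply_rare_map(values, keep, rare_token="__rare__"):
    """Replace categories outside `keep` (rare or unseen) with one shared token."""
    return [v if v in keep else rare_token for v in values]

# Usage: "c" is rare in training, "z" never appears there.
train = ["a"] * 5 + ["b"] * 3 + ["c"]
keep = fit_rare_map(train, min_count=3)
print(apply_rare_map(["a", "c", "z"], keep))
```

After this step the reduced category set can be passed to whichever encoder the plan calls for; fitting that encoder inside each CV fold keeps the evaluation free of leakage.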

5) When Would Random Forests or Gradient‑Boosted Trees Outperform a Single Tree?

Name at least three data/target conditions and discuss trade‑offs (variance, interpretability, latency, OOB vs. CV, calibration). Describe how to compare models fairly (data splits, nested CV, fixed preprocessing, identical evaluation protocol).
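
A fair comparison can be sketched as below, assuming scikit-learn and synthetic data. The models, their settings, and the dataset are illustrative assumptions; scikit-learn's `GradientBoostingClassifier` stands in for XGBoost/LightGBM. Every model sees the same folds and the same metric. Hyperparameter tuning is left out for brevity; in a nested-CV setup each model would be tuned in an inner loop inside these outer folds.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

# A fixed random_state makes the splitter yield identical folds for every model.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "tree": DecisionTreeClassifier(max_depth=6, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Same data, same folds, same metric for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="average_precision")
    for name, model in models.items()
}

for name, s in scores.items():
    print(f"{name:<18} PR-AUC = {s.mean():.3f} +/- {s.std():.3f}")
```

Because the folds are shared, per-fold score differences between models can be compared pairwise rather than relying only on the means.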
