The question evaluates a candidate's understanding of CART decision-tree mechanics within the Machine Learning domain: split criteria and surrogate splits for missing values, hyperparameter tuning and pruning, overfitting diagnostics, preprocessing for large feature sets and high-cardinality categoricals, and criteria for choosing ensemble methods.
Context: You built a CART-style decision tree for a take‑home ML project. Answer concisely with formulas, procedures, and practical guidance.
Explain how a CART tree selects splits under its standard impurity criteria (Gini impurity or entropy for classification; variance reduction/MSE for regression), and how surrogate splits handle missing values.
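A minimal sketch of the split-selection computation a candidate should be able to describe: CART evaluates each candidate threshold by the weighted decrease in impurity it produces (Gini impurity shown here; the function names and toy data are illustrative, not from the original question).

```python
import numpy as np

def gini(y):
    """Gini impurity: 1 - sum of squared class proportions."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def impurity_decrease(y, mask):
    """Weighted impurity decrease when splitting labels y by a boolean mask."""
    n = len(y)
    left, right = y[mask], y[~mask]
    return (gini(y)
            - (len(left) / n) * gini(left)
            - (len(right) / n) * gini(right))

y = np.array([0, 0, 1, 1])
x = np.array([1.0, 2.0, 3.0, 4.0])
# A perfectly separating threshold removes all impurity:
print(impurity_decrease(y, x < 2.5))  # -> 0.5
```

CART greedily picks the feature/threshold pair maximizing this decrease; for regression, `gini` is replaced by within-node variance (MSE).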
Provide a defensible procedure to select key hyperparameters, e.g., max_depth, min_samples_leaf, and the cost-complexity pruning parameter ccp_alpha.
List at least three diagnostics and the patterns that flag overfitting in trees. Examples: train–CV gap, learning curves, permutation-importance stability, calibration curves.
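The first diagnostic (train–CV gap) can be sketched as below, assuming scikit-learn and an illustrative dataset: an unconstrained tree memorizes the training set (train score 1.0) while its CV score lags, and the gap shrinks as depth is constrained.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

gaps = {}
for depth in (2, None):  # shallow vs. unconstrained
    res = cross_validate(
        DecisionTreeClassifier(max_depth=depth, random_state=0),
        X, y, cv=5, return_train_score=True,
    )
    # Train-CV gap: large values flag overfitting.
    gaps[depth] = res["train_score"].mean() - res["test_score"].mean()
    print(depth, round(gaps[depth], 3))
```

The same loop extended over training-set sizes yields a learning curve; refitting on bootstrap resamples and comparing permutation importances probes importance stability.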
Specify a preprocessing strategy for large feature sets and high-cardinality categorical variables.
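For high-cardinality categoricals, one standard answer is out-of-fold mean-target encoding, which avoids exploding the feature set (as one-hot would) while preventing target leakage. A sketch with synthetic data; the function name and data are illustrative:

```python
import numpy as np
from sklearn.model_selection import KFold

def oof_target_encode(cats, y, n_splits=5, seed=0):
    """Out-of-fold mean-target encoding: each row's code is computed
    only from folds that exclude it, preventing target leakage."""
    enc = np.empty(len(y), dtype=float)
    global_mean = y.mean()  # fallback for categories unseen in a fold
    kf = KFold(n_splits, shuffle=True, random_state=seed)
    for tr, va in kf.split(cats):
        means = {c: y[tr][cats[tr] == c].mean() for c in np.unique(cats[tr])}
        enc[va] = [means.get(c, global_mean) for c in cats[va]]
    return enc

rng = np.random.default_rng(0)
cats = rng.integers(0, 50, size=1000)   # high-cardinality categorical
y = (cats % 2 == 0).astype(float)       # target driven by the category
enc = oof_target_encode(cats, y)
```

For large feature sets, the companion answer is cheap filtering (variance/mutual-information screens) followed by model-based selection, keeping every step inside the CV loop.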
Name at least three data/target conditions that favor an ensemble over a single tree, and discuss trade-offs (variance, interpretability, latency, OOB vs. CV error estimates, calibration). Describe how to compare models fairly: consistent data splits, nested CV, fixed preprocessing, and an identical evaluation protocol.
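The fair-comparison protocol can be sketched with nested CV, assuming scikit-learn and an illustrative dataset: the inner loop tunes each model, the outer loop estimates generalization, and both models see identical outer splits so neither is scored on data used for its own tuning.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Inner loop: each GridSearchCV tunes its own model.
models = {
    "tree": GridSearchCV(
        DecisionTreeClassifier(random_state=0),
        {"ccp_alpha": [0.0, 0.005, 0.02]}, cv=3),
    "forest": GridSearchCV(
        RandomForestClassifier(random_state=0),
        {"max_features": ["sqrt", None]}, cv=3),
}

# Outer loop: identical splits and metric for both candidates.
results = {}
for name, model in models.items():
    results[name] = cross_val_score(model, X, y, cv=5).mean()
    print(name, round(results[name], 3))
```

Fixing the preprocessing pipeline (fit inside each fold) and the scoring metric completes the protocol; OOB estimates are a cheaper but forest-only alternative to the outer loop.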