Random Forest — Rigor and Practical Choices
Context: You are building a binary classifier with a Random Forest. The dataset has 100,000 rows, 100 features, and a 5% positive rate. Answer the following:
Sources of Randomness
- Enumerate the sources of randomness in Random Forests (e.g., bootstrap sampling, feature subsampling at each split, random tie-breaking, randomized split points). For each, explain its typical effect on bias and variance.
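A minimal sketch of these randomness sources in action, assuming scikit-learn and a small synthetic dataset standing in for the 100,000-row one (the `weights=[0.95]` setting approximates the 5% positive rate). Two forests that differ only in their seed draw different bootstraps and per-split feature subsets, so they learn measurably different models:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in for the real data: ~5% positives via weights=[0.95].
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.95], random_state=0)

# Identical hyperparameters; only the seed differs. The seed drives the
# bootstrap draws, the feature subset sampled at each split, and tie-breaking.
rf_a = RandomForestClassifier(n_estimators=50, random_state=1).fit(X, y)
rf_b = RandomForestClassifier(n_estimators=50, random_state=2).fit(X, y)

# The fitted ensembles differ, e.g. in their feature importances.
diff = np.abs(rf_a.feature_importances_ - rf_b.feature_importances_).max()
print(f"max importance difference between seeds: {diff:.4f}")
```

The gap shrinks as `n_estimators` grows, which is the variance-reduction effect the question asks about.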
Hyperparameters for the Given Dataset
- Propose reasonable values for n_estimators, max_depth, and max_features for the dataset above.
- Explain how max_features controls the correlation among trees and where it sits on the bias–variance trade-off.
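One reasonable baseline configuration for this dataset, sketched with scikit-learn; the specific values (and the extra `min_samples_leaf` and `n_jobs` settings) are illustrative assumptions, not tuned answers:

```python
from sklearn.ensemble import RandomForestClassifier

# Hedged starting point for ~100k rows, 100 features, 5% positives.
rf = RandomForestClassifier(
    n_estimators=500,      # enough trees for OOB and importance estimates to stabilize
    max_depth=None,        # grow trees fully; averaging absorbs single-tree variance
    max_features="sqrt",   # ~10 of 100 features per split -> decorrelated trees
    min_samples_leaf=5,    # mild regularization given the rare positive class
    n_jobs=-1,
    random_state=0,
)
```

Lowering `max_features` decorrelates the trees (lower ensemble variance at the cost of slightly higher bias per tree); raising it does the opposite, which is the trade-off the second bullet asks you to justify.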
OOB Error vs 5-Fold Cross-Validation
- Compare out-of-bag (OOB) error with 5-fold cross-validation (CV).
- When can the two estimates disagree, and why?
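A sketch of computing both estimates side by side, again on small synthetic data (assumed stand-in, not the real dataset). OOB scores each tree on the roughly 37% of rows left out of its bootstrap, at no extra training cost; 5-fold CV refits the entire forest five times:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.95], random_state=0)

# OOB: free validation signal from the bootstrap leftovers of each tree.
rf = RandomForestClassifier(n_estimators=200, oob_score=True,
                            random_state=0, n_jobs=-1).fit(X, y)
print(f"OOB accuracy:   {rf.oob_score_:.3f}")

# 5-fold CV: five full refits, each on 80% of the rows.
cv = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=0, n_jobs=-1),
    X, y, cv=5)
print(f"5-fold CV mean: {cv.mean():.3f}")
```

The two typically agree; they can diverge when `n_estimators` is small (each OOB prediction then aggregates few trees, so the OOB vote is noisy) or when the CV folds are not stratified the way the 5% positive rate requires.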
Feature Importance Bias
- Explain why impurity-based importances are biased toward continuous or high-cardinality features.
- Propose a corrected approach (e.g., permutation importance with stratified shuffles and repeated runs) and justify your design.
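A sketch of one such corrected approach, assuming scikit-learn's `permutation_importance`: importances are measured on a stratified held-out split (approximating the stratification the question mentions) and averaged over repeated shuffles via `n_repeats`, so a feature only scores well if shuffling it actually degrades held-out AUC:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           weights=[0.95], random_state=0)
# Stratified split so the 5% positive rate is preserved in the test fold.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0, n_jobs=-1)
rf.fit(X_tr, y_tr)

# Shuffle one column at a time on held-out data and record the AUC drop,
# averaged over n_repeats independent shuffles.
result = permutation_importance(rf, X_te, y_te, scoring="roc_auc",
                                n_repeats=10, random_state=0, n_jobs=-1)
top = np.argsort(result.importances_mean)[::-1][:5]
print("top features:", top, result.importances_mean[top].round(3))
```

Unlike impurity-based scores, this cannot reward a feature merely for offering many candidate split points, because the metric is evaluated on data the trees never split on.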
Class Imbalance Strategies
- Outline strategies for class imbalance (e.g., class_weight, threshold moving, balanced subsampling).
- Discuss the consequences of each strategy for probability calibration and decision thresholds.
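A sketch combining two of these strategies with scikit-learn (the synthetic data and the F1 criterion for threshold selection are illustrative assumptions): `class_weight="balanced_subsample"` reweights classes within each bootstrap, and threshold moving then picks a cutoff on the predicted probabilities instead of the default 0.5:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Strategy 1: reweight the rare class inside each tree's impurity computation,
# recomputed per bootstrap sample ("balanced_subsample").
rf = RandomForestClassifier(n_estimators=200,
                            class_weight="balanced_subsample",
                            random_state=0, n_jobs=-1).fit(X_tr, y_tr)

# Strategy 2: threshold moving on held-out probabilities. Note that class
# reweighting distorts predict_proba, so these scores are no longer
# calibrated risks; recalibrate before using them as probabilities.
proba = rf.predict_proba(X_te)[:, 1]
thresholds = np.linspace(0.05, 0.95, 19)
scores = [f1_score(y_te, proba >= t, zero_division=0) for t in thresholds]
best = thresholds[int(np.argmax(scores))]
print(f"best F1 threshold: {best:.2f}")
```

The calibration point in the comment is exactly the consequence the last bullet asks about: reweighting or resampling shifts the probability scale, so the operating threshold and any downstream risk estimates must be re-derived on untouched validation data.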