PracHub

Explain variance reduction in random forests

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of variance reduction in ensemble methods, the impact of inter-tree correlation on averaged predictors, and the bias-variance trade-off in random forests, framed within the Machine Learning domain for Data Scientist roles.

  • medium
  • LinkedIn
  • Machine Learning
  • Data Scientist

Explain variance reduction in random forests

Company: LinkedIn

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen

Consider a random forest (or bagged ensemble) that predicts at a fixed input \(x\) by averaging \(B\) tree predictions:

\[
\hat f_B(x) = \frac{1}{B}\sum_{b=1}^B T_b(x).
\]

Assume each tree prediction has the same variance \(\sigma^2\) and any pair of tree predictions has correlation \(\rho\).

  1. Derive the variance of the ensemble prediction \(\hat f_B(x)\).
  2. Explain how this connects to the variance formula for the average of correlated random variables.
  3. Interpret the result for the cases \(\rho=0\), \(\rho=1\), and \(B \to \infty\).
  4. Based on this result, explain why random forests use bagging and random feature subsampling, and discuss the bias-variance trade-off when making trees more random.

Quick Answer: With \(B\) trees of common variance \(\sigma^2\) and pairwise correlation \(\rho\), the variance of the averaged prediction is \(\operatorname{Var}(\hat f_B(x)) = \rho\sigma^2 + \frac{1-\rho}{B}\sigma^2\). Increasing \(B\) drives the second term to zero but leaves the \(\rho\sigma^2\) floor, so random forests attack \(\rho\) itself through bagging and random feature subsampling, accepting a small increase in per-tree bias in exchange for less correlated trees.
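A sketch of the standard derivation behind this answer, expanding the variance of the average and using \(\operatorname{Cov}(T_a(x), T_b(x)) = \rho\sigma^2\) for \(a \neq b\):

\[
\operatorname{Var}\big(\hat f_B(x)\big)
= \frac{1}{B^2}\left[\sum_{b=1}^{B} \operatorname{Var}\big(T_b(x)\big) + \sum_{a \neq b} \operatorname{Cov}\big(T_a(x), T_b(x)\big)\right]
= \frac{B\sigma^2 + B(B-1)\rho\sigma^2}{B^2}
= \rho\sigma^2 + \frac{1-\rho}{B}\sigma^2.
\]

At \(\rho = 0\) this reduces to \(\sigma^2/B\), the familiar rate for independent averaging; at \(\rho = 1\) it equals \(\sigma^2\), since averaging identical trees buys nothing; and as \(B \to \infty\) only the floor \(\rho\sigma^2\) remains, which is precisely the term that decorrelating the trees reduces.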

Related Interview Questions

  • Explain Logistic Regression, Backprop, and Adam - LinkedIn (medium)
  • Answer practical ML foundations questions - LinkedIn (medium)
  • Handle imbalance, sampling, and overfitting - LinkedIn (easy)
  • Handle imbalance, validate samples, and avoid overfitting - LinkedIn (easy)
  • Explain activations, losses, and Adam - LinkedIn (medium)
Posted: Feb 19, 2026

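The variance formula can also be sanity-checked numerically. The sketch below is illustrative, not a model of how real trees behave: it fabricates each of the \(B\) tree predictions as a Gaussian with a shared component, so that every prediction has variance \(\sigma^2\) and every pair has correlation \(\rho\), then compares the empirical variance of their average with \(\rho\sigma^2 + (1-\rho)\sigma^2/B\):

```python
import numpy as np

def ensemble_variance(rho, sigma, B, n_trials=200_000, seed=0):
    """Empirical variance of the average of B equicorrelated predictions.

    Each prediction is modeled as X_b = sigma * (sqrt(rho)*Z + sqrt(1-rho)*E_b),
    where Z is shared across trees and the E_b are independent, giving
    Var(X_b) = sigma^2 and Corr(X_a, X_b) = rho for a != b.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_trials, 1))   # shared component (drives rho)
    e = rng.standard_normal((n_trials, B))   # tree-specific noise
    preds = sigma * (np.sqrt(rho) * z + np.sqrt(1 - rho) * e)
    # Average the B predictions per trial, then take the variance across trials.
    return preds.mean(axis=1).var()

rho, sigma, B = 0.3, 2.0, 50
theory = rho * sigma**2 + (1 - rho) * sigma**2 / B
print(f"theory={theory:.4f}  empirical={ensemble_variance(rho, sigma, B):.4f}")
```

With 200,000 trials the empirical value lands close to the theoretical one, and setting \(\rho = 0\) or \(\rho = 1\) reproduces the two limiting cases in part 3.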



© 2026 PracHub. All rights reserved.