
Choose linear regression or decision tree appropriately

Last updated: Mar 29, 2026

Quick Overview

This question evaluates model selection and diagnostic skills in supervised learning, specifically assessing feature engineering, interaction detection, handling heteroskedastic residuals, incorporation of monotonicity or interaction constraints in tree-based models, and fair cross-validation-based comparison between linear and tree approaches.


Company: TikTok

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen



Oct 13, 2025, 9:49 PM

Choose Between Linear Regression and a Decision Tree Under a Hinge and Interaction DGP

Context

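You have 100,000 i.i.d. observations with features x1 (range 0–100), x2, x3, and target y. The true data-generating process (unknown to you) is piecewise linear in x1 with a break at x1 = 50, plus an interaction between x2 and x3. The prompt calls the break a hinge, although the indicator term as written is a level shift of 20 rather than a change in slope: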

  • y ≈ 3·x1 + 20·I[x1 > 50] + 2·(x2·x3) + ε
  • Heteroskedastic noise: Var(ε | x1) = 0.01·(1 + x1)
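
To make the setup concrete, the stated process can be simulated. This is a minimal sketch, not part of the original question: the function name simulate_dgp is hypothetical, and the distributions of x1, x2, x3 and the Gaussian shape of ε are assumptions, since the prompt gives only the range of x1 and the conditional noise variance.

```python
import numpy as np


def simulate_dgp(n=100_000, seed=0):
    """Draw n rows from the DGP stated in the question.

    Assumptions (not given in the question): x1 ~ Uniform(0, 100),
    x2 and x3 are independent standard normals, and the noise is Gaussian.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(0.0, 100.0, size=n)
    x2 = rng.standard_normal(n)
    x3 = rng.standard_normal(n)

    # Var(eps | x1) = 0.01 * (1 + x1), so the standard deviation grows with x1.
    noise_sd = np.sqrt(0.01 * (1.0 + x1))
    eps = rng.standard_normal(n) * noise_sd

    # Linear trend in x1, a level shift of 20 above x1 = 50, and an x2*x3 interaction.
    y = 3.0 * x1 + 20.0 * (x1 > 50.0) + 2.0 * x2 * x3 + eps
    X = np.column_stack([x1, x2, x3])
    return X, y, eps


X, y, eps = simulate_dgp()

# Compare the empirical noise variance at the two ends of the x1 range.
low = X[:, 0] < 10
high = X[:, 0] > 90
print("noise variance for x1 < 10:", eps[low].var())
print("noise variance for x1 > 90:", eps[high].var())
```

Under these assumptions the noise variance at the top of the x1 range should be many times larger than at the bottom, which is the pattern a residual-versus-x1 plot is meant to reveal.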

Task

Design an analysis to decide between linear regression and a decision tree. Specify:

  1. Feature engineering and tests you would run for linearity (e.g., spline basis for x1, x2:x3 interaction) and how you would check residual diagnostics for heteroskedasticity.
  2. A fair comparison protocol (CV split, identical preprocessing) and metrics.
  3. How you would enforce monotonicity or interaction constraints in a tree-based model to reflect domain knowledge.
  4. Which model you expect to generalize better here and why, including bias–variance reasoning and how you would quantify it with learning curves.

