PracHub

Explain SHAP vs VIF under collinearity

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of multicollinearity diagnostics (VIF), feature attribution methods (SHAP), and the interpretability and validation implications of near-duplicate predictors in binary classification models.

Company: TikTok

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen

Binary outcome with features A and B, corr(A,B)=0.98, plus other weakly correlated features. You fit (i) logistic regression and (ii) a gradient-boosted tree. (1) Compute/estimate VIF for A and B and interpret thresholds that indicate problematic multicollinearity. (2) Explain how SHAP values behave when A and B are near-duplicates (interventional vs conditional SHAP), and why attributions may be unstable or split unpredictably across A and B. (3) Propose a defensible interpretation workflow: feature clustering/grouped SHAP, permutation importance conditional on the other, and re-fitting after removing one of the pair; describe the diagnostics you would expect to see. (4) Recommend modeling changes (e.g., elastic net for GLM or feature grouping/penalization for trees) and how you would validate that interpretability and predictive performance are both acceptable.

Oct 13, 2025, 9:49 PM

High Collinearity in Binary Classification: VIF, SHAP, and Interpretation Strategy

You are modeling a binary outcome Y. Two numeric features A and B are highly correlated: corr(A, B) = 0.98. Other features exist but are only weakly correlated with A and B. You fit:

  • (i) a logistic regression (GLM), and
  • (ii) a gradient-boosted tree model (GBDT).

Answer the following:

  1. Variance Inflation Factor (VIF)
    • Compute/estimate the VIF for A and B given corr(A, B) = 0.98 (assume other features add little to R²).
    • Interpret typical VIF/tolerance thresholds that indicate problematic multicollinearity and what that means for logistic regression coefficients.
  2. SHAP with near-duplicate features
    • Explain how SHAP values behave when A and B are near-duplicates under interventional vs conditional SHAP formulations.
    • Why can attributions be unstable or split unpredictably across A and B (for both GLM and GBDT)?
  3. Defensible interpretation workflow
    • Propose a workflow to interpret such models: e.g., feature clustering and grouped SHAP, permutation importance conditional on the other feature, and re-fitting after removing one of the pair.
    • Describe the diagnostics you expect to see if A and B are redundant.
  4. Modeling recommendations and validation
    • Recommend modeling changes (e.g., elastic net for GLM; feature grouping or regularization choices for trees) to handle collinearity.
    • Describe how you would validate that both interpretability and predictive performance remain acceptable.
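For part 1, the VIF of a feature is 1 / (1 - R²), where R² comes from regressing that feature on all the others. If the other features add little, R² for A is roughly corr(A, B)² = 0.9604, so VIF is about 1 / 0.0396 ≈ 25.3 and tolerance is about 0.04. That is well past the common rule-of-thumb cutoffs (VIF above 5 or 10, tolerance below 0.2 or 0.1) and implies coefficient standard errors inflated by roughly sqrt(25.3) ≈ 5. The sketch below checks the closed form against synthetic data. The data-generating process, the independent feature C, and the helper name `vif` are assumptions made for illustration, not part of the question.

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the rest."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # include an intercept
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        r2 = 1.0 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

# Closed form when only A and B are materially correlated.
rho = 0.98
vif_closed_form = 1.0 / (1.0 - rho**2)  # 1 / 0.0396
tolerance = 1.0 - rho**2

# Synthetic check (hypothetical data): B is a noisy copy of A, C is independent.
rng = np.random.default_rng(0)
n = 50_000
A = rng.standard_normal(n)
B = rho * A + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
C = rng.standard_normal(n)
vifs = vif(np.column_stack([A, B, C]))

print(f"closed-form VIF: {vif_closed_form:.2f}, tolerance: {tolerance:.4f}")
print("empirical VIF (A, B, C):", np.round(vifs, 2))
```

The empirical values for A and B should land near the closed form, while C stays close to 1. In practice a library routine such as the VIF helper in statsmodels would normally replace the hand-rolled loop.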

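For parts 2 and 3, one way to make the instability concrete is to refit the logistic regression on bootstrap resamples and watch how credit moves between A and B. The sketch below does not call the shap library. It uses the closed form for a linear model when features are treated as independent (the interventional view), where the log-odds attribution of feature j at a point x is beta_j * (x_j - mean(x_j)). The synthetic data, the Newton-method fitter `fit_logit`, and the evaluation point `x0` are all illustrative assumptions.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Unpenalized logistic regression via Newton's method.
    Returns [intercept, coef_1, ..., coef_p]."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        w = p * (1.0 - p)
        grad = Xd.T @ (y - p)
        hess = Xd.T @ (Xd * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical data: the signal enters through A; B is a near-duplicate of A.
rng = np.random.default_rng(0)
n, rho = 2000, 0.98
A = rng.standard_normal(n)
B = rho * A + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
C = rng.standard_normal(n)
X = np.column_stack([A, B, C])
true_logit = 1.0 * A + 0.5 * C
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

# Attribute one fixed point across bootstrap refits.
x0 = np.array([1.0, 1.0, 0.0])
phis = []
for _ in range(200):
    idx = rng.integers(0, n, size=n)
    Xb, yb = X[idx], y[idx]
    beta = fit_logit(Xb, yb)
    # Linear-model attribution under feature independence, in log-odds units.
    phis.append(beta[1:] * (x0 - Xb.mean(axis=0)))
phis = np.array(phis)

sd_A = phis[:, 0].std()
sd_B = phis[:, 1].std()
sd_group = (phis[:, 0] + phis[:, 1]).std()  # grouped attribution for {A, B}
print(f"sd(phi_A)={sd_A:.3f}  sd(phi_B)={sd_B:.3f}  sd(phi_A+phi_B)={sd_group:.3f}")
```

Under these assumptions the individual attributions for A and B swing widely from one resample to the next, while their sum stays comparatively stable. That is the pattern that motivates reporting grouped SHAP for the {A, B} cluster rather than ranking A against B. The same resampling check could be repeated for the GBDT with a tree explainer, where split choices between A and B play the role that coefficient variance plays here.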