PracHub

Explain SHAP vs VIF under collinearity

Last updated: Jun 16, 2026

Quick Overview

This question evaluates understanding of multicollinearity diagnostics (VIF), feature attribution methods (SHAP), and the interpretability and validation implications of near-duplicate predictors in binary classification models.

Company: TikTok

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen

Binary outcome with features A and B, corr(A,B)=0.98, plus other weakly correlated features. You fit (i) logistic regression and (ii) a gradient-boosted tree. (1) Compute/estimate VIF for A and B and interpret thresholds that indicate problematic multicollinearity. (2) Explain how SHAP values behave when A and B are near-duplicates (interventional vs conditional SHAP), and why attributions may be unstable or split unpredictably across A and B. (3) Propose a defensible interpretation workflow: feature clustering/grouped SHAP, permutation importance conditional on the other, and re-fitting after removing one of the pair; describe the diagnostics you would expect to see. (4) Recommend modeling changes (e.g., elastic net for GLM or feature grouping/penalization for trees) and how you would validate that interpretability and predictive performance are both acceptable.
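As a worked sketch of part (1), assuming the R² from regressing A on all other features is driven almost entirely by B (so R_A² ≈ corr(A,B)²), the VIF and tolerance come out as follows. By symmetry the same value applies to B.

```latex
\mathrm{VIF}_A = \frac{1}{1 - R_A^2}
\approx \frac{1}{1 - 0.98^2}
= \frac{1}{1 - 0.9604}
= \frac{1}{0.0396}
\approx 25.3,
\qquad
\text{tolerance}_A = 1 - R_A^2 \approx 0.0396,
\qquad
\sqrt{\mathrm{VIF}_A} \approx 5.0
```

This is well above the common rule-of-thumb cutoffs of VIF > 5 or VIF > 10 (tolerance below 0.2 or 0.1). The square root suggests coefficient standard errors for A and B roughly five times larger than they would be without the collinearity. For logistic regression this should be read as an approximation, since VIF is defined on linear regressions among the predictors.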

Oct 13, 2025, 9:49 PM

High Collinearity in Binary Classification: VIF, SHAP, and Interpretation Strategy

You are modeling a binary outcome Y. Two numeric features A and B are highly correlated: corr(A, B) = 0.98. Other features exist but are only weakly correlated with A and B. You fit:

  • (i) a logistic regression (GLM), and
  • (ii) a gradient-boosted tree model (GBDT).

Answer the following:

  1. Variance Inflation Factor (VIF)
    • Compute or estimate the VIF for A and B given corr(A, B) = 0.98 (assume the other features add little to the R² obtained when A or B is regressed on the remaining features).
    • Interpret the typical VIF and tolerance thresholds that indicate problematic multicollinearity, and explain what they imply for logistic regression coefficients.
  2. SHAP with near-duplicate features
    • Explain how SHAP values behave when A and B are near-duplicates under interventional vs conditional SHAP formulations.
    • Why can attributions be unstable or split unpredictably across A and B (for both GLM and GBDT)?
  3. Defensible interpretation workflow
    • Propose a workflow to interpret such models: e.g., feature clustering and grouped SHAP, permutation importance conditional on the other feature, and re-fitting after removing one of the pair.
    • Describe the diagnostics you expect to see if A and B are redundant.
  4. Modeling recommendations and validation
    • Recommend modeling changes (e.g., elastic net for GLM; feature grouping or regularization choices for trees) to handle collinearity.
    • Describe how you would validate that both interpretability and predictive performance remain acceptable.
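The points about VIF, unstable attributions, and grouped SHAP can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration: the data are simulated, the two coefficient vectors `w_all_on_a` and `w_split` are hypothetical stand-ins for two logistic fits that predict almost the same log-odds, and `linear_shap` uses the closed form for interventional SHAP on a linear model, phi_j = w_j (x_j - mean_j). It is not a substitute for running a SHAP library on a fitted model.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_shap(weights, row, means):
    """Interventional SHAP on the logit scale for a linear model."""
    return [w * (v - m) for w, v, m in zip(weights, row, means)]


random.seed(0)
rho, n = 0.98, 5000
A = [random.gauss(0.0, 1.0) for _ in range(n)]
# B is built so that its population correlation with A equals rho.
B = [rho * a + math.sqrt(1.0 - rho ** 2) * random.gauss(0.0, 1.0) for a in A]

# Two-feature VIF: R^2 of A regressed on B equals corr(A, B)^2.
r = pearson(A, B)
vif = 1.0 / (1.0 - r ** 2)

means = [sum(A) / n, sum(B) / n]
# Hypothetical coefficient vectors: same total weight, different split.
w_all_on_a = [1.0, 0.0]
w_split = [0.5, 0.5]

single_gap = 0.0  # disagreement on A's individual attribution
group_gap = 0.0   # disagreement on the grouped (A + B) attribution
for row in zip(A, B):
    phi_1 = linear_shap(w_all_on_a, row, means)
    phi_2 = linear_shap(w_split, row, means)
    single_gap += abs(phi_1[0] - phi_2[0])
    group_gap += abs(sum(phi_1) - sum(phi_2))
mean_single_gap = single_gap / n
mean_group_gap = group_gap / n

print(f"corr(A, B) = {r:.3f}, VIF = {vif:.1f}")
print(f"mean |gap| in A's attribution:    {mean_single_gap:.3f}")
print(f"mean |gap| in grouped attribution: {mean_group_gap:.3f}")
```

If A and B are redundant, the expected diagnostic is the pattern this sketch shows: per-feature attributions for A move substantially between the two weightings, while the summed attribution for the {A, B} group changes much less. That is the motivation for reporting grouped SHAP for the pair rather than ranking A and B separately.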