
Select and tune XGBoost hyperparameters

Last updated: Mar 29, 2026

Quick Overview

This question evaluates skills in selecting and tuning XGBoost hyperparameters, managing severe class imbalance and sparse one‑hot encodings, handling missing values, and designing compute‑efficient training and grouped cross‑validation to prevent user‑level leakage.



Company: OneMain Financial

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen



Oct 13, 2025, 9:49 PM

Binary Classification Under Compute and Imbalance Constraints

Context

You are training an XGBoost model for a binary classification problem with:

  • 1,000,000 rows, 100 features (20 numeric, 80 categorical that are one‑hot encoded)
  • Positive class rate ≈ 1% (10,000 positives / 990,000 negatives)
  • Hardware: single 16‑core CPU, 32 GB RAM
  • Wall‑clock training time budget: ≤ 5 minutes

Assume you can provide a user_id to group rows (to prevent leakage in validation) and that features may contain missing values (NaNs). The one‑hot columns are sparse 0/1 indicators.
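The figures above imply two numbers that are handy when reasoning about the constraints: the negative-to-positive ratio (the value commonly used as a starting point for scale_pos_weight) and the rough size of the feature matrix. The sketch below is plain arithmetic on the stated counts; the dense float32 storage is an assumption made only for illustration, since sparse storage of the one-hot columns would be smaller.

```python
# Figures taken directly from the problem statement.
n_rows = 1_000_000
n_features = 100
n_pos = 10_000
n_neg = 990_000

# Negatives per positive: a common starting value for scale_pos_weight.
neg_pos_ratio = n_neg / n_pos

# Assumption for illustration: every feature stored as a dense 4-byte float.
dense_bytes = n_rows * n_features * 4
dense_mb = dense_bytes / 1e6

print(f"positive rate: {n_pos / n_rows:.2%}")        # positive rate: 1.00%
print(f"neg/pos ratio: {neg_pos_ratio:.0f}")         # neg/pos ratio: 99
print(f"dense float32 matrix: {dense_mb:.0f} MB")    # dense float32 matrix: 400 MB
```

Under that assumption the raw matrix takes about 400 MB, well inside the 32 GB of RAM, which suggests the 5-minute wall-clock budget is the tighter constraint rather than memory.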

Tasks

  1. Propose initial XGBoost hyperparameters (eta/learning_rate, max_depth, min_child_weight, subsample, colsample_bytree, lambda, alpha, n_estimators, max_bin or tree_method) and justify each in terms of bias–variance, class imbalance, and compute constraints.
  2. Describe an efficient tuning strategy: search space, early stopping, and a cross‑validation scheme that prevents leakage from users appearing in multiple folds.
  3. Explain exactly how XGBoost handles missing values during tree splitting and how that interacts with one‑hot encoding vs target encoding.
  4. Given severe minority‑class scarcity, compare using scale_pos_weight vs weighted loss vs focal loss; when would each be preferable?
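For the leakage requirement in task 2, the core idea is that every row belonging to one user_id must land in the same fold. The following is a minimal, dependency-free sketch of that idea; the function name group_kfold_indices, the greedy balancing rule, and the toy user_ids are all hypothetical choices made for illustration, and in practice a library splitter such as scikit-learn's GroupKFold would likely be used instead.

```python
from collections import defaultdict


def group_kfold_indices(groups, n_splits=5):
    """Yield (train_idx, valid_idx) pairs in which no group spans both sides."""
    rows_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        rows_by_group[g].append(i)
    if len(rows_by_group) < n_splits:
        raise ValueError("need at least n_splits distinct groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest rows.
    fold_rows = [[] for _ in range(n_splits)]
    ordered = sorted(rows_by_group, key=lambda g: (-len(rows_by_group[g]), str(g)))
    for g in ordered:
        smallest = min(range(n_splits), key=lambda f: len(fold_rows[f]))
        fold_rows[smallest].extend(rows_by_group[g])

    for k in range(n_splits):
        valid = sorted(fold_rows[k])
        train = sorted(i for f in range(n_splits) if f != k for i in fold_rows[f])
        yield train, valid


# Toy data (hypothetical): ten rows from six users.
user_ids = ["u1", "u1", "u2", "u3", "u3", "u3", "u4", "u5", "u5", "u6"]

for train_idx, valid_idx in group_kfold_indices(user_ids, n_splits=3):
    train_users = {user_ids[i] for i in train_idx}
    valid_users = {user_ids[i] for i in valid_idx}
    # The property that prevents user-level leakage.
    assert train_users.isdisjoint(valid_users)
    print("validation users:", sorted(valid_users))
```

This sketch balances folds by row count only and does not stratify on the label. With a 1% positive rate it is worth checking how many positives each fold receives before trusting per-fold metrics or early stopping.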
