PracHub

Address Missing Income Bracket in California Housing Data

Last updated: Mar 29, 2026

Quick Overview

This question evaluates competency in applied machine learning for regression under covariate shift, focusing on diagnosing missing support, robust modeling, and distribution-shift mitigation for an unseen low-income bracket.

  • hard
  • Upstart
  • Machine Learning
  • Data Scientist


Company: Upstart

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite

##### Scenario

On-site ML case: income bracket missing in California housing data.

##### Question

Training data lack the lowest-income bracket (< $25k). Build a model that will still perform well across all income ranges, including the unseen bracket.

##### Hints

Use domain similarity, incremental retraining, covariate shift correction, transfer learning, feature scaling.


Aug 4, 2025, 10:55 AM

ML Case: Missing Lowest-Income Bracket in California Housing Data

Context

You're building a supervised regression model to predict California housing prices from a dataset similar to the classic California Housing data. One key covariate is household income. The training data contain no observations from the lowest-income bracket (< $25k), but the deployed model must perform well across all income ranges, including this unseen bracket at inference time.

Assume the deployment/test distribution will include the full income range, including < $25k. You may optionally have access to unlabeled production covariates (features only) that include the missing bracket.
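
If unlabeled production covariates are available, a first check is whether they really contain income values that training never saw. The sketch below is one minimal way to do that with the standard library. Only the $25k edge comes from the question; the other bracket edges, the function names, and the sample incomes are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical bracket edges in dollars; only the 25_000 edge is from the question.
BRACKET_EDGES = [25_000, 50_000, 100_000]


def bracket_shares(incomes, edges=BRACKET_EDGES):
    """Fraction of rows in each bracket (index 0 is the lowest, below edges[0])."""
    counts = [0] * (len(edges) + 1)
    for income in incomes:
        counts[bisect_right(edges, income)] += 1
    return [c / len(incomes) for c in counts]


def missing_support(train_incomes, prod_incomes, edges=BRACKET_EDGES):
    """Return (brackets seen in production but not in training,
    share of production rows below the smallest training income)."""
    train = bracket_shares(train_incomes, edges)
    prod = bracket_shares(prod_incomes, edges)
    missing = [i for i, (t, p) in enumerate(zip(train, prod)) if t == 0 and p > 0]
    floor = min(train_incomes)
    below = sum(x < floor for x in prod_incomes) / len(prod_incomes)
    return missing, below


# Toy values, not real data.
train_incomes = [30_000, 42_000, 60_000, 75_000, 120_000]
prod_incomes = [12_000, 18_000, 30_000, 55_000, 110_000]
missing, below = missing_support(train_incomes, prod_incomes)
print("brackets without training support:", missing)
print("share of production rows below training minimum:", below)
```

A non-empty `missing` list signals missing support rather than a mere reweighting problem, which matters for the choice of mitigation.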

Task

Design a modeling approach that achieves robust performance across all income ranges, with special attention to the unseen lowest-income bracket. Your answer should cover:

  1. Diagnostics: How you’d confirm and quantify the shift and missing support.
  2. Modeling strategy: Architectures/algorithms that extrapolate sensibly and incorporate domain knowledge.
  3. Distribution shift handling: Methods such as importance weighting, domain adaptation/transfer learning, and data augmentation (if appropriate).
  4. Feature scaling and preprocessing choices that help stability.
  5. Validation: How you will evaluate performance for the unseen bracket before production, stress tests, and uncertainty estimates.
  6. Deployment and incremental retraining plan once data from the missing bracket starts arriving.
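
For item 3, importance weighting can be sketched with a simple per-bracket density ratio. This is an illustrative histogram estimator, not a prescribed method: the bracket edges, the clipping threshold, and the toy incomes are assumptions. It also reports how much production mass falls in brackets with no training rows, since no reweighting of existing rows can cover that region.

```python
from bisect import bisect_right


def bin_shares(values, edges):
    """Fraction of values in each bin defined by sorted edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]


def importance_weights(train_x, prod_x, edges, clip=10.0):
    """Per-row training weights approximating p_prod(x) / p_train(x).

    Returns (weights normalised to mean 1, production mass in bins
    that contain no training rows).
    """
    p_train = bin_shares(train_x, edges)
    p_prod = bin_shares(prod_x, edges)
    # Every training row sits in a bin with p_train > 0, so the ratio is defined.
    raw = [min(p_prod[bisect_right(edges, v)] / p_train[bisect_right(edges, v)], clip)
           for v in train_x]
    mean_w = sum(raw) / len(raw)
    weights = [w / mean_w for w in raw]
    uncovered = sum(p for p, q in zip(p_prod, p_train) if q == 0)
    return weights, uncovered


# Toy values and hypothetical edges; only the 25_000 edge is from the question.
edges = [25_000, 50_000, 100_000]
train_x = [30_000, 42_000, 60_000, 75_000, 120_000]
prod_x = [12_000, 18_000, 30_000, 55_000, 110_000]
weights, uncovered = importance_weights(train_x, prod_x, edges)
print("weights:", [round(w, 3) for w in weights])
print("production mass with no training support:", uncovered)
```

The weights could be passed as per-sample weights to whichever regressor is used. A large `uncovered` value is the cue to rely on the other levers in the list (domain knowledge, transfer learning, incremental retraining) rather than on weighting alone.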

You may reference techniques like domain similarity, incremental retraining, covariate shift correction, transfer learning, and feature scaling.
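
One way to approach item 5 before any low-income labels exist is a simulated extrapolation test: hold out the lowest bracket that is observed, fit on the rest, and compare in-range error with error on the held-out slice. The sketch below uses synthetic data with an assumed concave income-price relationship and a one-feature least-squares line. The data, the $50k proxy cutoff, and the helper names are all illustrative assumptions.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mae(model, xs, ys):
    """Mean absolute error of a (slope, intercept) model."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Synthetic stand-in for the training set: income in $k starts at 25,
# mirroring the missing < $25k bracket. The price curve is made up.
incomes = list(range(25, 155, 5))
prices = [50 + 40 * math.sqrt(x) for x in incomes]

PROXY_CUTOFF = 50  # hypothetical: treat [25k, 50k) as a proxy for the unseen bracket
fit_rows = [(x, y) for x, y in zip(incomes, prices) if x >= PROXY_CUTOFF]
low_rows = [(x, y) for x, y in zip(incomes, prices) if x < PROXY_CUTOFF]

model = fit_line(*zip(*fit_rows))
mae_in = mae(model, *zip(*fit_rows))
mae_low = mae(model, *zip(*low_rows))
print(f"in-range MAE: {mae_in:.2f}, held-out low-bracket MAE: {mae_low:.2f}")
```

A large gap between the two errors suggests the model family extrapolates poorly downward. The proxy slice is only a stand-in, so the true < $25k error may be worse than what this test shows.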

