PracHub

Engineer ZIP Features and Handle Missingness

Last updated: Apr 2, 2026

Quick Overview

This question evaluates feature engineering with geographic data, strategies for handling missing ZIP codes and external data joins, and awareness of fairness, leakage, privacy, and validation concerns when using demographic variables.

Company: Intuit

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen

You are building a predictive model for users, and for some users you have address data. Many records include ZIP code, but some do not. You may also join external public data, such as census-style demographic summaries, using ZIP code.

  1. What address-derived or ZIP-linked features would you consider using as model inputs?
  2. How would you handle missing ZIP codes?
  3. What risks would you watch for when using geographic and demographic variables, and how would you test whether these features actually help the model?

Assume the target is not specified; answer in a general way that would be appropriate for a product-focused data science interview.
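As a starting point for parts 1 and 2, the following is a minimal sketch and not a reference answer. It assumes a hypothetical ZIP-level lookup table (`ZIP_STATS`, with made-up column names and values) standing in for a census-style public dataset, and shows one common pattern: normalize the ZIP, emit explicit missing and unmatched indicators, derive a coarser ZIP3 prefix, and fall back to a median for the joined numeric features.

```python
import re
from statistics import median

# Hypothetical ZIP-level table; names and values are illustrative only.
ZIP_STATS = {
    "94043": {"median_income": 150000.0, "pop_density": 2500.0},
    "10001": {"median_income": 100000.0, "pop_density": 30000.0},
    "73301": {"median_income": 60000.0, "pop_density": 1200.0},
}
NUMERIC_COLUMNS = ("median_income", "pop_density")


def normalize_zip(raw):
    """Return a 5-digit ZIP string, or None if missing or malformed."""
    if raw is None:
        return None
    # Accept "12345" or "12345-6789", ignoring surrounding whitespace.
    match = re.fullmatch(r"\s*(\d{5})(?:-\d{4})?\s*", str(raw))
    return match.group(1) if match else None


def zip_features(raw_zip, stats=ZIP_STATS):
    """Build ZIP-derived features with explicit missingness indicators."""
    zip5 = normalize_zip(raw_zip)
    matched = zip5 in stats
    # Fallback values: in a real pipeline these should be computed on the
    # training split only, so no information leaks from validation data.
    fallback = {
        name: median(row[name] for row in stats.values())
        for name in NUMERIC_COLUMNS
    }
    row = stats[zip5] if matched else fallback
    return {
        "zip_missing": int(zip5 is None),  # no usable ZIP on the record
        "zip_unmatched": int(zip5 is not None and not matched),  # join miss
        "zip3": zip5[:3] if zip5 else "unknown",  # coarser regional bucket
        "median_income": row["median_income"],
        "pop_density": row["pop_density"],
    }


print(zip_features("10001-0001"))
print(zip_features(None))
```

Keeping `zip_missing` and `zip_unmatched` as separate flags lets the model learn from missingness itself, which may or may not be informative. Whether any of these features help is an empirical question for part 3, for example by comparing validation metrics with and without them.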

Feb 10, 2026, 12:00 AM