
Build Classifier: Evaluate with AUROC for Imbalanced Data

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in building and evaluating binary classifiers for imbalanced datasets, focusing on model design, evaluation metric selection, and performance interpretation.

Company: Google

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen

##### Scenario

Detecting dead links with a data set of 1,000 labeled URLs (good vs. bad).

##### Question

Walk through how you would build and evaluate a classifier. Which metric would you choose and why might AUROC be preferred over accuracy for imbalanced data?

##### Hints

Logistic regression plus class-imbalance-aware metrics.

Jul 12, 2025, 6:59 PM

Detecting Dead Links: Build and Evaluate a Classifier

Scenario

You have a dataset of 1,000 URLs labeled as good (alive) or bad (dead). The classes are likely imbalanced (e.g., far fewer dead links than good ones).

Task

  1. Describe how you would build the classifier end-to-end (data prep, features, model, validation, and deployment considerations).
  2. Explain which evaluation metric(s) you would choose for imbalanced data.
  3. Clarify why AUROC might be preferred over accuracy when the classes are imbalanced.
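
To make the third task concrete, here is a minimal sketch of why accuracy can mislead on imbalanced labels while AUROC does not. The 950/50 class split is an assumption chosen for illustration (the scenario only says the classes are likely imbalanced), and `accuracy` and `auroc` are small helpers written for this example rather than library functions. AUROC is computed from its ranking definition: the probability that a randomly chosen dead link receives a higher score than a randomly chosen good link, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(y_true, scores):
    """P(random positive scores above random negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Assumed split for illustration: 950 good links (0) and 50 dead links (1).
y_true = [0] * 950 + [1] * 50

# A degenerate model: every URL is called "good" and gets the same score.
always_good_pred = [0] * 1000
always_good_score = [0.0] * 1000

baseline_accuracy = accuracy(y_true, always_good_pred)
baseline_auroc = auroc(y_true, always_good_score)

print(f"accuracy = {baseline_accuracy:.3f}")  # 0.950
print(f"AUROC    = {baseline_auroc:.3f}")     # 0.500

# Small sanity check of the ranking definition: 3 of 4 pairs are ordered correctly.
toy_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"toy AUROC = {toy_auroc:.2f}")         # 0.75
```

The model that never flags a dead link scores 95% accuracy purely from the class prior, yet its AUROC is 0.5, the value of an uninformative ranking. The pairwise loop is quadratic and fine for 1,000 URLs; a sort-based version would be the usual choice for larger data.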

Hint: A strong baseline is logistic regression with class-imbalance-aware metrics.
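
One way to read the hint is a logistic regression whose loss is reweighted so that each class contributes equally, evaluated on a stratified hold-out set. The sketch below is a plain-Python illustration under several assumptions: the two numeric features are hypothetical stand-ins (for example, a standardized count of recent error responses and days since the last successful fetch), the data is synthetic, and the 950/50 split, learning rate, and epoch count are arbitrary choices. It is not a reference solution.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_weighted_logreg(X, y, epochs=200, lr=0.1):
    """Batch gradient descent on class-weighted log loss."""
    n, d = len(X), len(X[0])
    n_pos = sum(y)
    n_neg = n - n_pos
    # "Balanced" weights: both classes carry the same total weight in the loss.
    w_pos, w_neg = n / (2.0 * n_pos), n / (2.0 * n_neg)
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias)
            err = (w_pos if yi == 1 else w_neg) * (p - yi)
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(X, weights, bias):
    """Predicted probability that each URL is dead."""
    return [sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) for xi in X]


def make_urls(n, center):
    """Synthetic, hypothetical two-feature rows drawn around `center`."""
    return [[random.gauss(center, 1.0), random.gauss(center, 1.0)] for _ in range(n)]


random.seed(0)
good, dead = make_urls(950, 0.0), make_urls(50, 2.0)  # assumed 95/5 imbalance

# Stratified 80/20 split: hold out the same fraction of each class.
train_X = good[:760] + dead[:40]
train_y = [0] * 760 + [1] * 40
test_good, test_dead = good[760:], dead[40:]

weights, bias = fit_weighted_logreg(train_X, train_y)
p_good = predict_proba(test_good, weights, bias)
p_dead = predict_proba(test_dead, weights, bias)

mean_good = sum(p_good) / len(p_good)
mean_dead = sum(p_dead) / len(p_dead)
dead_recall = sum(p >= 0.5 for p in p_dead) / len(p_dead)

print(f"mean P(dead) on held-out good links: {mean_good:.3f}")
print(f"mean P(dead) on held-out dead links: {mean_dead:.3f}")
print(f"recall on dead links at threshold 0.5: {dead_recall:.2f}")
```

With only about 10 dead links in the hold-out set, any single split gives a noisy estimate, so repeated stratified cross-validation is a reasonable refinement. Note that class weighting shifts the predicted probabilities away from the true base rate, so the 0.5 threshold here is illustrative rather than calibrated.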
