PracHub

Build and evaluate bad-link classifier

Last updated: Apr 17, 2026

Quick Overview

This question evaluates proficiency in applied machine learning classification, including feature design, training a logistic regression, handling severe class imbalance, selecting evaluation metrics and calibration, choosing thresholds under asymmetric costs, and planning offline-to-online validation and monitoring.

Company: Google

Role: Data Scientist

Category: Machine Learning

Difficulty: Medium

Interview Round: Technical Screen

You have 1,000 URLs labeled as bad or good and a much larger unlabeled pool, with bad links rare. Design features and train a logistic regression. Explain your evaluation plan under class imbalance: stratified K-folds, ROC-AUC vs PR-AUC, calibration (reliability curves), and why accuracy is misleading. Choose a decision threshold by minimizing expected misclassification cost given asymmetric costs. Discuss class weighting or resampling, leakage checks, monitoring for dataset shift between labeled and production traffic, and an offline-to-online validation plan with shadow or canary deployment.
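The threshold-selection and PR-AUC parts of the prompt can be sketched in plain Python with no library dependencies. This is a minimal illustration, not a reference solution: the function names (`bayes_threshold`, `total_cost`, `best_empirical_threshold`, `average_precision`), the toy labels and scores, and the 1:10 cost ratio are all assumptions made up for the example. It assumes label 1 means "bad" and that scores are model estimates of P(bad).

```python
from typing import Sequence, Tuple


def bayes_threshold(cost_fp: float, cost_fn: float) -> float:
    """Cost-minimizing threshold when scores are calibrated P(bad).

    Flag a URL when p * cost_fn >= (1 - p) * cost_fp,
    which rearranges to p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def total_cost(y_true: Sequence[int], scores: Sequence[float],
               threshold: float, cost_fp: float, cost_fn: float) -> float:
    """Sum of misclassification costs when flagging every score >= threshold."""
    cost = 0.0
    for y, s in zip(y_true, scores):
        flagged = s >= threshold
        if flagged and y == 0:
            cost += cost_fp   # good link wrongly flagged
        elif not flagged and y == 1:
            cost += cost_fn   # bad link missed
    return cost


def best_empirical_threshold(y_true: Sequence[int], scores: Sequence[float],
                             cost_fp: float, cost_fn: float) -> Tuple[float, float]:
    """Scan observed scores as candidate thresholds and return (threshold, cost).

    Useful as a cross-check when calibration is imperfect. float('inf') is
    included as the "flag nothing" candidate; ties go to the lower threshold.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    costs = [(total_cost(y_true, scores, t, cost_fp, cost_fn), t) for t in candidates]
    cost, threshold = min(costs)
    return threshold, cost


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """PR-AUC estimated as average precision: mean precision at each positive's rank.

    Simplification: tied scores are ranked in input order rather than grouped.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy held-out data (made up): 8 good links, 2 bad links.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.60, 0.35, 0.90]
COST_FP, COST_FN = 1.0, 10.0  # assumed: a missed bad link costs 10x a false alarm

# Predicting "good" for everything is right on every good link, so its accuracy
# equals the share of good links even though it catches no bad link at all.
always_good_accuracy = y_true.count(0) / len(y_true)

theoretical = bayes_threshold(COST_FP, COST_FN)
empirical, empirical_cost = best_empirical_threshold(y_true, scores, COST_FP, COST_FN)
pr_auc = average_precision(y_true, scores)
print(always_good_accuracy, theoretical, empirical, empirical_cost, pr_auc)
```

With roughly 1,000 labels and few positives, the empirical threshold is probably best chosen on out-of-fold predictions from stratified K-folds rather than on training scores, and compared against the theoretical threshold; a large gap between the two is one hint that the probabilities are poorly calibrated.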

Oct 13, 2025, 9:49 PM