
Build and evaluate click prediction models

Last updated: Mar 29, 2026

Quick Overview

This question evaluates skills in supervised probabilistic modeling for click-through rate prediction, including model selection, calibration, evaluation metric choice, cross-validation, hyperparameter tuning, and production concerns like serving and monitoring.

  • medium
  • Reddit
  • Machine Learning
  • Machine Learning Engineer

Company: Reddit

Role: Machine Learning Engineer

Category: Machine Learning

Difficulty: medium

Interview Round: Technical Screen

Build a click-through prediction model from the provided features. Start with a trivial baseline classifier, then train and compare logistic regression, random forest, and gradient-boosted trees. Explain why you selected each algorithm and why you did not choose plausible alternatives (e.g., SVM, Naive Bayes, simple neural networks), discussing bias–variance, interpretability, and computational trade-offs. Describe your cross-validation strategy and the hyperparameters you would tune for each model. Choose evaluation metrics appropriate for a roughly class-balanced dataset (e.g., ROC AUC, log loss, PR AUC, accuracy) and justify why these are preferred over others; explain what would change if classes were imbalanced. Finally, outline what additional improvements you would pursue with more time.
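To make the metric choice concrete, here is a minimal sketch of two of the metrics named above, written in plain Python. The function names `log_loss` and `roc_auc` are illustrative (not taken from the question), and the pairwise AUC is quadratic in the number of rows, so it is only meant for small examples.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def roc_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and two score vectors with the same ordering (assumed data).
y = [1, 0, 1, 0]
sharp = [0.9, 0.1, 0.8, 0.2]
flat = [0.51, 0.49, 0.51, 0.49]

print(roc_auc(y, sharp), roc_auc(y, flat))    # identical ranking quality
print(log_loss(y, sharp), log_loss(y, flat))  # very different probability quality
```

ROC AUC depends only on the ordering of scores, while log loss also penalizes probabilities that are too timid or too confident. This is one reason to report both when the stated goal is probabilities rather than only a ranking.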

Sep 6, 2025, 12:00 AM

Click-Through Rate (CTR) Prediction: Build, Compare, and Justify Models

Context

You are given a tabular dataset for binary click prediction (click = 1, no click = 0). The goal is to produce well-calibrated click probabilities that can be used for ranking and downstream decisions. Assume features include user, content/ad, and context signals (e.g., user/device attributes, ad/category IDs, time features, historical interaction counts). The class distribution is roughly balanced (e.g., 40–60% positives).
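One way to check whether predicted probabilities are "well-calibrated" is to bin predictions and compare the mean predicted probability in each bin with the observed click rate. The sketch below assumes equal-width bins and uses hypothetical helper names (`calibration_table`, `expected_calibration_error`); it is an illustration, not a prescribed method.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Return (count, mean predicted probability, observed click rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    rows = []
    for pairs in bins:
        if pairs:
            n = len(pairs)
            mean_pred = sum(p for _, p in pairs) / n
            click_rate = sum(y for y, _ in pairs) / n
            rows.append((n, mean_pred, click_rate))
    return rows


def expected_calibration_error(y_true, y_prob, n_bins=5):
    """Count-weighted average gap between predicted and observed rates across bins."""
    rows = calibration_table(y_true, y_prob, n_bins)
    total = sum(n for n, _, _ in rows)
    return sum(n / total * abs(mean_pred - rate) for n, mean_pred, rate in rows)


# Toy data (assumed): predictions of 0.5 are calibrated only if half the rows are clicks.
print(expected_calibration_error([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
print(expected_calibration_error([1, 1, 1, 1], [0.5, 0.5, 0.5, 0.5]))
```

A model can rank well and still show large gaps in this table, in which case a post-hoc calibration step (for example Platt scaling or isotonic regression) is a common remedy.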

Task

  1. Establish a trivial baseline classifier.
  2. Train and compare three models: logistic regression, random forest, and gradient-boosted trees.
  3. Explain why you selected each algorithm and why you did not choose plausible alternatives (e.g., SVM, Naive Bayes, simple neural networks), discussing bias–variance, interpretability, and computational trade-offs.
  4. Describe your cross-validation strategy and the key hyperparameters you would tune for each model.
  5. Choose evaluation metrics appropriate for a roughly class-balanced dataset (e.g., ROC AUC, log loss, PR AUC, accuracy), justify them, and explain what would change if classes were imbalanced.
  6. Outline additional improvements you would pursue with more time (features, modeling, calibration, serving/monitoring).
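Steps 1 and 4 can be sketched together: a trivial baseline that always predicts the training click rate, scored with stratified k-fold cross-validation. This is a minimal plain-Python illustration with hypothetical function names; it assumes rows are independent, so for time-ordered click logs a time-based split may be the safer choice.

```python
import math
import random


def stratified_folds(y, k=5, seed=0):
    """Assign row indices to k folds so each fold has a similar click rate."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    return folds


def base_rate_baseline(y_train):
    """Trivial baseline: the training click rate, predicted for every row."""
    return sum(y_train) / len(y_train)


def cv_baseline_log_loss(y, k=5, seed=0, eps=1e-15):
    """Per-fold log loss of the base-rate baseline, fit on the other k-1 folds."""
    scores = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        y_train = [v for i, v in enumerate(y) if i not in held]
        p = min(max(base_rate_baseline(y_train), eps), 1 - eps)
        loss = -sum(
            y[i] * math.log(p) + (1 - y[i]) * math.log(1 - p) for i in held_out
        ) / len(held_out)
        scores.append(loss)
    return scores


# Toy balanced labels (assumed data): 50 clicks and 50 non-clicks.
y = [1, 0] * 50
print(cv_baseline_log_loss(y, k=5))
```

On perfectly balanced labels the baseline predicts 0.5 everywhere, so its log loss is ln 2 (about 0.693) in every fold. Logistic regression, random forest, and gradient-boosted trees would be scored on the same folds, and each needs to beat this number to justify its added complexity.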

