
Design a CVR model for RTB bidding

Last updated: Jun 15, 2026

Quick Overview

This Trade Desk data-science screen asks you to design an end-to-end conversion-rate (CVR) model for real-time bidding (RTB) at a DSP. It covers RTB system roles, label and attribution-window definition, bid-time feature engineering and leakage, logistic-regression vs LightGBM tradeoffs, log-loss training, class-imbalance handling with calibration, PR-AUC vs ROC-AUC evaluation, and latency-aware production serving with safe deployment.

Company: Tradedesk

Role: Data Scientist

Category: Machine Learning

Difficulty: easy

Interview Round: Technical Screen



Oct 18, 2025, 12:00 AM
Question

You are a data scientist at a Demand-Side Platform (DSP) such as The Trade Desk, participating in Real-Time Bidding (RTB). For each ad opportunity (impression), your system must decide in tens of milliseconds whether to bid, how much to bid, and which creative/ad to show. You are asked to design an end-to-end ML approach to predict conversion probability (CVR) for a campaign (e.g., “Nike shoes”).

Historical data you may have:

  • Impressions table: impression_id, timestamp, user/device/context, publisher/app/site, geo, auction metadata, bid price, win/loss, etc.
  • Clicks table: impression_id, click timestamp (optional, sparse)
  • Conversions table: impression_id (or user-level attribution key), conversion timestamp/value (very sparse)
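
To make the relationship between these tables concrete, here is a minimal sketch of deriving a per-impression label from them. It assumes a 7-day post-impression window, treats impressions whose window has not yet closed as censored, and joins conversions on `impression_id`. The tuple layouts and the names `label_impressions`, `ATTRIBUTION_WINDOW`, and `as_of` are hypothetical, and attribution rules such as last-click vs view-through are deliberately left out.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed window length; the question leaves this choice to the candidate.
ATTRIBUTION_WINDOW = timedelta(days=7)


def label_impressions(impressions, conversions, as_of):
    """Return {impression_id: 1 | 0 | None}; None marks a censored row.

    impressions: iterable of (impression_id, impression_time), won impressions only
    conversions: iterable of (impression_id, conversion_time)
    as_of:       time at which the training snapshot was taken
    """
    conv_times = defaultdict(list)
    for imp_id, conv_time in conversions:
        conv_times[imp_id].append(conv_time)

    labels = {}
    for imp_id, imp_time in impressions:
        window_end = imp_time + ATTRIBUTION_WINDOW
        # Only conversions inside the window and already visible at snapshot time count.
        latest_visible = min(window_end, as_of)
        converted = any(
            imp_time <= t <= latest_visible for t in conv_times.get(imp_id, [])
        )
        if converted:
            labels[imp_id] = 1
        elif window_end <= as_of:
            labels[imp_id] = 0      # window fully observed, no conversion seen
        else:
            labels[imp_id] = None   # window still open: censored, not a true negative
    return labels


example_labels = label_impressions(
    impressions=[
        ("a", datetime(2025, 1, 1)),
        ("b", datetime(2025, 1, 1)),
        ("c", datetime(2025, 1, 10)),
    ],
    conversions=[
        ("a", datetime(2025, 1, 3)),    # inside the 7-day window
        ("b", datetime(2025, 1, 12)),   # outside the 7-day window
    ],
    as_of=datetime(2025, 1, 12),
)
```

In this sketch, rows labeled `None` would be dropped or handled by a delayed-feedback method rather than trained on as negatives; which option fits is part of what the question asks you to argue.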

Address the following:

  1. RTB system understanding. Explain what RTB is and the roles of the advertiser, the ad exchange, and the DSP. When an ad opportunity arrives, walk through what happens in milliseconds. How does the DSP decide whether to bid, how much to bid, and which ad/creative to show?
  2. Learning target. Clearly define what “predict conversion over impression” means here. Choose an attribution window and a precise label definition. What is the prediction unit and time window (e.g., conversion within 7 days of impression)? How do you handle attribution rules (last-click vs view-through) and label delay/censoring?
  3. Feature engineering. Propose a realistic RTB feature set that is available at bid time, organized across user/context, publisher/placement, device/geo/time, ad/creative, advertiser/campaign, frequency/recency, and historical aggregates. Discuss leakage risks.
  4. Model choice. Choose a baseline and a production candidate: compare logistic regression vs gradient-boosted trees (e.g., LightGBM) in this setting and explain the tradeoffs.
  5. Loss function. What loss would you train on and why? Explain why you would use log loss / binary cross-entropy, why MSE is not appropriate, and why AUC is not used as a training loss. What does “predicting conversion over impression” mean for supervision/labeling, and how do loss functions relate to bidding decisions?
  6. Class imbalance. Conversions are rare. Describe at least two ways to handle imbalance (e.g., class weighting, negative downsampling), when to use each, what preprocessing to avoid, and how these choices affect probability calibration (including how to correct for downsampling).
  7. Evaluation. Define offline metrics and a validation scheme for CVR in a non-stationary ad-tech environment (PR-AUC, ROC-AUC, log loss, calibration). Explain why PR-AUC can be more informative than ROC-AUC and why calibration matters. How would you evaluate the model online, and what business metrics matter (e.g., CPA, ROAS, spend efficiency)?
  8. Precision/recall tradeoff in RTB. How do false positives vs false negatives differ in cost? What is the F1 score, and why might it be a poor objective for ad-tech bidding? How would you use a PR curve to select an operating point?
  9. Scalability & production. Discuss training vs inference scalability for LightGBM, the latency bottlenecks in the feature/model pipeline, and how you would deploy safely (shadow mode, ramp-up, rollback).
  10. Overfitting & robustness. Why is overfitting common in CVR prediction, and how do you prevent it (regularization, early stopping, time-based validation, feature aggregation)? What monitoring and guardrails would you add for a live bidding system?
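
Item 6 asks how to correct for negative downsampling. One commonly used correction can be sketched as follows, assuming positives are all kept and negatives are kept uniformly at random with probability `keep_rate`. The function name and the example numbers are hypothetical and chosen only for illustration.

```python
def correct_for_negative_downsampling(p_sampled, keep_rate):
    """Map a probability learned on downsampled data back to the original scale.

    p_sampled: model output when trained on data where negatives were subsampled
    keep_rate: fraction of negatives kept (0 < keep_rate <= 1); positives all kept
    """
    if not 0.0 < keep_rate <= 1.0:
        raise ValueError("keep_rate must be in (0, 1]")
    # Downsampling multiplies the odds by 1 / keep_rate; this inverts that shift.
    return p_sampled / (p_sampled + (1.0 - p_sampled) / keep_rate)


# Illustration: a true 1% conversion rate with 10% of negatives kept
# appears as roughly 9.2% in the sampled training data.
true_rate = 0.01
keep_rate = 0.10
sampled_rate = true_rate / (true_rate + (1.0 - true_rate) * keep_rate)
recovered_rate = correct_for_negative_downsampling(sampled_rate, keep_rate)
```

The same odds shift can instead be applied as an intercept adjustment in a logistic model. Either way, calibration should still be checked on unsampled, time-held-out data.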

Provide a structured, end-to-end answer with explicit assumptions and tradeoffs.
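
As one way to ground the loss and evaluation items, the sketch below shows that ranking metrics such as ROC-AUC cannot detect miscalibration, while log loss can. The tiny dataset and the 3x inflation factor are made up for illustration, and the helper implementations are plain-Python stand-ins rather than any particular library's API.

```python
import math


def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy, with predictions clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def roc_auc(y_true, y_pred):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_pred) if y == 1]
    neg = [p for y, p in zip(y_true, y_pred) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels and scores; `inflated` keeps the same ordering but overstates CVR.
y = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
baseline = [0.05, 0.05, 0.1, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.6]
inflated = [min(0.99, 3 * p) for p in baseline]

auc_baseline, auc_inflated = roc_auc(y, baseline), roc_auc(y, inflated)
ll_baseline, ll_inflated = log_loss(y, baseline), log_loss(y, inflated)
```

Here the two score sets have identical AUC, but the inflated set has a higher log loss. If bids scale with predicted CVR (an assumption about the bidding rule, not something the question specifies), the inflated model would overbid by a similar factor despite ranking impressions equally well.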
