PracHub

Optimize threshold using confusion matrix and costs

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of classification metrics, calibration, threshold selection, and cost-sensitive decision theory in imbalanced binary classification, involving precision/recall/F1 computation, expected-cost comparison from confusion matrices, and derivation of a cost-optimal probability threshold.



Company: TikTok

Role: Data Scientist

Category: Statistics & Math

Difficulty: medium

Interview Round: Technical Screen

A calibrated classifier is evaluated on a dataset with a 1% positive class. For 10,000 held-out examples you observe: at threshold 0.50 → TP=60, FP=40, FN=40, TN=9,860; at threshold 0.20 → TP=85, FP=300, FN=15, TN=9,600. (1) Compute Precision, Recall, and F1 at both thresholds. (2) With a cost matrix where FP costs 1 and FN costs 20 (TP and TN have zero cost), compute the expected cost at the two thresholds and choose the cheaper threshold; show your math. (3) Explain why ROC-AUC can be misleading here and why PR-AUC is more appropriate; give a brief numeric intuition using the counts above. (4) If you were to set the threshold using cost-sensitive decision theory on a perfectly calibrated model, derive the optimal probability threshold t* in terms of the FP and FN costs and the class prior.


Oct 13, 2025, 9:49 PM

Calibrated Classifier on an Imbalanced Dataset (1% positives)

You have a perfectly calibrated binary classifier evaluated on 10,000 held-out examples. The positive-class prevalence (base rate) is 1%, i.e., 100 positives and 9,900 negatives.

You observe the following confusion matrices at two probability thresholds:

  • Threshold = 0.50 → TP = 60, FP = 40, FN = 40, TN = 9,860
  • Threshold = 0.20 → TP = 85, FP = 300, FN = 15, TN = 9,600
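
As a quick sanity check on the counts above, the sketch below confirms that each confusion matrix sums to 10,000 examples and implies 100 positives (1% prevalence). The names `counts` and `summarize` are illustrative and not part of the original question.

```python
# Confusion-matrix counts copied from the two thresholds listed above.
counts = {
    0.50: {"TP": 60, "FP": 40, "FN": 40, "TN": 9860},
    0.20: {"TP": 85, "FP": 300, "FN": 15, "TN": 9600},
}


def summarize(c):
    """Return (total examples, actual positives, prevalence) for one matrix."""
    total = c["TP"] + c["FP"] + c["FN"] + c["TN"]
    positives = c["TP"] + c["FN"]  # actual positives = caught + missed
    return total, positives, positives / total


for threshold, c in counts.items():
    total, positives, prevalence = summarize(c)
    print(f"threshold={threshold:.2f}: n={total}, "
          f"positives={positives}, prevalence={prevalence:.2%}")
```

Both thresholds report the same totals, so the two matrices describe the same held-out set and differ only in where the cut is placed.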

Tasks:

  1. Compute Precision, Recall, and F1-score at both thresholds.
  2. With a cost matrix where FP costs 1 and FN costs 20 (TP and TN cost 0), compute the expected total cost at both thresholds and choose the cheaper threshold. Show your math.
  3. Explain why ROC-AUC can be misleading in this setting and why PR-AUC is more appropriate. Provide a brief numeric intuition using the counts above.
  4. For a perfectly calibrated model, derive the optimal probability threshold t* using cost-sensitive decision theory in terms of FP and FN costs and the class prior. State any simplifying assumptions you make.
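
Tasks 1 to 3 reduce to arithmetic on the two confusion matrices. The following is a minimal sketch, assuming the stated costs (FP = 1, FN = 20, TP = TN = 0). `Confusion` and `evaluate` are hypothetical helper names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int
    fp: int
    fn: int
    tn: int


def evaluate(c, cost_fp=1.0, cost_fn=20.0):
    """Compute threshold metrics and total misclassification cost."""
    n = c.tp + c.fp + c.fn + c.tn
    total_cost = cost_fp * c.fp + cost_fn * c.fn  # TP and TN cost nothing
    return {
        "precision": c.tp / (c.tp + c.fp),
        "recall": c.tp / (c.tp + c.fn),
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
        "fpr": c.fp / (c.fp + c.tn),  # x-axis of the ROC curve
        "total_cost": total_cost,
        "cost_per_example": total_cost / n,
    }


matrices = {
    0.50: Confusion(tp=60, fp=40, fn=40, tn=9860),
    0.20: Confusion(tp=85, fp=300, fn=15, tn=9600),
}
results = {t: evaluate(c) for t, c in matrices.items()}
cheaper = min(results, key=lambda t: results[t]["total_cost"])

for t, r in results.items():
    print(t, {k: round(v, 4) for k, v in r.items()})
print("cheaper threshold:", cheaper)
```

At threshold 0.50 this gives precision = recall = F1 = 0.60 and a total cost of 40·1 + 40·20 = 840. At threshold 0.20 it gives precision ≈ 0.221, recall = 0.85, F1 ≈ 0.351 and a total cost of 300·1 + 15·20 = 600, so 0.20 is the cheaper of the two. For the ROC versus PR intuition, the false positive rate moves only from about 0.4% to about 3.0% because the 9,900 negatives dominate the denominator, while precision falls from 60% to roughly 22%. The PR view exposes that change; the ROC view largely hides it.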

Solution
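
For Task 4, one standard way to derive the threshold is sketched below. It assumes that costs are constant per error type, that TP and TN cost zero, and that the score p(x) = P(y = 1 | x) is perfectly calibrated on the deployment distribution. The symbols C_FP, C_FN and π (the class prior) are notation introduced here, not taken from the original page.

```latex
% Expected cost of each action for an example with calibrated score p = P(y=1 \mid x):
\mathbb{E}[\text{cost} \mid \text{predict } 1] = (1 - p)\, C_{FP},
\qquad
\mathbb{E}[\text{cost} \mid \text{predict } 0] = p\, C_{FN}.

% Predict positive when that action is no more expensive:
(1 - p)\, C_{FP} \le p\, C_{FN}
\;\Longleftrightarrow\;
p \ge t^{*} = \frac{C_{FP}}{C_{FP} + C_{FN}}.

% With C_{FP} = 1 and C_{FN} = 20:
t^{*} = \frac{1}{21} \approx 0.048.

% The prior \pi is already absorbed into a calibrated posterior. It appears
% explicitly when the rule is written as a likelihood-ratio test:
\frac{p(x \mid y = 1)}{p(x \mid y = 0)}
\;\ge\;
\frac{C_{FP}\,(1 - \pi)}{C_{FN}\,\pi}
= \frac{1 \cdot 0.99}{20 \cdot 0.01} = 4.95 .
```

Under these assumptions the cost-optimal cut sits near 0.048, below both thresholds in the question, which is consistent with 0.20 being cheaper than 0.50 in Task 2.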

