PracHub

Implement AUC-ROC, softmax, and logistic regression

Last updated: Mar 29, 2026

Quick Overview

This question evaluates practical implementation skills for core machine learning components—computing AUC-ROC (including ROC point generation and handling ties or uniform labels), implementing a numerically stable softmax, and training logistic regression with cross-entropy loss and optional L2 regularization.

  • medium
  • TikTok
  • Machine Learning
  • Software Engineer

Company: TikTok

Role: Software Engineer

Category: Machine Learning

Difficulty: medium

Interview Round: Onsite

Jan 22, 2026, 12:00 AM
You are asked to implement a few core ML building blocks from scratch (no ML libraries such as scikit-learn). You may use basic numeric operations and standard data structures.

Part A — AUC-ROC

Given:

  • y_true: length n list/array of binary labels in {0,1}
  • y_score: length n list/array of real-valued prediction scores (higher means more likely positive)

Task:

  1. Compute the ROC curve points (TPR vs FPR) as the threshold varies.
  2. Compute AUC (area under the ROC curve).
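One possible way to approach both tasks is sketched below in plain Python. The function names `roc_curve_points` and `auc_trapezoid` are illustrative assumptions, not part of the question. The sketch sorts by score in descending order, emits one (FPR, TPR) point per distinct score, and integrates with the trapezoidal rule. It assumes both classes are present and raises an error otherwise.

```python
def roc_curve_points(y_true, y_score):
    """Return (fpr, tpr) points from (0, 0) to (1, 1).

    Assumption: both classes are present; otherwise a ValueError is raised.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC is undefined when only one class is present")

    # Lowering the threshold admits examples in descending score order.
    pairs = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        # Consume every example sharing this score before emitting a point,
        # so tied scores yield one diagonal segment rather than a staircase.
        score = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given (x, y) points ordered by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

For example, `auc_trapezoid(roc_curve_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))` evaluates to 0.75. The sort dominates the cost, so the whole computation is O(n log n).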

Clarifications:

  • Handle ties in y_score correctly.
  • Define what you return when all labels are the same (all 0s or all 1s).
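Both clarifications can also be addressed with a rank-based formulation, sketched below as a cross-check on a curve-based AUC. AUC equals the probability that a random positive outscores a random negative, with ties counted as one half, and assigning average ranks to tied scores achieves exactly that. The name `auc_rank` and the choice to return NaN for single-class input are assumptions made for this sketch; raising an error is an equally defensible convention, as long as it is stated.

```python
import math


def auc_rank(y_true, y_score):
    """AUC via the rank-sum (Mann-Whitney U) identity, with average ranks for ties."""
    n = len(y_true)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        # Assumed convention: AUC is undefined with one class, so return NaN.
        return math.nan

    order = sorted(range(n), key=lambda i: y_score[i])  # ascending by score
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of sorted positions i..j that share one score.
        j = i
        while j + 1 < n and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

With `y_true = [1, 0]` and `y_score = [0.5, 0.5]` the two examples share rank 1.5, so the function returns 0.5, which matches the diagonal ROC segment a fully tied pair should produce.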

Part B — Softmax

Given a vector of logits z = [z1, z2, ..., zk], implement softmax:

\[ \mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}} \]

Task:

  • Return a probability vector of length k.
  • Make your implementation numerically stable.
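A common way to get numerical stability is to subtract the maximum logit before exponentiating. Softmax is unchanged when the same constant is subtracted from every logit, because the factor cancels between numerator and denominator, and after the shift every exponent is at most 0, so `exp` cannot overflow. The sketch below uses only the standard library; returning an empty list for empty input is an assumption of this example.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    if not z:
        return []  # assumed convention for empty input
    m = max(z)
    # Every exponent is <= 0 after the shift, so exp() stays in (0, 1].
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)  # >= 1, because the max logit contributes exp(0) = 1
    return [e / total for e in exps]
```

A naive `math.exp(1000.0)` raises `OverflowError`, while `softmax([1000.0, 1000.0])` returns `[0.5, 0.5]`.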

Part C — Logistic Regression

Given:

  • Feature matrix X of shape (n, d)
  • Binary labels y of shape (n,) in {0,1}

Task:

  1. Implement logistic regression training using gradient descent (batch or mini-batch).
  2. Specify the loss you optimize (cross-entropy / negative log-likelihood).
  3. Optionally include L2 regularization and explain how it changes the gradient.
  4. Return learned parameters and a predict_proba / predict function.
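The four tasks above could be sketched as follows, using batch gradient descent in plain Python. The function names, the default hyperparameters, and the decision to leave the bias unregularized are assumptions of this example. The loss is the mean cross-entropy plus (l2 / 2) * ||w||^2, so the L2 term contributes an extra l2 * w to the weight gradient.

```python
import math


def sigmoid(t):
    # Branch on sign so math.exp never receives a large positive argument.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_logreg(X, y, lr=0.1, epochs=1000, l2=0.0):
    """Batch gradient descent on mean cross-entropy + (l2 / 2) * ||w||^2.

    Returns (w, b). The bias b is not regularized (an assumed choice).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the per-example loss w.r.t. the logit
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The L2 penalty adds l2 * w to the averaged data gradient.
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def predict(X, w, b, threshold=0.5):
    return [1 if p >= threshold else 0 for p in predict_proba(X, w, b)]
```

Each epoch touches every feature of every example once, so a full-batch epoch costs O(n * d) time with O(d) extra memory for the gradient. A mini-batch variant would apply the same update to a subset of rows per step.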

Constraints/expectations:

  • Discuss time complexity per training epoch.
  • Mention common pitfalls (learning rate, feature scaling, overflow, class imbalance).
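Several of the pitfalls listed above have small, self-contained mitigations, sketched here under stated assumptions. The helper names are hypothetical. `standardize` addresses feature scaling, `bce_from_logit` computes the cross-entropy directly from the logit to avoid overflow and log(0), and `balanced_class_weights` is one possible reweighting for class imbalance (it assumes labels are 0/1 and both classes are present).

```python
import math


def standardize(X):
    """Return (X_scaled, means, stds) with zero-mean, unit-variance columns."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = []
    for j in range(d):
        var = sum((row[j] - means[j]) ** 2 for row in X) / n
        stds.append(math.sqrt(var) or 1.0)  # constant column: avoid dividing by zero
    X_scaled = [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]
    return X_scaled, means, stds


def bce_from_logit(t, y):
    """Cross-entropy for one example from the raw logit t, without overflow.

    Equals -[y * log(p) + (1 - y) * log(1 - p)] with p = sigmoid(t).
    """
    return max(t, 0.0) - t * y + math.log1p(math.exp(-abs(t)))


def balanced_class_weights(y):
    """Weight each class by n / (2 * count) so both classes contribute equally."""
    n = len(y)
    n_pos = sum(y)
    n_neg = n - n_pos
    return {0: n / (2.0 * n_neg), 1: n / (2.0 * n_pos)}
```

The means and standard deviations should be computed on the training set and reused at prediction time. Per-class weights, if used, would multiply each example's loss and gradient contribution during training.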