TikTok Data Scientist Machine Learning Interview Questions

Master your tech interview with our curated database of real questions from top companies.

14 Questions 1 Company

TikTok

Medium

Data Scientist

Compare Random Forests and Boosted Trees: Bias, Variance, Speed

Scenario: A product/data science team is deciding between Random Forests and Gradient-Boosted Decision Trees (e.g., XGBoost) for a new predictive task....

Predict Customer Churn with Machine Learning Workflow

Predicting Monthly Churn: End-to-End Workflow. Scenario: A subscription platform wants to predict whether a customer will churn in the next month. Assum...

Design Real-Time Credit Card Fraud Detection System

Real-Time Credit-Card Fraud Detection System Design. Scenario: You are designing a real-time fraud detection system for an online payments platform that...

Design an ad-selection system across objectives

End-to-End Ad-Selection System Design. Context: You must choose, at impression time, which advertiser type to show to a user. There are three advertiser...

Detect and suppress bad sellers robustly

System Design: Identify and Suppress Bad Sellers in a Commerce Marketplace. Context: You are designing an ML-driven risk system for a large-scale market...

Explain and tune XGBoost; prevent overfitting

XGBoost Tree Booster: Objective, Hyperparameters, Tuning for Imbalanced Detection, and Post-training Use. Context: You are building a binary classifier...

Explain SHAP vs VIF under collinearity

High Collinearity in Binary Classification: VIF, SHAP, and Interpretation Strategy. You are modeling a binary outcome Y. Two numeric features A and B a...

Choose linear regression or decision tree appropriately

Choose Between Linear Regression and a Decision Tree Under a Hinge and Interaction DGP. Context: You have 100,000 i.i.d. observations with features x1 (...

Contrast LSTM and Transformer for long sequences

Train a Long-Context Autoregressive LM (T = 8192, H = 512, B = 8). You are training an autoregressive language model with: - Sequence length T = 8192 t...

Compare bagging vs boosting on imbalanced data

Fraud Detection on 10M Time-Ordered Transactions (0.5% Fraud). You are building a binary classifier to detect 0.5% fraudulent events among 10,000,000 t...

Estimate heterogeneous treatment effects with causal ML

Context: You are given large-scale, logged observational data from an always-on promotion. Each record contains features X (user/context), a binary tre...

Predict User Churn with Effective Modeling Techniques

Predicting User Churn for a Subscription App. Context: You are building a model to predict which active subscribers are likely to churn soon so the team...

Personalize Ad Delivery Using Machine Learning Techniques

Personalized Delivery of Three Ad Categories. Scenario: You operate a consumer feed with a single ad opportunity per request and three possible ad categ...

Choose Between Random Forests and Gradient Boosting Models

Scenario: Product-facing data science interview on choosing and configuring tree-based ensemble models for tabular prediction in a production setting. ...

Machine Learning

Aug 4, 2025