
Decide launch of downranking suspected bad sellers

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in experimental design, causal inference, metric definition, randomized treatment assignment and interference control, sample-size and ramp planning, and fairness-aware decision frameworks for marketplace ranking interventions.


Company: TikTok

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: Technical Screen

You propose downranking suspected bad sellers in marketplace search results. Should we launch? Design the decision framework and experiment: (a) Define treatment precisely (e.g., push listings from risk-scored sellers down by k ranks or apply a multiplicative penalty to ranking score). (b) Choose a randomization unit that controls interference: compare session-level, query-level, and seller-level cluster randomization; justify one and describe how you would prevent cross-arm contamination in the same search page. (c) Define primary success metrics and guardrails with exact formulas (numerators/denominators and units): e.g., chargeback_rate = chargebacks/orders, complaints_per_1k_orders, bad_seller_impressions_share, GMV, add-to-cart rate, search CTR, price index, selection coverage. (d) Propose a ramp plan (1%→5%→10%→50%) with stop/go criteria and a pre-specified analysis window; include a minimum detectable effect and sample size plan for rare-event metrics. (e) Handle model uncertainty: how do offline precision/recall and false positives affect the expected treatment effect, and how would you stratify or bandit the penalty by risk score? (f) What heterogeneity and unintended effects would you check (e.g., new-seller cold start, category-level impact, geographic fairness), and how would you mitigate them?


Related Interview Questions

  • Define Ultra success metrics and detect suspicious transactions - TikTok (easy)
  • Plan DS approach for biker delivery project - TikTok (easy)
  • Define and critique a user activity metric - TikTok (easy)
  • Design and decompose Trust & Safety risk metrics - TikTok (easy)
  • Analyze promo anomaly and design risk guardrails - TikTok (medium)

Experiment Design: Downranking Suspected Bad Sellers in Search

Context

  • You are designing a decision framework and online experiment to test penalizing sellers suspected of bad behavior (e.g., fraud, policy violations, poor quality) in marketplace search results. A risk model scores sellers; the intervention alters ranking for items from higher-risk sellers.
  • Goal: Reduce harmful outcomes without materially hurting buyer experience, marketplace liquidity, pricing, or fairness.

Tasks

(a) Precisely define the treatment

  • Specify exactly how the ranking will be modified for risk-scored sellers (e.g., push down by k ranks or apply a multiplicative penalty to the ranking score); both variants are sketched below.
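
A minimal sketch of the two candidate treatment variants, assuming each listing carries a base relevance score and its seller's model risk score in [0, 1]. The function names, alpha, k, and the threshold are illustrative assumptions, not a production ranking API:

```python
def apply_multiplicative_penalty(score: float, risk: float,
                                 alpha: float = 0.5,
                                 threshold: float = 0.8) -> float:
    """Variant A: shrink the ranking score of a listing whose seller's
    risk score crosses the threshold; alpha < 1 sets penalty strength."""
    return score * alpha if risk >= threshold else score

def apply_rank_demotion(ranked_ids: list, is_flagged, k: int = 5) -> list:
    """Variant B: push each listing from a flagged seller down k
    positions, preserving relative order otherwise."""
    out = list(ranked_ids)
    # Walk from the bottom up so earlier moves don't shift unvisited items.
    for i in range(len(out) - 1, -1, -1):
        if is_flagged(out[i]):
            item = out.pop(i)
            out.insert(min(i + k, len(out)), item)
    return out
```

Variant A composes cleanly with the existing scorer and allows a graded penalty; variant B is easier to reason about positionally but behaves discontinuously at page boundaries.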

(b) Choose a randomization unit that controls interference

  • Compare session-level, query-level, and seller-level cluster randomization.
  • Justify a choice and describe how to prevent cross-arm contamination within the same search page; a deterministic assignment sketch follows below.
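
Whichever unit is chosen, assignment should be deterministic so a unit always lands in the same arm. A minimal sketch (the salt and treatment share are illustrative assumptions): hashing the unit id with an experiment salt yields stable, roughly uniform buckets. With session-level units, every query in a session is ranked under one policy, so no single results page ever mixes penalized and unpenalized ranking logic, which is one standard way to rule out within-page contamination.

```python
import hashlib

def assign_arm(unit_id: str, salt: str = "downrank_exp_v1",
               treatment_share: float = 0.05) -> str:
    """Deterministically map a randomization unit (session id, query id,
    or seller cluster id, per the design above) to an experiment arm."""
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # ~uniform on [0, 1]
    return "treatment" if bucket < treatment_share else "control"
```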

(c) Define primary success metrics and guardrails with exact formulas

  • Include numerators/denominators/units for: chargeback_rate, complaints_per_1k_orders, bad_seller_impressions_share, GMV, add-to-cart rate, search CTR, price index, selection coverage, latency, etc.; several are written out as code below.
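
Illustrative definitions as code over hypothetical per-arm orders and impressions logs (the column names are assumptions); each metric's numerator, denominator, and units are noted inline:

```python
import pandas as pd

def marketplace_metrics(orders: pd.DataFrame, imps: pd.DataFrame) -> dict:
    """Compute core success and guardrail metrics for one experiment arm."""
    return {
        # chargebacks / orders (dimensionless rate)
        "chargeback_rate": orders["is_chargeback"].mean(),
        # complaints / orders, scaled to per 1,000 orders
        "complaints_per_1k_orders": 1_000 * orders["n_complaints"].sum() / len(orders),
        # impressions from risk-flagged sellers / all search impressions
        "bad_seller_impressions_share": imps["seller_flagged"].mean(),
        # sum of order values (currency units); analyze per user in practice
        "gmv": orders["order_value"].sum(),
        # clicked impressions / all impressions (dimensionless)
        "search_ctr": imps["clicked"].mean(),
        # add-to-cart rate, price index, and selection coverage would need
        # views, price, and catalog logs respectively; omitted in this sketch
    }
```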

(d) Propose a ramp plan with stop/go criteria and a pre-specified analysis window

  • Example: 1% → 5% → 10% → 50%.
  • Include minimum detectable effect (MDE) assumptions and a sample size plan, especially for rare-event metrics; a back-of-envelope calculation is sketched below.
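
A per-arm sample size estimate for a rare-event rate metric, using the standard two-proportion z-test approximation (the baseline rate and MDE below are purely illustrative):

```python
from scipy.stats import norm

def n_per_arm(p_base: float, mde_rel: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate observations per arm to detect a relative reduction
    of mde_rel in a baseline rate p_base with a two-sided test."""
    p_alt = p_base * (1 - mde_rel)          # rate under the alternative
    p_bar = (p_base + p_alt) / 2            # pooled rate
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return int(2 * p_bar * (1 - p_bar) * (z / (p_base - p_alt)) ** 2) + 1

# e.g., a 0.4% baseline chargeback_rate with a 10% relative MDE:
# n_per_arm(0.004, 0.10)  ->  roughly 370,000 orders per arm,
# which is what ultimately gates how long the 50% stage must run.
```

Rare-event metrics like chargebacks dominate the duration plan, so it is common to gate early ramp stages on the more sensitive experience guardrails (CTR, add-to-cart) and reserve the harm metrics for the final stage.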

(e) Handle model uncertainty

  • Explain how offline precision/recall and false positives affect expected treatment effect.
  • Propose stratifying the penalty by risk score, or tuning it with a bandit; a simple dilution model is sketched below.
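
A one-line dilution model makes the dependence explicit (all numbers illustrative): if only a fraction equal to the model's precision of flagged sellers are truly bad, the measurable harm reduction shrinks proportionally, while the ranking cost is paid on every flagged seller, true or false positive alike.

```python
def diluted_effects(precision: float, harm_drop_true_bad: float,
                    gmv_cost_per_flagged: float, flagged_share: float):
    """Expected aggregate effects of a flat penalty under imperfect precision."""
    # Harm reduction only materializes on true positives.
    harm_effect = precision * harm_drop_true_bad
    # GMV / experience cost hits all flagged inventory, false positives included.
    gmv_effect = flagged_share * gmv_cost_per_flagged
    return harm_effect, gmv_effect

# e.g., 60% precision with a 50% harm reduction on true bads yields only a
# 30% average harm reduction on flagged inventory. This asymmetry is the
# case for a risk-score-stratified (or bandit-tuned) penalty: a harsher
# alpha in high-risk tiers, a milder one near the decision threshold.
```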

(f) Heterogeneity and unintended effects

  • Identify heterogeneity to check (e.g., new-seller cold start, category/geography fairness) and describe how to mitigate any issues found; a segment readout is sketched below.
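
A simple pre-registered segment readout (column names hypothetical) surfaces where the effect concentrates:

```python
import pandas as pd

def segment_deltas(df: pd.DataFrame, metric: str, segment: str) -> pd.DataFrame:
    """Treatment-vs-control delta of `metric` within each level of
    `segment` (e.g., seller tenure bucket, category, country)."""
    by_arm = df.groupby([segment, "arm"])[metric].mean().unstack("arm")
    by_arm["abs_delta"] = by_arm["treatment"] - by_arm["control"]
    by_arm["rel_delta"] = by_arm["abs_delta"] / by_arm["control"]
    return by_arm.sort_values("rel_delta")

# e.g., segment_deltas(orders, "orders_per_buyer", "seller_tenure_bucket")
# showing an outsized drop for new sellers would suggest mitigations such
# as exempting low-history sellers from hard penalties, or capping alpha
# until the risk model has accumulated enough signal on them.
```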

