Decide whether to launch downranking of suspected bad sellers
Company: TikTok
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
You propose downranking suspected bad sellers in marketplace search results. Should we launch? Design the decision framework and experiment:
(a) Define the treatment precisely (e.g., push listings from risk-scored sellers down by k ranks, or apply a multiplicative penalty to the ranking score).
(b) Choose a randomization unit that controls interference: compare session-level, query-level, and seller-level cluster randomization; justify one, and describe how you would prevent cross-arm contamination on the same search page.
(c) Define primary success metrics and guardrails with exact formulas (numerators/denominators and units), e.g., chargeback_rate = chargebacks / orders, complaint_per_1k_orders, bad_seller_impressions_share, GMV, add-to-cart rate, search CTR, price index, and selection coverage.
(d) Propose a ramp plan (1% → 5% → 10% → 50%) with stop/go criteria and a pre-specified analysis window; include a minimal detectable effect and a sample-size plan for rare-event metrics.
(e) Handle model uncertainty: how do offline precision/recall and false positives affect the expected treatment effect, and how would you stratify the penalty by risk score or tune it with a bandit?
(f) What heterogeneity and unintended effects would you check (e.g., new-seller cold start, category-level impact, geographic fairness), and how would you mitigate them?
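For the rare-event sample-size plan in part (d), a minimal sketch using the normal approximation for a two-sided two-proportion z-test; the baseline chargeback rate and the 10% relative MDE below are illustrative assumptions, not figures from the question:

```python
import math

def sample_size_two_proportions(p0, mde_rel, alpha=0.05, power=0.8):
    """Per-arm sample size (in orders) to detect a relative reduction
    mde_rel in a rare-event rate p0, via the normal approximation for
    a two-sided two-proportion z-test."""
    p1 = p0 * (1 - mde_rel)          # treatment rate after the hoped-for reduction
    z_alpha = 1.959963984540054      # z for two-sided alpha = 0.05
    z_beta = 0.8416212335729143      # z for power = 0.80
    p_bar = (p0 + p1) / 2
    num = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return math.ceil(num / (p0 - p1) ** 2)

# Assumed baseline chargeback_rate = 0.5%; detect a 10% relative drop.
n_per_arm = sample_size_two_proportions(0.005, 0.10)
```

With these assumed inputs the plan needs roughly 300k orders per arm, which is why the later ramp stages (10%, 50%) carry most of the statistical power while the 1% stage mainly screens for guardrail breaches.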
Quick Answer: Launch only if a seller-level cluster-randomized experiment shows a significant drop in harm metrics (chargeback_rate = chargebacks / orders, complaint_per_1k_orders, bad_seller_impressions_share) without breaching guardrails on GMV, add-to-cart rate, search CTR, price index, and selection coverage. Randomize at the seller level so every listing from a seller sits in one arm, which prevents mixed treatment on a single search page; ramp 1% → 5% → 10% → 50% with pre-specified stop/go thresholds and analysis windows, powered for rare-event metrics; calibrate the penalty to the risk model's offline precision/recall so false positives on low-risk sellers stay bounded; and check heterogeneity (new-seller cold start, category-level impact, geographic fairness) before full rollout.
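The seller-level cluster assignment from part (b) can be sketched with a deterministic hash, so every listing from a seller always lands in the same arm regardless of session or query; the salt string and 5% treatment share are hypothetical values for illustration:

```python
import hashlib

def assign_arm(seller_id: str,
               experiment_salt: str = "bad-seller-downrank-v1",
               treatment_share: float = 0.05) -> str:
    """Deterministic seller-level cluster assignment. Hashing the
    (salt, seller_id) pair maps each seller to a stable point in [0, 1];
    sellers below treatment_share are treated. Because assignment depends
    only on the seller, a single search page never mixes penalized and
    unpenalized ranking logic for the same seller across arms."""
    digest = hashlib.sha256(f"{experiment_salt}:{seller_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # uniform in [0, 1]
    return "treatment" if bucket < treatment_share else "control"
```

Changing the salt reshuffles buckets for a new experiment without correlating with past assignments, and raising `treatment_share` implements the ramp (1% → 5% → 10% → 50%) while keeping earlier treatment sellers treated.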