Characterize and compare transfer-count distributions over time
Company: Meta
Role: Data Scientist
Category: Statistics & Math
Difficulty: Medium
Interview Round: Onsite
For a new cohort, let X be each user's number of P2P transfers in the first 30 days post-signup.
1) Argue for a plausible distributional family for X (e.g., zero-inflated negative binomial or lognormal mixture) and justify the expected zero-mass and heavy-tail behavior.
2) Sketch the likely ordering and approximate locations of the mode, median, mean, and 95th percentile of X, explaining why they differ.
3) Describe how this distribution should evolve by day 60 (selection on retention, habit formation, fraud suppression, seasonality), and predict the directional shifts of those four summaries.
4) Recommend two robust executive-level summaries resilient to heavy tails (e.g., trimmed mean, median-of-ratios) and one diagnostic (e.g., QQ plot or tail index) to validate assumptions.
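As a rough illustration of parts 1 and 2, the sketch below simulates a zero-inflated negative binomial using a gamma-Poisson mixture and reports the four summaries. All parameter values (pi_zero, r, mean_active) and the function names are hypothetical choices made for illustration; they are not estimates from any real cohort.

```python
import random
import statistics

def sample_poisson(rng, lam):
    """Draw a Poisson(lam) count by counting unit-rate exponential arrivals in [0, lam]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count

def sample_zinb(rng, n, pi_zero, r, mean_active):
    """Zero-inflated negative binomial: structural zero with prob pi_zero,
    otherwise Poisson with a Gamma(shape=r, scale=mean_active/r) rate."""
    out = []
    for _ in range(n):
        if rng.random() < pi_zero:
            out.append(0)  # user never transacts (structural zero)
        else:
            lam = rng.gammavariate(r, mean_active / r)  # per-user intensity
            out.append(sample_poisson(rng, lam))
    return out

def summarize(x):
    """Return mode, median, mean, 95th percentile, and share of zeros."""
    xs = sorted(x)
    n = len(xs)
    return {
        "mode": statistics.mode(xs),
        "median": statistics.median(xs),
        "mean": statistics.fmean(xs),
        "p95": xs[min(n - 1, int(0.95 * n))],
        "share_zero": xs.count(0) / n,
    }

# Hypothetical parameters: half the cohort never transacts; active users are
# overdispersed (small r) with an average of 6 transfers in 30 days.
rng = random.Random(0)
x = sample_zinb(rng, 20000, pi_zero=0.5, r=0.5, mean_active=6.0)
summary = summarize(x)
print(summary)
```

Under these assumed parameters, the zero mass pins the mode (and here the median) at 0, the heavy tail pulls the mean above the median, and the 95th percentile sits well above the mean, which is the ordering part 2 asks a candidate to explain.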
Quick Answer: This question tests whether a candidate can model zero-inflated, heavy-tailed count data, interpret distributional summaries and percentiles, reason about how retention and fraud dynamics shift the distribution over time, and choose robust summary statistics and diagnostics.
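For the robust summaries and tail diagnostic mentioned above, a minimal sketch follows. It implements a symmetric trimmed mean and the Hill estimator of the tail index, then sanity-checks both on synthetic Pareto draws with a known index. The helper names, the 5% trim level, and the choice of k are assumptions for illustration; on real transfer counts, ties and discreteness would bias the Hill estimate, so it should be read alongside a QQ plot rather than on its own.

```python
import math
import random
import statistics

def trimmed_mean(x, trim=0.05):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    xs = sorted(x)
    k = int(len(xs) * trim)
    core = xs[k:len(xs) - k] if k > 0 else xs
    return statistics.fmean(core)

def hill_tail_index(x, k):
    """Hill estimator of the tail index alpha using the k largest positive values.
    Smaller alpha means a heavier tail; alpha <= 2 implies infinite variance."""
    xs = sorted((v for v in x if v > 0), reverse=True)
    if k < 1 or k >= len(xs):
        raise ValueError("k must satisfy 1 <= k < number of positive values")
    threshold = xs[k]  # the (k+1)-th largest value
    mean_log_excess = sum(math.log(v / threshold) for v in xs[:k]) / k
    if mean_log_excess == 0:
        raise ValueError("top-k values are all tied; tail index is undefined")
    return 1.0 / mean_log_excess

# Sanity check on synthetic data with a known tail index (alpha = 2).
rng = random.Random(1)
sample = [rng.paretovariate(2.0) for _ in range(50000)]
raw_mean = statistics.fmean(sample)
robust_mean = trimmed_mean(sample, trim=0.05)
alpha_hat = hill_tail_index(sample, k=2000)
print(raw_mean, robust_mean, alpha_hat)
```

The trimmed mean lands below the raw mean because trimming removes the extreme right tail that dominates the average, and the Hill estimate should come out near the true index of 2 on this synthetic sample.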