This question evaluates proficiency in statistical modeling of discrete count data (overdispersion, zero inflation, heavy tails, model selection, and inference with uncertainty quantification) within the Statistics & Math domain, and requires both conceptual understanding and practical application.

You observe that daily comment counts per post on a large social app are highly skewed, with many zeros.

a) Choose an appropriate discrete model among Poisson, Negative Binomial, Poisson–lognormal, and discrete power law (with xmin), justifying your choice in terms of overdispersion, tail heaviness, and zero inflation.

b) Lay out a step-by-step model selection plan: check the mean–variance relationship; run a dispersion test; fit Poisson and NB via MLE and compare with a likelihood-ratio test; fit the Poisson–lognormal and compare with NB using AIC/BIC and Vuong's test; for the upper tail, estimate xmin by minimizing the Kolmogorov–Smirnov statistic and compare power-law vs. lognormal tails; validate with QQ-plots and posterior predictive checks.

c) Using your chosen model, estimate P(Y ≤ 1) and the 95th percentile of comments per post with bootstrap 95% CIs; explain how you would compute standard errors and guard against small-sample bias.

d) Explain how left-truncation (e.g., posts with 0 comments sometimes missing from the data), right-censoring (e.g., late-arriving comments), and mixture segments (bots vs. humans) would bias estimates, and how you would correct for each (e.g., zero-inflated NB, truncated likelihood).

e) If threaded replies launch next month, predict qualitatively how the parameters would change (e.g., higher variance, heavier tail), and describe how you would revalidate model fit one week post-launch.
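A candidate answer's core computations for parts (b) and (c), i.e. the dispersion check, the Poisson-vs-NB likelihood-ratio test, and the bootstrap intervals for P(Y ≤ 1) and the 95th percentile, could be sketched as below. This is a minimal illustration on synthetic data (NB draws stand in for real per-post comment counts, and the sample size, seed, and starting values are arbitrary assumptions), not a definitive implementation; only NumPy and SciPy are assumed.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)
# Hypothetical data: overdispersed counts with many zeros (NB draws stand
# in for real per-post comment counts).
y = rng.negative_binomial(n=0.5, p=0.2, size=5000)

# --- (b) dispersion check: under a Poisson model, variance ~= mean ---
mean, var = y.mean(), y.var(ddof=1)
dispersion_ratio = var / mean  # >> 1 indicates overdispersion

# --- (b) Poisson vs. NB via MLE and a likelihood-ratio test ---
ll_pois = stats.poisson.logpmf(y, mean).sum()  # Poisson MLE is the sample mean

def nb_negll(params):
    # Negative log-likelihood in scipy's (n, p) parameterization of nbinom.
    r, p = params
    return -stats.nbinom.logpmf(y, r, p).sum()

res = optimize.minimize(nb_negll, x0=[1.0, 0.5],
                        bounds=[(1e-6, None), (1e-6, 1 - 1e-6)])
ll_nb = -res.fun

# Poisson sits on the boundary of the NB family (dispersion -> 0), so the
# naive chi-square(1) p-value is halved (boundary correction).
lrt = 2.0 * (ll_nb - ll_pois)
p_value = 0.5 * stats.chi2.sf(lrt, df=1)

# --- (c) nonparametric bootstrap CIs for P(Y <= 1) and the 95th percentile ---
B = 500
boot = np.empty((B, 2))
for b in range(B):
    yb = rng.choice(y, size=y.size, replace=True)
    boot[b] = [(yb <= 1).mean(), np.quantile(yb, 0.95)]
ci_p_le1 = np.percentile(boot[:, 0], [2.5, 97.5])
ci_q95 = np.percentile(boot[:, 1], [2.5, 97.5])
```

A parametric bootstrap (resampling from the fitted NB instead of the data) is an alternative for part (c) that can reduce variance but inherits any model misspecification; the small-sample bias guard mentioned in the question would typically use bias-corrected (BCa) intervals rather than the plain percentile intervals shown here.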