Model comment counts and detect anomalies

This question tests statistical modeling of heavy-tailed count data, model selection and comparison, and the design of robust monitoring metrics for anomaly detection. It covers both theoretical understanding and applied data-science skills.

Question

Modeling Heavy-Tailed Comment Counts and Robust Monitoring

You are analyzing daily comment counts at the post–day level. The distribution is heavy-tailed. From a recent period you observe:

Sample mean = 4
Sample variance = 50

Tasks:

(a) Test whether a Poisson model is appropriate. If not, propose an alternative (negative binomial or discrete lognormal/Poisson–lognormal) and outline how to estimate parameters.

(b) Compare model fits using likelihood-based tests (likelihood ratio for nested models; Vuong test for non-nested models). Explain which tail behavior each model captures.

(c) Define robust monitoring metrics (e.g., trimmed mean, P50/median, Gini) and specify control limits suitable for detecting manipulation (e.g., purchased comments) under heavy tails.

(d) Suppose the daily 99th percentile (P99) suddenly increases 3× while the median (P50) remains stable. Propose a practical rule to trigger investigation while minimizing false alarms.
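
As a starting point for tasks (a) and (d), the sketch below shows one possible way to turn the observed moments into negative binomial parameters and to encode a tail-shift rule. It is illustrative rather than a reference solution. It assumes the NB2 parameterization (variance = mean + mean²/r). The function names, the 28-day baseline, the 25% median tolerance, and the 2-day persistence requirement are hypothetical choices, not part of the question. Only the 3× ratio comes from the prompt.

```python
from statistics import median


def nb_moment_estimates(mean, var):
    """Method-of-moments estimates for a negative binomial (NB2) model.

    Assumes Var = mean + mean**2 / r, which is only defined when var > mean.
    Returns (r, p): the size parameter and the success probability.
    """
    if var <= mean:
        raise ValueError("no overdispersion: NB moment estimates are undefined")
    r = mean ** 2 / (var - mean)  # size (dispersion) parameter
    p = mean / var                # algebraically equal to r / (r + mean)
    return r, p


def tail_shift_alarm(p99, p50, baseline_days=28, ratio=3.0,
                     median_tol=0.25, persist=2):
    """Flag a tail-only shift at the end of two daily quantile series.

    Returns True only if each of the last `persist` days has
    P99 >= ratio * baseline P99 while P50 stays within `median_tol`
    (relative) of baseline P50. Baselines are medians over the
    `baseline_days` days that precede the persistence window.
    """
    if len(p99) != len(p50) or len(p99) < baseline_days + persist:
        raise ValueError("need aligned series covering baseline + persistence window")
    start = -(baseline_days + persist)
    base99 = median(p99[start:-persist])
    base50 = median(p50[start:-persist])
    for q99, q50 in zip(p99[-persist:], p50[-persist:]):
        tail_jump = q99 >= ratio * base99
        # max(..., 1.0) keeps the tolerance usable when the baseline median is 0
        median_stable = abs(q50 - base50) <= median_tol * max(base50, 1.0)
        if not (tail_jump and median_stable):
            return False
    return True


# Observed moments from the question: mean 4, variance 50.
print(50 / 4)                        # dispersion index 12.5; Poisson implies 1
print(nb_moment_estimates(4.0, 50.0))  # r is about 0.348, p is 0.08
```

The moment estimates are only a starting point; maximum likelihood would normally refine them. The alarm requires the tail jump to persist and the median to stay put, so a single noisy day or a site-wide traffic change (which would also move P50) does not trigger an investigation on its own.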
