Estimating Fake News Prevalence and Impact on Facebook
Context
Management is concerned about the volume and impact of fake news on the platform. You are asked to design a statistically sound approach to:
- Estimate the proportion of fake news with confidence intervals.
- Identify and analyze user-interaction metrics that capture the impact of fake news on behavior.
Assume you can sample posts and impressions over a defined time window and obtain ground-truth labels via human review.
Tasks
- Sampling Strategy and Estimation
  - Define the target estimand(s) (e.g., proportion of fake news among posts vs among impressions).
  - Propose a random sampling design, including any stratification you would use. Justify strata and allocation.
  - Describe how you would compute prevalence estimates and confidence intervals, accounting for weights and clustering.
  - Include a sample size calculation and how you would handle low-prevalence scenarios.
- Impact Assessment via User-Interaction Metrics
  - List the key behavioral KPIs (e.g., clicks, shares, dwell time) you would track to assess impact.
  - Outline the statistical approach to compare fake vs non-fake content (significance testing, regression/matching controls, multiple comparisons).
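
As a starting point for the calculations the tasks call for, the following is a minimal Python sketch, not a reference solution. It assumes simple random sampling within each stratum, known stratum population shares, and a normal-approximation interval; the function names, the `design_effect` parameter, and the choice of Benjamini-Hochberg for multiple comparisons are all illustrative assumptions rather than anything the prompt prescribes. At very low prevalence the normal approximation can be poor, so a Wilson or exact interval may be preferable, and clustering (for example, many posts from one source) would need a cluster-robust variance rather than the inflation factor used here.

```python
import math
from statistics import NormalDist


def stratified_prevalence(strata, conf=0.95):
    """Weighted prevalence with a normal-approximation confidence interval.

    strata: list of (weight, n_sampled, n_fake) tuples. Weights are the
    population shares of each stratum and are assumed to sum to 1.
    """
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    p_hat = sum(w * k / n for w, n, k in strata)
    # Variance of a stratified proportion; ignores the finite-population
    # correction and any clustering within strata (assumption).
    var = sum(w**2 * (k / n) * (1 - k / n) / (n - 1) for w, n, k in strata)
    half = z * math.sqrt(var)
    return p_hat, max(0.0, p_hat - half), min(1.0, p_hat + half)


def required_sample_size(p, margin, conf=0.95, design_effect=1.0):
    """Labels needed for a +/- margin interval at an assumed prevalence p.

    design_effect (hypothetical input) inflates the simple-random-sample
    size to allow for clustering or unequal weights.
    """
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return math.ceil(design_effect * z**2 * p * (1 - p) / margin**2)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one reject flag per p-value, controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0  # largest rank whose p-value is under its BH threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        reject[i] = rank <= cutoff
    return reject
```

For example, `stratified_prevalence([(0.7, 500, 5), (0.3, 500, 50)])` combines a low-risk stratum covering 70% of posts with a high-risk stratum covering 30%, and `benjamini_hochberg` can be applied to the p-values from per-KPI comparisons of fake vs non-fake content. The numbers in these calls are made up for illustration.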