Context
You have three canonical tables for a social product:
- users(user_id, join_date, country, device, …)
- posts(post_id, author_user_id, created_at, …)
- comments(comment_id, post_id, commenter_user_id, created_at, parent_comment_id [nullable], is_deleted, …)
Assume timestamps are available to compute daily/weekly activity and that a user is “active” on a day if they view or create content (define precisely in your analysis). You want to understand engagement through the lens of comments and evaluate a new comment feature.
Tasks
- Analyze the distribution of user commenting to understand engagement patterns.
- Define core metrics that summarize comment activity and inequality/long-tail effects.
- If a new comment feature is launched, outline the statistical/experimental steps to isolate its causal impact.
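One possible starting point for the three tasks can be sketched in plain Python. This is a minimal sketch, not a prescribed solution. It assumes the comments table has been loaded as a list of dicts keyed by the column names above, and that the experiment randomizes at the user level. The function names (`comments_per_user`, `gini`, `top_share`, `permutation_p_value`), the 10% default for the top share, and the choice of a permutation test are illustrative assumptions, not part of the question.

```python
import random
from collections import Counter


def comments_per_user(user_ids, comments):
    """Count non-deleted comments per user, keeping users with zero comments."""
    counts = Counter(
        c["commenter_user_id"] for c in comments if not c["is_deleted"]
    )
    return [counts.get(u, 0) for u in user_ids]


def gini(values):
    """Gini coefficient: 0 means equal commenting, (n-1)/n means one user writes everything."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Formula on ascending data: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(values, fraction=0.1):
    """Share of all comments written by the top `fraction` of users."""
    xs = sorted(values, reverse=True)
    total = sum(xs)
    if total == 0:
        return 0.0
    k = max(1, int(len(xs) * fraction))
    return sum(xs[:k]) / total


def permutation_p_value(control, treatment, n_iter=2000, seed=0):
    """Two-sided permutation test on the difference in mean comments per user."""
    rng = random.Random(seed)
    observed = sum(treatment) / len(treatment) - sum(control) / len(control)
    pooled = list(control) + list(treatment)
    n_t = len(treatment)
    extreme = 0
    for _ in range(n_iter):
        # Reassign group labels at random and recompute the difference in means.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_t]) / n_t - sum(pooled[n_t:]) / (len(pooled) - n_t)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_iter + 1)
```

Including zero-comment users in `comments_per_user` matters: dropping them understates inequality, because most users on a typical social product never comment. The permutation test is chosen here because per-user comment counts are heavy-tailed. It treats users as independent, which may not hold if the feature affects people who reply to treated users, so cluster-level randomization may be worth considering in a full answer.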