Measuring Whether Friend-Authored Content Is More "Social"
Context
You work on the Info stream of a social app. You want to test the hypothesis that friend-authored content generates more social engagement (likes, comments, reshares) than unconnected content. You are limited to two event tables:
- info_stream_views: per-view exposure logs
  - Example columns: ds (date), event_ts, viewer_id, post_id, author_id, relationship (friend | unconnected | other), view_duration_ms, session_id
- post_reactions: per-reaction logs
  - Example columns: ds (date), event_ts, reactor_id, post_id, author_id, reaction_type (like, comment, reshare, follow, hide, report, ...)
Assume relationship is defined at view time from the viewer's perspective (the viewer who may later react) and can change over time. The example post_reactions columns include no relationship field, so attribute each reaction using ds and the relationship logged in info_stream_views for the reacting viewer on that day.
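A minimal sketch of that attribution rule, assuming the join keys are ds, post_id, and reactor_id = viewer_id (an inference from the example columns, not something the prompt states). The sample rows and the "unmatched" label are hypothetical.

```python
# Hypothetical rows mirroring the example columns of the two tables.
sample_views = [
    {"ds": "2025-08-26", "viewer_id": 1, "post_id": 10, "relationship": "friend"},
    {"ds": "2025-08-27", "viewer_id": 1, "post_id": 10, "relationship": "unconnected"},
    {"ds": "2025-08-26", "viewer_id": 2, "post_id": 11, "relationship": "unconnected"},
]
sample_reactions = [
    {"ds": "2025-08-26", "reactor_id": 1, "post_id": 10, "reaction_type": "like"},
    {"ds": "2025-08-27", "reactor_id": 1, "post_id": 10, "reaction_type": "comment"},
    {"ds": "2025-08-26", "reactor_id": 3, "post_id": 12, "reaction_type": "like"},
]


def attribute_reactions(views, reactions):
    """Tag each reaction with the relationship logged for (ds, viewer, post)."""
    rel_by_key = {}
    for v in views:
        key = (v["ds"], v["viewer_id"], v["post_id"])
        # Assumption: if a key has several views, keep the first one seen.
        rel_by_key.setdefault(key, v["relationship"])

    attributed = []
    for r in reactions:
        key = (r["ds"], r["reactor_id"], r["post_id"])
        # Reactions with no same-day view row get an explicit label
        # instead of being dropped silently.
        attributed.append({**r, "relationship": rel_by_key.get(key, "unmatched")})
    return attributed


attributed = attribute_reactions(sample_views, sample_reactions)
```

Keying on ds means the same viewer and post can be attributed to different relationships on different days, which is what "relationship can change over time" requires.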
Tasks
- Metrics (precise definitions and formulas)
  - Define production-ready metrics using only info_stream_views and post_reactions, including:
    - Per-user-per-day social reactions rate.
    - Reactions per post impression by relationship (Friend vs Unconnected).
    - Unique-reactor rate.
    - Follow and reshare rates (creator growth signals).
    - Hide/report rates as quality guardrails.
  - Detail how each reaction is attributed to the viewer's relationship on that day.
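One way the metrics listed above could be computed, as a hedged sketch rather than a reference definition. It assumes reactions have already been tagged with a relationship field, that "social" reactions are like, comment, and reshare (as in the hypothesis), and that view rows are the impression denominator. The function name and output keys are hypothetical.

```python
from collections import Counter

SOCIAL = {"like", "comment", "reshare"}  # social engagement per the hypothesis
NEGATIVE = {"hide", "report"}            # guardrail reactions


def metrics_by_relationship(views, attributed_reactions):
    """Compute per-relationship rates from view rows and tagged reactions."""
    impressions = Counter()
    viewers, user_days = {}, {}
    for v in views:
        rel = v["relationship"]
        impressions[rel] += 1
        viewers.setdefault(rel, set()).add(v["viewer_id"])
        user_days.setdefault(rel, set()).add((v["viewer_id"], v["ds"]))

    counts, reactors = {}, {}
    for r in attributed_reactions:
        rel = r["relationship"]
        counts.setdefault(rel, Counter())[r["reaction_type"]] += 1
        if r["reaction_type"] in SOCIAL:
            reactors.setdefault(rel, set()).add(r["reactor_id"])

    result = {}
    # Only relationships with at least one impression get a row, so
    # reactions without a matched view never enter a numerator here.
    for rel, n_imp in impressions.items():
        c = counts.get(rel, Counter())
        social = sum(c[t] for t in SOCIAL)
        result[rel] = {
            "reactions_per_impression": social / n_imp,
            "reactions_per_user_day": social / len(user_days[rel]),
            "unique_reactor_rate": len(reactors.get(rel, set())) / len(viewers[rel]),
            "follow_rate": c["follow"] / n_imp,
            "reshare_rate": c["reshare"] / n_imp,
            "hide_report_rate": sum(c[t] for t in NEGATIVE) / n_imp,
        }
    return result


# Hypothetical one-day sample.
d = "2025-08-26"
metric_views = [
    {"ds": d, "viewer_id": 1, "post_id": 10, "relationship": "friend"},
    {"ds": d, "viewer_id": 2, "post_id": 10, "relationship": "friend"},
    {"ds": d, "viewer_id": 1, "post_id": 11, "relationship": "friend"},
    {"ds": d, "viewer_id": 3, "post_id": 12, "relationship": "friend"},
    {"ds": d, "viewer_id": 1, "post_id": 20, "relationship": "unconnected"},
    {"ds": d, "viewer_id": 4, "post_id": 20, "relationship": "unconnected"},
]
metric_reactions = [
    {"ds": d, "reactor_id": 1, "post_id": 10, "reaction_type": "like", "relationship": "friend"},
    {"ds": d, "reactor_id": 1, "post_id": 11, "reaction_type": "comment", "relationship": "friend"},
    {"ds": d, "reactor_id": 2, "post_id": 10, "reaction_type": "reshare", "relationship": "friend"},
    {"ds": d, "reactor_id": 4, "post_id": 20, "reaction_type": "hide", "relationship": "unconnected"},
    {"ds": d, "reactor_id": 1, "post_id": 20, "reaction_type": "follow", "relationship": "unconnected"},
]
metrics = metrics_by_relationship(metric_views, metric_reactions)
```

Each rate here is a ratio of sums. A production definition would also need to say whether impressions are deduplicated per viewer, post, and day before counting.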
- Observational validation (2025-08-26 to 2025-09-01)
  - Design an analysis comparing Friend vs Unconnected engagement while mitigating confounding (e.g., control for post age, viewer activity, author popularity, content-type proxies, time-of-day).
  - Propose a model such as fixed-effects regression or propensity/matching at the viewer–post level.
  - Specify: unit of analysis, covariates, handling multiple views per post/viewer/day (e.g., aggregate to MAX duration), and treatment of reactions lacking a matched view row.
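The data-preparation choices in the last bullet could be sketched as follows. This assumes a viewer-post-day unit of analysis, a binary "reacted" outcome, and that unmatched reactions are set aside for a sensitivity check rather than dropped. The function names and output fields are hypothetical.

```python
def collapse_views(views):
    """Collapse to one row per (ds, viewer_id, post_id) with MAX view duration."""
    units = {}
    for v in views:
        key = (v["ds"], v["viewer_id"], v["post_id"])
        row = units.get(key)
        if row is None:
            units[key] = {
                # Assumption: relationship is taken from the first view of the day.
                "relationship": v["relationship"],
                "max_duration_ms": v["view_duration_ms"],
                "n_views": 1,
                "reacted": 0,
            }
        else:
            row["max_duration_ms"] = max(row["max_duration_ms"], v["view_duration_ms"])
            row["n_views"] += 1
    return units


def attach_reactions(units, reactions, social=("like", "comment", "reshare")):
    """Mark units with a social reaction; return reactions with no view row."""
    unmatched = []
    for r in reactions:
        if r["reaction_type"] not in social:
            continue  # guardrail reactions would be handled separately
        key = (r["ds"], r["reactor_id"], r["post_id"])
        if key in units:
            units[key]["reacted"] = 1
        else:
            unmatched.append(r)
    return unmatched


# Hypothetical sample: viewer 1 sees post 10 twice on the same day.
day = "2025-08-26"
obs_views = [
    {"ds": day, "viewer_id": 1, "post_id": 10, "relationship": "friend", "view_duration_ms": 1200},
    {"ds": day, "viewer_id": 1, "post_id": 10, "relationship": "friend", "view_duration_ms": 4500},
    {"ds": day, "viewer_id": 2, "post_id": 20, "relationship": "unconnected", "view_duration_ms": 800},
]
obs_reactions = [
    {"ds": day, "reactor_id": 1, "post_id": 10, "reaction_type": "like"},
    {"ds": day, "reactor_id": 9, "post_id": 99, "reaction_type": "comment"},
    {"ds": day, "reactor_id": 2, "post_id": 20, "reaction_type": "hide"},
]
units = collapse_views(obs_views)
unmatched = attach_reactions(units, obs_reactions)
```

The resulting rows (relationship, max_duration_ms, n_views, reacted) are the kind of table a fixed-effects or matching model would take as input. Reporting the share of unmatched reactions separately shows how much engagement the view-based denominator misses.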
- First-time launch of unconnected content (experiment design)
  - Randomization unit (user-level), treatment variants (e.g., 0%, 10%, 30% unconnected slots in Info stream).
  - Primary success metrics (e.g., net social reactions per DAU, reactions per session, time to first friend interaction).
  - Guardrails (retention, session length, hide/report rates, creator follows, long-term re-engagement) and minimal acceptable lifts.
  - Address novelty effects, learning/personalization ramp, supply constraints, and network/peer interference.
  - Include a power/duration check, key segment analyses (e.g., new vs power users; high friend-graph density), and a diff-in-diff or CUPED adjustment plan using pre-exposure baselines.
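For the power check and CUPED adjustment in the last bullet, a minimal sketch using the standard two-sample normal approximation and the usual CUPED form (subtract theta times the centered pre-period covariate). The parameter values and function names are illustrative assumptions, not figures from the prompt.

```python
import math
from statistics import NormalDist, mean


def users_per_arm(sigma, min_detectable_diff, alpha=0.05, power=0.8):
    """Users per arm for a two-sided difference-in-means test (normal approx.)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * sigma ** 2 / min_detectable_diff ** 2
    return math.ceil(n)


def cuped_adjust(y, x):
    """Return y adjusted by the pre-exposure covariate x (same user order).

    theta = cov(x, y) / var(x); in practice theta is usually estimated on
    users pooled across arms so the adjustment does not bias the contrast.
    """
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    theta = cov_xy / var_x if var_x else 0.0
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


# Hypothetical inputs: per-user reactions with standard deviation 2.0 and a
# minimum detectable difference of 0.1 reactions per user.
n_needed = users_per_arm(sigma=2.0, min_detectable_diff=0.1)

# Hypothetical pre-period (x) and in-experiment (y) reactions per user.
pre = [1.0, 2.0, 3.0, 4.0, 5.0]
post = [2.1, 3.9, 6.2, 7.8, 10.0]
post_adj = cuped_adjust(post, pre)
```

Dividing the required users per arm by the daily number of newly eligible users gives a rough minimum duration. The adjustment leaves the mean unchanged and shrinks variance when the pre-period metric correlates with the outcome, which shortens that duration.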