Assessing Whether Friend Content Is "More Social" Than Unconnected Content
Context and Goal
You are given two platform logs: info_stream_views (every feed view of a post) and post_reactions (likes, comments, and reshares, the last logged as reaction_type = share). Using these, you must:
- Define what "more social" means with measurable metrics.
- Build a causal observational analysis to compare Friend vs Unconnected views.
- Propose an experiment to perturb ranking weights on relationship affinity to validate causality.
- Quantify the incremental value of Unconnected exposure beyond near-term engagement.
Assume "Friend" means content from a viewer's graph connections and "Unconnected" means content from creators the viewer neither follows nor is directly connected to.
Available Data (minimal assumptions)
- info_stream_views(view_id, viewer_id, post_id, author_id, source_type ∈ {Friend, Unconnected}, ts, session_id, rank_position, device_type, dwell_time_seconds)
- post_reactions(reaction_id, viewer_id, post_id, reaction_type ∈ {like, comment, share}, ts)
Part A — Metrics
Define and justify metrics for "more social," specify unit-of-analysis, normalization, and guardrails.
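As one possible starting point for Part A, the sketch below computes a reactions-per-view rate by source_type with the viewer as the unit of analysis. It assumes rows are already loaded as Python dicts keyed by the column names listed under Available Data; the function name reaction_rates is hypothetical. It also makes one simplifying assumption: a view counts as "reacted" if the same viewer reacted to the same post at any time, which ignores timestamps and repeated views.

```python
from collections import defaultdict

def reaction_rates(views, reactions):
    """Return {source_type: mean of per-viewer reaction rates}.

    views:     iterable of dicts with viewer_id, post_id, source_type
    reactions: iterable of dicts with viewer_id, post_id
    """
    # (viewer, post) pairs with at least one reaction of any type.
    # Assumption: timestamps are ignored, so attribution is approximate.
    reacted = {(r["viewer_id"], r["post_id"]) for r in reactions}

    # (source_type, viewer) -> [views, views_with_reaction]
    counts = defaultdict(lambda: [0, 0])
    for v in views:
        key = (v["source_type"], v["viewer_id"])
        counts[key][0] += 1
        if (v["viewer_id"], v["post_id"]) in reacted:
            counts[key][1] += 1

    # Average the per-viewer rates so heavy users do not dominate.
    per_source = defaultdict(list)
    for (source, _viewer), (n_views, n_reacted) in counts.items():
        per_source[source].append(n_reacted / n_views)
    return {s: sum(rates) / len(rates) for s, rates in per_source.items()}
```

Averaging per-viewer rates rather than pooling all views is a design choice, not a requirement: pooled rates answer "what fraction of views get a reaction," while viewer-averaged rates answer "what does the typical viewer experience."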
Part B — Observational Causal Plan
Identify confounders and describe matching/stratification or inverse-propensity weighting to compare Friend vs Unconnected views while holding confounders constant. Specify standard error clustering and handling of repeated measures and multiple comparisons.
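The stratification option in Part B can be illustrated with a minimal sketch. It assumes each view row already carries a 0/1 field named reacted (a hypothetical column produced by joining the two tables) and uses rank_position and device_type as example strata; a real analysis would likely add more confounders and bucket rank_position. Strata that lack either source type are dropped, which is a simple overlap check. Standard errors are not computed here; clustering by viewer_id would need to be layered on top, for example with a viewer-level bootstrap.

```python
from collections import defaultdict

def stratified_difference(rows, strata_keys=("rank_position", "device_type")):
    """Size-weighted Friend-minus-Unconnected difference in reaction rate.

    rows: iterable of dicts with source_type, reacted (0 or 1), and strata_keys.
    Returns (estimate, n_rows_used). Strata without both arms are skipped.
    """
    # stratum -> source_type -> [n_views, n_reacted]
    cells = defaultdict(lambda: {"Friend": [0, 0], "Unconnected": [0, 0]})
    for r in rows:
        stratum = tuple(r[k] for k in strata_keys)
        cell = cells[stratum][r["source_type"]]
        cell[0] += 1
        cell[1] += r["reacted"]

    weighted_sum, total = 0.0, 0
    for arms in cells.values():
        (n_f, k_f), (n_u, k_u) = arms["Friend"], arms["Unconnected"]
        if n_f == 0 or n_u == 0:
            continue  # no overlap in this stratum
        n = n_f + n_u
        weighted_sum += n * (k_f / n_f - k_u / n_u)
        total += n
    if total == 0:
        raise ValueError("no stratum contains both source types")
    return weighted_sum / total, total
```

Reporting n_rows_used alongside the estimate makes it visible how much data the overlap requirement discards.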
Part C — Experiment Design
Design an A/B test that adjusts ranking weights on relationship signals (Friend vs Unconnected). Define randomization unit, primary outcomes, guardrails, sample size/MDE, duration, pre-registration, and power methods. Address interference, novelty, and saturation. Propose a network-aware variant (e.g., post-level or graph-cluster randomization).
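For the sample size/MDE input in Part C, a rough sketch follows. It uses the standard normal-approximation formula for comparing two independent proportions and the usual design effect 1 + (m - 1) * ICC for cluster randomization. The function names and default alpha and power values are illustrative assumptions, and the baseline rate and intra-cluster correlation would have to come from historical data.

```python
from math import ceil
from statistics import NormalDist

def users_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.8):
    """Approximate users per arm to detect an absolute lift of mde_abs
    in a proportion, using a two-sided test at level alpha."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)

def design_effect(mean_cluster_size, icc):
    """Variance inflation when randomizing clusters instead of individuals."""
    return 1 + (mean_cluster_size - 1) * icc

def clustered_users_per_arm(baseline_rate, mde_abs, mean_cluster_size, icc,
                            alpha=0.05, power=0.8):
    """Inflate the individual-level sample size by the design effect."""
    n = users_per_arm(baseline_rate, mde_abs, alpha, power)
    return ceil(n * design_effect(mean_cluster_size, icc))
```

For example, users_per_arm(0.10, 0.01) gives the per-arm count for detecting a one-point absolute lift on a 10% baseline, and clustered_users_per_arm applies the inflation needed for a graph-cluster variant.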
Part D — Value of Unconnected Beyond Engagement
Define and measure discovery value (new creators reached, diversity), long-term retention/session depth, and reshare-driven reach. Propose proxy measurements using only the two tables and call out additional logs needed.
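Two of the discovery proxies in Part D can be sketched from info_stream_views alone. The first counts first-time (viewer, author) exposures by the source_type of that first view; the second measures author diversity as Shannon entropy over viewed authors. Function names are hypothetical, rows are assumed to be dicts keyed by the listed columns, and "first-time" means first within the supplied log window, so earlier exposures outside the window are not visible.

```python
from collections import Counter
from math import log

def new_creators_by_source(views):
    """Count first (viewer, author) exposures by source_type of the first view.

    views: iterable of dicts with viewer_id, author_id, source_type, ts.
    """
    seen = set()
    counts = Counter()
    for v in sorted(views, key=lambda row: row["ts"]):
        pair = (v["viewer_id"], v["author_id"])
        if pair not in seen:
            seen.add(pair)
            counts[v["source_type"]] += 1
    return dict(counts)

def author_entropy(views):
    """Shannon entropy (in nats) of the author distribution across views.

    Higher values mean exposure is spread over more authors more evenly.
    """
    counts = Counter(v["author_id"] for v in views)
    total = sum(counts.values())
    return -sum((c / total) * log(c / total) for c in counts.values())
```

Computing author_entropy separately per viewer, and then comparing viewers with high versus low Unconnected share, is one way to turn the diversity proxy into a KPI.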
Deliverables
a) A metric spec with formulas.
b) An observational analysis plan with controls and diagnostics.
c) An experiment design doc with randomization unit, power inputs, and stopping rules.
d) A KPI set quantifying incremental value of Unconnected content even if near-term engagement is lower.