Goal
Design a principled weighting scheme for impression-level actions to construct a socialness score
S = w_like · Likes + w_comment · Comments + w_share · Shares
that predicts a downstream binary outcome Y (e.g., viewer sends a message or follows the author within 7 days).
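As a concrete reading of the score definition, here is a minimal sketch of computing S for one impression. The function name `socialness_score`, the dictionary keys, and the example weights are illustrative assumptions, not estimated values.

```python
from typing import Mapping

ACTIONS = ("likes", "comments", "shares")


def socialness_score(counts: Mapping[str, float],
                     weights: Mapping[str, float],
                     tol: float = 1e-9) -> float:
    """Return S = sum_j w_j * x_j after checking that w lies on the simplex."""
    if any(weights[a] < 0 for a in ACTIONS):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights[a] for a in ACTIONS) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    return sum(weights[a] * counts[a] for a in ACTIONS)


# Hypothetical weights and counts, chosen only to illustrate the arithmetic.
example_weights = {"likes": 0.2, "comments": 0.3, "shares": 0.5}
example_counts = {"likes": 3, "comments": 1, "shares": 0}
score = socialness_score(example_counts, example_weights)
```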
Requirements
- Estimate non-negative weights under a normalization constraint (sum to 1), so that S best predicts Y.
- Provide uncertainty for each weight (confidence intervals) and tests for weight differences.
- Handle multicollinearity between actions and sparsity (rare actions).
- Validate out-of-sample via cross-validation, and check sensitivity to alternative targets (e.g., D+1 retention).
- Outline one frequentist method (e.g., constrained logistic regression with standardized covariates and a probability-to-weight mapping) and one Bayesian alternative (e.g., hierarchical prior with Dirichlet or log-normal/softmax constraints), and explain how to compare them.
- State assumptions and explain how violations would bias S.
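One possible shape for the frequentist requirement is sketched below: a logistic regression with non-negative coefficients fitted by projected gradient descent, followed by normalizing the coefficients to sum to 1. This is a sketch under stated assumptions, not a prescribed method. The synthetic data, the learning rate, the iteration count, and every function name here are hypothetical, and scaling by standard deviation without centering is one choice among several.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def scale_columns(X):
    """Divide each column by its standard deviation (no centering, so 0 still means 'no action')."""
    n, p = len(X), len(X[0])
    scales = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        scales.append(sd if sd > 0 else 1.0)
    return [[v / s for v, s in zip(row, scales)] for row in X], scales


def fit_nonneg_logistic(X, y, lr=0.3, n_iter=500):
    """Projected gradient descent on the mean log-loss with beta_j >= 0."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = sigmoid(intercept + sum(b * v for b, v in zip(beta, xi))) - yi
            g0 += err
            for j in range(p):
                g[j] += err * xi[j]
        intercept -= lr * g0 / n
        # Projection step: clip negative coefficients to zero.
        beta = [max(0.0, b - lr * gj / n) for b, gj in zip(beta, g)]
    return intercept, beta


def to_simplex(beta):
    """Map non-negative coefficients to weights that sum to 1."""
    total = sum(beta)
    if total == 0:
        return [1.0 / len(beta)] * len(beta)  # fallback: no signal, uniform weights
    return [b / total for b in beta]


# Synthetic impressions (assumed rates and effects, for illustration only).
rng = random.Random(0)
X_raw, y = [], []
for _ in range(400):
    likes = 1.0 if rng.random() < 0.30 else 0.0
    comments = 1.0 if rng.random() < 0.10 else 0.0
    shares = 1.0 if rng.random() < 0.05 else 0.0
    p_event = sigmoid(-2.0 + 0.5 * likes + 1.0 * comments + 2.0 * shares)
    X_raw.append([likes, comments, shares])
    y.append(1 if rng.random() < p_event else 0)

X_scaled, scales = scale_columns(X_raw)
intercept, beta = fit_nonneg_logistic(X_scaled, y)
weights = to_simplex(beta)  # weights apply to the scaled features
```

The resulting weights are on the standardized scale. Refitting on resamples drawn at the user level would be one way to obtain bootstrap intervals for each weight and for differences between weights.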
Context and Notation
- Data: impressions i = 1,...,N with action features X_i = (Likes_i, Comments_i, Shares_i), typically sparse and correlated.
- Target: Y_i ∈ {0,1} indicating whether a downstream event occurred within a fixed window.
- Objective: choose w = (w_like, w_comment, w_share), w_j ≥ 0, ∑_j w_j = 1, so that S_i = wᵀ X_i is maximally predictive of Y.
- Practical issues: different action scales, rare actions, potential leakage (ensure action window precedes Y window), repeated users/authors.
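Because the same user can appear in many impressions, one way to keep cross-validation honest is to assign folds by user rather than by impression. The sketch below assumes each impression is a dict with a hypothetical `user_id` field; the fold count and the hashing scheme are also assumptions. The same idea could be applied to authors.

```python
import zlib
from collections import defaultdict


def user_fold(user_id: str, n_folds: int = 5) -> int:
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    return zlib.crc32(user_id.encode("utf-8")) % n_folds


def grouped_folds(impressions, n_folds: int = 5):
    """Yield (train_idx, test_idx) pairs in which no user spans both sides."""
    by_fold = defaultdict(list)
    for idx, imp in enumerate(impressions):
        by_fold[user_fold(imp["user_id"], n_folds)].append(idx)
    for k in range(n_folds):
        test_idx = list(by_fold.get(k, []))
        train_idx = [i for f, idxs in by_fold.items() if f != k for i in idxs]
        yield train_idx, test_idx


# Hypothetical impressions: 12 users with 3 impressions each.
impressions = [{"user_id": f"user_{u}", "likes": 0, "comments": 0, "shares": 0}
               for u in range(12) for _ in range(3)]
splits = list(grouped_folds(impressions, n_folds=4))
```

Hash-based assignment does not balance fold sizes, so some folds may be small or empty when there are few users.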