Group Health Metrics and Experiment Design for Threaded Comments
Context
You are a Data Scientist working on a platform with Groups that range from small hobby clubs to very large public communities. Leadership wants a single, comparable "Group health" score across all Groups. Engineering plans to launch Reddit‑style threaded comments to encourage deeper discussions. You must:
- Propose health metrics (and define them clearly).
- Compare performance between small and large Groups fairly.
- Design a test to validate whether threaded comments increase deep interaction.
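One way to make the fairness requirement concrete is to express activity as per-member rates and then rank each Group only against Groups of similar size. The sketch below is an illustration under stated assumptions, not a prescribed scoring method: the field names (`members`, `weekly_active_members`, `weekly_comments`), the two example metrics, and the size-bucket cut-offs are all hypothetical and would need to be tuned to the platform's real data.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical size buckets (inclusive upper bound on member count, label).
# The cut-offs are assumptions; in practice derive them from the size distribution.
SIZE_BUCKETS = [(100, "small"), (10_000, "medium"), (float("inf"), "large")]


def size_bucket(members):
    """Return the label of the first bucket whose upper bound covers `members`."""
    for upper, label in SIZE_BUCKETS:
        if members <= upper:
            return label


def per_member_rates(group):
    """Turn raw counts into rates so that Group size does not dominate the metric."""
    active = group["weekly_active_members"]
    return {
        # Share of members who were active this week.
        "active_rate": active / group["members"],
        # Comments per active member; max(..., 1) avoids division by zero.
        "comments_per_active": group["weekly_comments"] / max(active, 1),
    }


def percentile_within_bucket(groups, metric):
    """Rank each Group's rate against Groups in the same size bucket.

    Returns {group id: fraction of same-bucket Groups with a value <= this one}.
    """
    by_bucket = defaultdict(list)
    for g in groups:
        by_bucket[size_bucket(g["members"])].append(per_member_rates(g)[metric])
    for values in by_bucket.values():
        values.sort()

    ranks = {}
    for g in groups:
        values = by_bucket[size_bucket(g["members"])]
        value = per_member_rates(g)[metric]
        ranks[g["id"]] = bisect_right(values, value) / len(values)
    return ranks
```

Ranking within a bucket means a small club is compared with other small clubs rather than with a large public community, and the resulting percentile is on the same 0 to 1 scale for every Group. Rates from very small Groups are noisy, so a real score would likely also need smoothing or a minimum-size threshold.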
Questions
- Which metrics would you track to measure Group health?
- How would you compare performance between large and small Groups so the comparison is fair and scale‑normalised?
- If Reddit‑style comment threads are introduced, how would you test for interaction lift (e.g., A/B design, retention, depth of discussion)?
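As a starting point for the last question, the sketch below shows one possible analysis for an A/B test in which whole Groups, rather than individual users, are assigned to threaded or flat comments. Randomising by Group is an assumption made here because members of the same Group see each other's comments, so user-level assignment would leak between arms. The outcome metric (mean comments per post, aggregated to one value per Group) is also an assumption, chosen because it can be measured in both arms; nesting depth itself is not comparable when the control arm has no nesting. The test used is a two-sided permutation test on the difference in means, not a t-test.

```python
import random
from statistics import mean


def mean_comments_per_post(comment_counts):
    """Aggregate one Group's per-post comment counts into a single Group-level value."""
    return mean(comment_counts) if comment_counts else 0.0


def permutation_test(treatment, control, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in means.

    `treatment` and `control` hold one value per Group (the randomisation unit).
    Returns (observed difference, estimated p-value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)

    extreme = 0
    for _ in range(n_permutations):
        # Re-assign arm labels at random, keeping the arm sizes fixed.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treat]) - mean(pooled[n_treat:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_permutations + 1)
```

A call such as `permutation_test(treated_values, control_values)` returns the observed lift and its p-value, where each list holds one `mean_comments_per_post` result per Group. Stratifying the random assignment by Group size, and applying the same procedure to a retention metric, would address the other parts of the question.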