You are considering launching Group Video Calls. Answer all parts precisely; justify choices with pros/cons and formulas where relevant.
-
Clarify CTR comparison: For a new entry-point button, should the primary CTR comparison be (a) Treatment vs Control over the same time window (between-subjects), or (b) Treatment users’ Month 1 (M1, the first month with the button) vs their own baseline month six months earlier (M6) (within-subjects)? Specify when each is valid, the risks of bias (seasonality, maturation), and propose a preferred estimator. Provide formulas for: (i) simple difference vs control, (ii) within-user pre/post, and (iii) difference-in-differences combining both. State the exact windows you’d use (e.g., M1 = 2025-09-01 to 2025-09-30, M6 = 2025-03-01 to 2025-03-31) and how you’d handle users without a full baseline.
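As a reference point for the three formulas requested above, here is a minimal sketch of the estimators on aggregate CTRs. The function names and the click/impression counts are hypothetical and purely illustrative; "post" stands for the M1 window and "pre" for the M6 baseline window.

```python
def ctr(clicks, impressions):
    """Click-through rate; returns 0.0 when there are no impressions."""
    return clicks / impressions if impressions else 0.0


def simple_difference(ctr_t_post, ctr_c_post):
    """(i) Between-subjects: Treatment minus Control in the same window."""
    return ctr_t_post - ctr_c_post


def pre_post(ctr_t_post, ctr_t_pre):
    """(ii) Within-subjects: Treatment in M1 minus the same users' baseline."""
    return ctr_t_post - ctr_t_pre


def diff_in_diff(ctr_t_post, ctr_t_pre, ctr_c_post, ctr_c_pre):
    """(iii) Treatment's change minus Control's change over the same windows."""
    return (ctr_t_post - ctr_t_pre) - (ctr_c_post - ctr_c_pre)


# Hypothetical aggregate counts (clicks, impressions), not real data.
t_post = ctr(600, 10_000)
t_pre = ctr(400, 10_000)
c_post = ctr(500, 10_000)
c_pre = ctr(400, 10_000)

print(simple_difference(t_post, c_post))
print(pre_post(t_post, t_pre))
print(diff_in_diff(t_post, t_pre, c_post, c_pre))
```

In this made-up example the pre/post estimate is larger than the other two because Control's CTR also rose between windows; the difference-in-differences estimate removes that shared drift, under the assumption that both groups would otherwise have trended in parallel.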
-
Experiment design: Propose a full A/B test (or cluster-randomized test) for Group Calls that accounts for interference/network effects (users call each other). Specify the randomization unit (user, household, call-graph clusters, geography), allocation, stratification (e.g., country, device), and ramp plan. Define primary success metric(s) and why (e.g., incremental video-call minutes per DAU, completed-call rate), and guardrails (e.g., app crashes, latency, churn). Provide decision thresholds and a stopping plan (power, MDE, alpha, sequential corrections).
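For the power/MDE part of the stopping plan, one common starting point is the normal-approximation sample size for a two-sided, two-sample comparison of means, inflated by a design effect when randomizing clusters. The sketch below assumes equal allocation and equal variances; `sigma`, `mde`, `mean_cluster_size`, and `icc` are hypothetical inputs that would have to be estimated from historical data.

```python
import math
from statistics import NormalDist


def n_per_arm(sigma, mde, alpha=0.05, power=0.8):
    """Units per arm: n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / mde^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / mde ** 2)


def design_effect(mean_cluster_size, icc):
    """Variance inflation for cluster randomization: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


# Hypothetical inputs: metric SD of 10 minutes, MDE of 1 minute per DAU.
base_n = n_per_arm(sigma=10.0, mde=1.0)
deff = design_effect(mean_cluster_size=5, icc=0.1)
clustered_n = math.ceil(base_n * deff)
print(base_n, deff, clustered_n)
```

This does not include any sequential-testing correction; if interim looks are planned, the alpha used here would need to be adjusted accordingly.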
-
Interference handling: Calls involve multiple users who might land in different variants. Describe a design that avoids contamination (e.g., cluster-by-ego-network, geo switchback). Explain how you’ll attribute a group call to a variant, and how you’ll analyze spillovers; include at least one robustness check (e.g., exposure reweighting, CUPED, cluster-robust SEs).
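One possible way to make the attribution and robustness questions concrete is sketched below. It assumes a hypothetical rule that a call belongs to the initiator's variant, records what share of participants share that variant as a contamination measure, and then analyzes at the cluster level (the unit of randomization) rather than per user. All names and numbers are illustrative.

```python
import math
from statistics import mean, variance


def attribute_call(initiator, participants, assignment):
    """Attribute a call to the initiator's variant (an assumed rule).

    `participants` includes the initiator. Returns the variant and the
    share of participants in that same variant (1.0 means no mixing).
    """
    variant = assignment[initiator]
    same = sum(1 for p in participants if assignment[p] == variant)
    return variant, same / len(participants)


def cluster_level_diff(cluster_means_t, cluster_means_c):
    """Difference in mean of cluster-level means, with an unpooled SE.

    Aggregating to one value per cluster keeps the standard error
    honest when users inside a cluster are correlated.
    """
    diff = mean(cluster_means_t) - mean(cluster_means_c)
    se = math.sqrt(
        variance(cluster_means_t) / len(cluster_means_t)
        + variance(cluster_means_c) / len(cluster_means_c)
    )
    return diff, se


# Hypothetical assignment and call.
assignment = {"a": "T", "b": "T", "c": "C"}
variant, purity = attribute_call("a", ["a", "b", "c"], assignment)

# Hypothetical per-cluster call minutes per DAU.
diff, se = cluster_level_diff([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(variant, purity, diff, se)
```

Calls with purity below 1.0 are the ones where spillover is possible; comparing results with and without them is one simple sensitivity check.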
-
Launch decision resources: Besides the experiment, list external/internal resources you’d consult before greenlighting group calls (market research, competitor benchmarks, support tickets, qualitative UX studies, capacity/SRE constraints, cost models) and explain what each would change in your threshold for launch.
-
Set participant limit: Propose a data-driven method to choose a max participants-per-call limit (e.g., 4/8/16) under infra constraints. Outline how you’d simulate expected concurrency using historical call distributions, model QoS (latency, failure rate) as a function of N, and run a multi-arm experiment to pick the limit. Include guardrails and rollback criteria.
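The concurrency simulation can be sketched as a sweep over call start and end events. The example below assumes a hypothetical list of (start, duration, desired participants) tuples, for instance resampled from historical call logs, and caps each call at the candidate limit; it only counts participants and does not model QoS.

```python
def peak_concurrent_participants(calls, max_participants):
    """Peak number of simultaneous participants under a per-call cap.

    `calls` is a list of (start, duration, participants) tuples in any
    consistent time unit. Calls ending at time t are released before
    calls starting at t are counted.
    """
    events = []
    for start, duration, n in calls:
        capped = min(n, max_participants)
        events.append((start, capped))
        events.append((start + duration, -capped))
    # Sort by time; at equal times negative deltas (ends) come first.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical demand: (start minute, duration in minutes, desired size).
calls = [(0, 10, 4), (5, 10, 12), (10, 5, 3)]
for limit in (4, 8, 16):
    print(limit, peak_concurrent_participants(calls, limit))
```

Running this for each candidate limit gives a peak load per arm that can be compared against capacity before any multi-arm experiment is started.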
-
Edge cases: Explicitly cover novelty effects, day-of-week effects, France-only vs global rollouts, and how you’d ensure analysis uses the correct absolute dates (e.g., yesterday = 2025-08-31) and local time handling.
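For the date-handling point, a small sketch of resolving "yesterday" in a user's local time zone and converting that local day into UTC bounds for querying event logs is shown below. The helper names are hypothetical, and the fixed `now_utc` is chosen so that "yesterday" in France is 2025-08-31, matching the example above.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def local_yesterday(now_utc, tz_name):
    """Calendar date of 'yesterday' as seen in the given time zone."""
    local_today = now_utc.astimezone(ZoneInfo(tz_name)).date()
    return local_today - timedelta(days=1)


def local_day_bounds_utc(day, tz_name):
    """Half-open UTC interval [start, end) covering one local calendar day."""
    tz = ZoneInfo(tz_name)
    start = datetime.combine(day, time.min, tzinfo=tz)
    end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tz)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


# Pin "now" explicitly instead of calling datetime.now() inside the analysis.
now_utc = datetime(2025, 8, 31, 23, 30, tzinfo=timezone.utc)

paris_day = local_yesterday(now_utc, "Europe/Paris")
la_day = local_yesterday(now_utc, "America/Los_Angeles")
start_utc, end_utc = local_day_bounds_utc(paris_day, "Europe/Paris")
print(paris_day, la_day, start_utc, end_utc)
```

At the same UTC instant, the two zones disagree on which date "yesterday" is, which is why a France-only analysis and a global one should each state the time zone used to cut daily windows.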