This question evaluates experimental design and statistical reasoning for real-time calling services, focusing on two-sided treatment exposure, interference, metric definition, clustering/repeat-measures handling, and power/sample-size computation.

You join the Calling organization. A PM proposes enabling an adaptive codec that activates on unstable networks to reduce call drops. The codec is two-sided: it can only run when both participants support it. Your task is to design and size an A/B test that accounts for two-sided exposure and network interference.
Assume we are testing on 1:1 calls (audio and/or video). Unless stated otherwise, exclude employee/test accounts and spam/abuse traffic. Assume the baseline drop rate below refers to calls under unstable network conditions (the codec's target population).
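To make the two-sided constraint concrete, here is a minimal sketch of how calls would split across arm combinations if each user were assigned independently. It assumes that assignment probability is the same for every user and that who calls whom is independent of assignment; the function name `exposure_mix` and the 50% split are illustrative, not part of the prompt.

```python
def exposure_mix(treated_share):
    """Expected share of 1:1 calls by arm combination.

    Assumes (hypothetically) that each user is assigned to treatment
    independently with probability `treated_share`, and that call
    partners are chosen independently of assignment.
    """
    p = treated_share
    return {
        "both_treated": p * p,            # codec can run if it needs both sides
        "cross_arm": 2 * p * (1 - p),     # one side treated, one side control
        "both_control": (1 - p) * (1 - p),
    }


mix = exposure_mix(0.5)
for combination, share in mix.items():
    print(f"{combination}: {share:.2f}")
```

Under these assumptions, a user-level 50/50 split leaves only a quarter of calls with both sides treated and half of calls straddling the arms, which is why the choice of randomization unit and exposure logic matters here.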
(a) Choose the randomization unit (caller-level, callee-level, dyad-level, or geo cluster) and justify it under two-sided exposure/interference.
(b) Define primary and guardrail metrics precisely, and specify exposure logic (feature on only when both sides are treated vs when caller alone is treated).
(c) Outline a ramp plan and spillover checks.
(d) Handle non-independence (repeat callers) and seasonality.
(e) Provide a power/SST back-of-the-envelope. Assume:
Finally, specify the estimand and estimator (e.g., ITT at caller-level with cluster-robust SEs), and how you will diagnose interference (e.g., cross-arm caller–callee edges).
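As a rough illustration of the sizing step, the sketch below applies the standard two-proportion normal approximation and inflates the result by a design effect of 1 + (m - 1) * ICC to account for repeat callers. Every input here (baseline drop rate, relative reduction, calls per caller, intra-caller correlation) is a hypothetical placeholder, since the prompt's actual assumptions are not reproduced above, and `calls_per_arm` is an illustrative name.

```python
from math import ceil
from statistics import NormalDist


def calls_per_arm(p_control, rel_reduction, alpha=0.05, power=0.80,
                  calls_per_cluster=1.0, icc=0.0):
    """Approximate calls needed per arm for a two-sided two-proportion z-test.

    The independent-sample size is multiplied by the design effect
    1 + (calls_per_cluster - 1) * icc to reflect clustering of calls
    within callers. All default and example inputs are placeholders.
    """
    p_treat = p_control * (1.0 - rel_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1.0 - p_control) + p_treat * (1.0 - p_treat)
    n_independent = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treat) ** 2
    design_effect = 1.0 + (calls_per_cluster - 1.0) * icc
    return ceil(n_independent * design_effect)


# Hypothetical inputs: 5% baseline drop rate, 10% relative reduction.
n_iid = calls_per_arm(0.05, 0.10)
# Same effect, but assuming 5 calls per caller and an intra-caller ICC of 0.1.
n_clustered = calls_per_arm(0.05, 0.10, calls_per_cluster=5, icc=0.1)
print(n_iid, n_clustered)
```

The design-effect adjustment is a back-of-the-envelope stand-in for cluster-robust standard errors at analysis time. It also ignores dilution from calls where the codec never activates, which would raise the required sample further.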