Design an A/B Testing Platform (Architecture + Experiment Science)
Context
You are designing an A/B testing platform for a large-scale consumer web/mobile product. The platform must support millions of users, low-latency assignment, privacy compliance, and both real-time and batch analytics. Multiple experiments can run concurrently across different product surfaces.
Requirements
Design the platform end-to-end to support:
- Experiment definition and configuration (namespaces/layers, eligibility/targeting, traffic allocation, variants, start/stop).
- Deterministic randomization and bucketing with sticky assignment and unit consistency across devices/sessions.
- Exposure logging and event telemetry with deduplication and identity stitching.
- Metric computation (batch + streaming), including definitions for conversions, retention, ratios, quantiles, and experiment-scoped windows.
- Incremental rollout, governance, and guardrails (e.g., sample ratio mismatch (SRM) checks, kill switches, safety metrics).
- Bias avoidance and experiment hygiene (triggering, intent-to-treat, overlap management, A/A tests).
- Statistical analysis and diagnostics (power, variance reduction, confidence intervals/p-values, sequential monitoring, multiple testing, cluster-robust errors, diagnostics dashboards).
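As a point of reference for the deterministic bucketing requirement above, the following is a minimal sketch, not a prescribed solution. It assumes a hash-based scheme in which each layer has its own salt and traffic is split over a fixed number of buckets; the names `bucket_for`, `assign_variant`, `NUM_BUCKETS`, and the salt strings are hypothetical and chosen only for illustration.

```python
import hashlib

# Assumption: 10,000 buckets per layer, giving 0.01% allocation granularity.
NUM_BUCKETS = 10_000


def bucket_for(unit_id: str, layer_salt: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a randomization unit to a bucket, deterministically.

    The same (unit_id, layer_salt) pair always yields the same bucket, which
    gives sticky assignment without storing state. Using a different salt per
    layer makes bucket positions in one layer independent of another layer.
    """
    digest = hashlib.sha256(f"{layer_salt}:{unit_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce to a bucket.
    return int.from_bytes(digest[:8], "big") % num_buckets


def assign_variant(unit_id: str, layer_salt: str, allocation):
    """Return the variant for a unit, or None if the unit is not in the experiment.

    `allocation` is an ordered list of (variant_name, bucket_count) pairs.
    Buckets beyond the allocated total are left unassigned (holdout traffic).
    """
    bucket = bucket_for(unit_id, layer_salt)
    upper = 0
    for name, count in allocation:
        upper += count
        if bucket < upper:
            return name
    return None
```

For example, `assign_variant("user-123", "checkout-layer", [("control", 500), ("treatment", 500)])` would place roughly 5% of units in each arm and return `None` for the rest. Ramping up by appending buckets to each arm's range, rather than reshuffling, is one way to keep already-assigned units in their original variant.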
In your answer, describe:
- Bucketing and traffic allocation
- Unit of randomization and unit consistency
- Incremental rollout and guardrails
- Bias avoidance practices
- Statistical analysis and diagnostics
- A high-level architecture and data flow
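To make the SRM guardrail named in the requirements concrete, here is a small sketch of a sample ratio mismatch check for a two-arm experiment. It assumes a chi-square goodness-of-fit test with one degree of freedom; the function name `srm_p_value` and the alert threshold of 0.001 are illustrative assumptions, not part of the prompt.

```python
import math

# Assumption: a strict alert threshold, since SRM checks typically run on
# every experiment and false alarms are costly.
SRM_ALERT_THRESHOLD = 0.001


def srm_p_value(control_count: int, treatment_count: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value for a two-arm traffic split.

    A very small p-value suggests the observed split differs from the
    configured one, which usually points to an assignment or logging bug.
    """
    total = control_count + treatment_count
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control_count - expected_control) ** 2 / expected_control
            + (treatment_count - expected_treatment) ** 2 / expected_treatment)
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(control_count: int, treatment_count: int,
            expected_control_share: float = 0.5) -> bool:
    """Flag the experiment when the split is implausible under the design."""
    p = srm_p_value(control_count, treatment_count, expected_control_share)
    return p < SRM_ALERT_THRESHOLD
```

A split of 50,000 versus 51,500 units under a 50/50 design would be flagged by this check, while 5,000 versus 5,050 would not. Counts should be taken per randomization unit (not per event) so that heavy users do not distort the ratio.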