Experiment Design: Outlook vs Gmail Deliverability to a Specific Enterprise Domain
Context
You need to determine whether sending from Outlook achieves higher deliverability than sending from Gmail when emailing a specific enterprise domain. Assume you control both sending setups and a pool of seed inboxes under the enterprise domain (including multiple subdomains). The system can instrument authentication, capture bounces, poll mailboxes via API/IMAP, and log events with synchronized clocks.
Task
Design a rigorous experiment that includes:
- Hypotheses and estimands.
- Experimental unit and randomization scheme that balances time-of-day and content across providers.
- A two-stage sequential design with an interim at 500 sends (baseline failure rate unknown).
- Instrumentation details: SPF, DKIM, DMARC alignment; return-path domain; seed inboxes across subdomains; per-message signed tokens; server-side web beacons.
- Metrics: primary (delivered-to-inbox within 5 minutes of send); secondary (time-to-first-inbox, spam-folder rate, hard-bounce rate).
- Analysis plan using either:
  - A difference-in-proportions test with continuity correction and multiplicity control, or
  - A Bayesian beta-binomial with skeptical priors, including decision thresholds and stopping rules.
- Plans to detect and mitigate confounders (content drift, throttling, out-of-office bursts, holiday effects) and to assess heterogeneity by recipient subdomain.
- Procedures to validate assumptions and generalize results to future campaigns.
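
To make the two analysis options concrete, here is a minimal Python sketch of what each computation might look like for the primary metric (inboxed within 5 minutes). It is illustrative only and not a prescribed solution. The function names, the Beta(5, 5) prior, and the counts in the usage note are assumptions introduced for illustration. Using the same moderately informative prior for both arms is one simple way to encode skepticism, since it pulls the posterior difference toward zero. Multiplicity control, decision thresholds, and stopping rules are left to the design.

```python
import math
import random

# Hypothetical skeptical prior shared by both arms (an assumption, not from the task).
PRIOR_ALPHA, PRIOR_BETA = 5.0, 5.0


def two_proportion_z_cc(x1, n1, x2, n2):
    """Two-sided pooled z-test for p1 - p2 with Yates continuity correction.

    x1, x2 are success counts (messages inboxed within 5 minutes);
    n1, n2 are the number of sends per provider.
    Returns (difference, z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0.0:
        # All successes or all failures in both arms: no evidence of a difference.
        return diff, 0.0, 1.0
    correction = 0.5 * (1 / n1 + 1 / n2)
    adjusted = max(abs(diff) - correction, 0.0)
    z = math.copysign(adjusted / se, diff)
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p_value


def prob_a_beats_b(x_a, n_a, x_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(p_a > p_b | data) under independent Beta priors.

    Each arm's posterior is Beta(prior_alpha + successes, prior_beta + failures).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(PRIOR_ALPHA + x_a, PRIOR_BETA + n_a - x_a)
        p_b = rng.betavariate(PRIOR_ALPHA + x_b, PRIOR_BETA + n_b - x_b)
        wins += p_a > p_b
    return wins / draws
```

As a usage example with made-up interim counts (250 sends per provider, 240 and 225 inboxed), `two_proportion_z_cc(240, 250, 225, 250)` gives the frequentist comparison and `prob_a_beats_b(240, 250, 225, 250)` gives the posterior probability that the first provider has the higher inbox rate. In a two-stage design, either quantity would be compared against a pre-registered interim threshold rather than a conventional single-look cutoff.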