Prove an Automated Package-Allocation System Outperforms Manual Baseline
Context
You work in a large last‑mile logistics network that is evaluating a new automated package‑allocation (dispatch) system against the current manual process. Daily volume is ~100,000 orders. The baseline on‑time delivery rate is 92%.
Task
Design a rigorous experiment and analysis plan that:
- Addresses interference and spillovers
  - Propose and justify a randomization unit (e.g., station‑hour switchback or courier‑level cluster randomization).
  - Explain operational controls to limit contamination.
- Defines metrics and decision criteria
  - Primary: on‑time delivery rate.
  - Guardrails: SLA breaches, courier overtime, customer contacts, fairness (Gini of work allocation).
  - Set superiority and non‑inferiority thresholds (e.g., uplift ≥ 0.6 pp for primary, non‑inferior guardrails).
- Pre‑specifies the analysis
  - Intention‑to‑treat (ITT) as primary; note any per‑protocol sensitivity.
  - Variance reduction (CUPED/covariate adjustment).
  - Heterogeneity by zip‑code density (urban vs suburban/rural).
  - Inference approach and handling of clustering.
- Includes power/MDE calculations
  - Use baseline on‑time = 92%, 100k orders/day.
  - State intra‑cluster correlation assumptions.
  - Provide duration estimates to detect a 0.6 pp uplift.
- Covers operations and governance
  - Ramp plan, monitoring, and rollback criteria.
  - Diff‑in‑diff fallback if perfect randomization isn’t feasible.
  - Anti‑gaming/contamination controls (e.g., inventory locking, shadow assignments).
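
As a starting point for the power/MDE item, the duration estimate could be sketched along the following lines. This is a minimal illustration, not a prescribed method: it assumes a two-sided two-proportion z-test at alpha = 0.05 with 80% power, a 50/50 split of daily volume between arms, and the standard design effect 1 + (m - 1) * ICC for equal-sized clusters. The cluster size of 50 orders and the ICC values are hypothetical placeholders that would need to be estimated from historical data, and the function names are invented for this example.

```python
import math
from statistics import NormalDist


def orders_per_arm(p_base, uplift, alpha=0.05, power=0.80):
    """Unclustered orders per arm for a two-sided two-proportion z-test."""
    p_new = p_base + uplift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances in the two arms.
    variance_sum = p_base * (1 - p_base) + p_new * (1 - p_new)
    return (z_alpha + z_beta) ** 2 * variance_sum / uplift ** 2


def design_effect(cluster_size, icc):
    """Variance inflation from randomizing clusters of equal size."""
    return 1 + (cluster_size - 1) * icc


def days_needed(p_base, uplift, orders_per_day, cluster_size, icc,
                alpha=0.05, power=0.80):
    """Whole days required, assuming daily volume is split evenly across arms."""
    n_arm = orders_per_arm(p_base, uplift, alpha, power)
    n_arm_clustered = n_arm * design_effect(cluster_size, icc)
    return math.ceil(2 * n_arm_clustered / orders_per_day)


# Inputs from the task: 92% baseline, 0.6 pp uplift, 100k orders/day.
# ASSUMPTIONS (hypothetical): 50 orders per station-hour cluster, ICC grid below.
for icc in (0.01, 0.02, 0.05, 0.10):
    days = days_needed(0.92, 0.006, 100_000, cluster_size=50, icc=icc)
    print(f"ICC={icc:.2f}  design effect={design_effect(50, icc):.2f}  days={days}")
```

The result is only a statistical lower bound under these assumptions. A real plan would likely extend the run to cover full weekly cycles and to allow for carryover between switchback periods, and the required duration grows quickly as the assumed ICC or cluster size increases.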