Experiment Design: Thermal Bags for Couriers to Reduce Cold-Food Refunds
Background
We want to evaluate whether providing couriers with thermal bags reduces customer refunds attributable to cold food. The platform operates across multiple cities with couriers who may deliver in multiple zones and for many stores. Some couriers may not consistently use the bag even if provided.
Assumptions (minimal):
- We can assign couriers to receive bags and instrument usage via in-app prompts/incentives.
- We can attribute refunds to "cold food" via reason codes and support notes.
- We can log delivery/order covariates (distance, cuisine, temperature, store ID, ETA, etc.).
- We can collect photo audits or telemetry indicating bag usage on an order.
Task
Design a robust experiment that covers:
(a) Randomization unit (courier vs. store vs. zone), with justification considering network effects and contamination (e.g., couriers serving multiple zones; stores serving both arms).
(b) Outcome definitions: primary metric (refund cost per order attributable to cold food) and guardrails (delivery ETA accuracy, contact rate, re-order rate, courier supply hours).
(c) Stratification/clustered randomization across cities and peak vs. off-peak, and how to handle seasonality/holidays (e.g., staggered rollouts and/or difference-in-differences on pre-period trends).
(d) Noncompliance and measurement: when bags are not used, how we measure usage (telemetry/audit photos), and an encouragement design or IV strategy for estimating LATE.
(e) Analysis plan: CUPED/covariate adjustment (distance, cuisine, temperature, store), heterogeneity by cuisine and distance deciles, variance estimation with cluster-robust SEs, sequential monitoring rules, and pre-registered success thresholds.
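As a point of reference for item (d), the LATE under one-sided or partial noncompliance can be sketched with the Wald (ratio) estimator: the intent-to-treat effect on refund cost divided by the first-stage effect of assignment on bag usage. The sketch below is illustrative only. It assumes courier-level randomization and data already aggregated to one row per courier; the field names (`assigned`, `usage_rate`, `refund_per_order`) and the example values are hypothetical, not taken from the platform. It weights couriers equally rather than by order volume and omits standard errors, which would need to be cluster-robust as noted in item (e).

```python
from statistics import mean


def wald_late(couriers):
    """Wald estimator of the LATE from courier-level aggregates.

    Each courier is a dict with hypothetical keys:
      assigned          1 if the courier was randomized to receive a bag, else 0
      usage_rate        share of that courier's orders with confirmed bag usage
      refund_per_order  cold-food refund cost per order for that courier
    """
    treated = [c for c in couriers if c["assigned"] == 1]
    control = [c for c in couriers if c["assigned"] == 0]
    if not treated or not control:
        raise ValueError("need at least one courier in each arm")

    # Intent-to-treat effect: difference in mean outcome by assignment.
    itt = (mean(c["refund_per_order"] for c in treated)
           - mean(c["refund_per_order"] for c in control))

    # First stage: how much assignment shifts actual bag usage.
    first_stage = (mean(c["usage_rate"] for c in treated)
                   - mean(c["usage_rate"] for c in control))
    if abs(first_stage) < 1e-9:
        raise ValueError("assignment did not change usage; LATE is undefined")

    # LATE = ITT scaled by the share of usage induced by assignment.
    return {"itt": itt, "first_stage": first_stage, "late": itt / first_stage}


# Made-up example: two couriers per arm, no bag access in control.
example = [
    {"assigned": 1, "usage_rate": 0.8, "refund_per_order": 0.10},
    {"assigned": 1, "usage_rate": 0.6, "refund_per_order": 0.14},
    {"assigned": 0, "usage_rate": 0.0, "refund_per_order": 0.20},
    {"assigned": 0, "usage_rate": 0.0, "refund_per_order": 0.20},
]
result = wald_late(example)
print(result)
```

The ratio is interpretable as the effect among couriers whose usage responds to assignment only if assignment affects refunds solely through bag usage (exclusion restriction) and no courier uses the bag less because of being assigned (monotonicity). A weak first stage inflates the estimate's variance, which is one reason to pair assignment with the in-app prompts/incentives listed in the assumptions.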