Cold-Food Complaints: End-to-End Analysis and Action Plan
You are the data scientist at a food-delivery marketplace that is seeing an increase in "cold food" complaints. Design an end-to-end analytical and operational plan. Do not propose any product/app UI changes; focus on measurement, diagnostics, experimentation, and ops.
Assume today is 2025-09-01.
Requirements
-
Primary Metric and Guardrails
-
Define exactly one primary metric and at least two guardrails.
-
Provide precise formulas: numerator, denominator, inclusion/exclusion rules (completed deliveries only; count only the first cold complaint within 24 hours per order; exclude re-deliveries and cancellations), time window, and aggregation grain.
-
Justify why your primary metric is superior to alternatives (e.g., refund rate, mean drop-off temperature, CSAT).
-
Slices and Covariates
-
Enumerate key slices and covariates to analyze (e.g., city, weather, cuisine, container type, distance, elevation, pickup wait, drop-off wait, courier equipment such as insulated bag, batching, time-of-day, weekend/holiday, restaurant prep SLAs).
-
Diagnostic Plan and Driver Attribution
-
Separate kitchen- vs courier- vs handoff-driven cooling using stage-duration breakdowns and appropriate models.
-
Describe controls for confounding (seasonality, surge, rain) and small-sample shrinkage.
-
RCT: Insulated Bags for Couriers
-
Propose an experiment to provide couriers with insulated bags.
-
State randomization unit, stratification, sample-size/power back-of-the-envelope with assumptions, duration, primary and secondary metrics, and spillover/contamination risks.
-
Quasi-Experiment if RCT Is Infeasible
-
Propose a quasi-experiment (e.g., difference-in-differences on staggered rollouts, or an instrumental variable approach).
-
State identifying assumptions and two falsification checks.
-
Operational Changes (No UI Changes)
-
Recommend 3–5 operational changes (e.g., batching limits, pickup queue prioritization, courier incentives for swift handoff, restaurant packaging standards).
-
Explain how you would measure impact at 2 weeks vs 8 weeks, including ROI and operational cost trade-offs.