Scenario
DoorDash seeks to quantify and forecast the “inflation gap”: the difference between on-platform menu prices and in-store prices for the same items at the same merchant location and time.
Assumptions to make the problem concrete:
- Compare base menu prices only (pre-tax, excluding delivery/service fees, tips, coupons, and optional add-ons).
- Match identical items by merchant-location, item name/size/variant.
- Produce weekly/monthly metrics, aggregated across stores/items with appropriate weights.
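To make these assumptions concrete, one possible reading of the metric is a per-item relative gap, (platform price - in-store price) / in-store price, rolled up with weights such as order volume. The sketch below is illustrative only: the `MatchedItem` record, its field names, and the choice of a simple weighted mean (rather than, say, a fixed-basket price index) are assumptions, not part of the prompt.

```python
from dataclasses import dataclass


@dataclass
class MatchedItem:
    # Hypothetical record: one matched item at one merchant location in one period.
    merchant_id: str
    item_key: str
    platform_price: float  # base menu price on the platform, pre-tax
    store_price: float     # in-store base price, pre-tax
    weight: float          # assumed weight, e.g. order volume


def item_gap(item: MatchedItem) -> float:
    """Relative gap for one item: (platform - store) / store."""
    if item.store_price <= 0:
        raise ValueError("store_price must be positive")
    return (item.platform_price - item.store_price) / item.store_price


def weighted_gap(items: list[MatchedItem]) -> float:
    """Weighted mean of per-item relative gaps."""
    total_weight = sum(i.weight for i in items)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(i.weight * item_gap(i) for i in items) / total_weight


items = [
    MatchedItem("m1", "burger", 11.50, 10.00, 300),  # 15% gap
    MatchedItem("m1", "fries", 4.40, 4.00, 100),     # 10% gap
]
print(round(weighted_gap(items), 4))
```

Weighting by order volume answers "what gap does the typical order see"; equal weights per merchant would instead answer "what gap does the typical merchant set", so the weighting choice should be stated alongside the number.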
Task
Outline an end-to-end analysis plan to:
- Measure the current on-platform vs in-store price gap.
- Forecast the gap over the next 3–6 months.
- Quantify uncertainty (confidence intervals) and derive sample-size/power requirements when no A/B test is feasible.
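For the confidence-interval part of the task, one common option is a cluster bootstrap that resamples whole merchants rather than individual items, since prices within a merchant are unlikely to be independent. The following is a minimal sketch under that assumption; the function name, the dictionary layout, and the percentile-interval choice are all illustrative, and it ignores dependence over time (a block bootstrap over weeks would be one way to address that).

```python
import random


def cluster_bootstrap_ci(gaps_by_merchant, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the mean gap, resampling merchants with replacement.

    gaps_by_merchant: dict mapping a merchant id to a non-empty list of
    per-item relative gaps (hypothetical input layout).
    """
    rng = random.Random(seed)
    merchants = list(gaps_by_merchant)
    estimates = []
    for _ in range(n_boot):
        # Draw as many merchants as in the original sample, with replacement,
        # and keep every item of each drawn merchant together.
        sample = [rng.choice(merchants) for _ in merchants]
        pooled = [g for m in sample for g in gaps_by_merchant[m]]
        estimates.append(sum(pooled) / len(pooled))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Made-up gaps for four merchants, purely for illustration.
gaps = {
    "m1": [0.15, 0.10, 0.12],
    "m2": [0.20, 0.18],
    "m3": [0.05, 0.08, 0.07],
    "m4": [0.12, 0.14],
}
lower, upper = cluster_bootstrap_ci(gaps)
print(f"95% CI for mean gap: [{lower:.3f}, {upper:.3f}]")
```

With only a handful of clusters the interval will be wide and somewhat unstable; in practice the number of sampled merchants, not the number of items, drives the precision.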
Include
- Data sources and collection strategy (platform menu data, in-store ground truth, metadata, exogenous signals).
- Robust item matching for identical products (text normalization, variants, units, validation).
- Metric definitions (per-item gap, aggregation, weighting, price index choices) and data cleaning.
- Time-series modeling choices for forecasting, with exogenous regressors.
- Methods for uncertainty (bootstrapped CIs, clustered/time-series considerations).
- Power analysis/MDE assumptions for observational or pre/post designs without randomized tests.
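As a starting point for the power/MDE item, a pre/post comparison of the mean gap can be sized with the usual two-sample normal approximation, n per period = 2 * deff * ((z_{1-alpha/2} + z_{power}) * sigma / MDE)^2, where deff = 1 + (m - 1) * ICC inflates the sample for items clustered within merchants. The sketch below assumes that formula; the function names and the example values for sigma, ICC, and items per merchant are hypothetical, and serial correlation across weeks would require a further variance adjustment.

```python
import math
from statistics import NormalDist


def design_effect(items_per_merchant: float, icc: float) -> float:
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1 + (items_per_merchant - 1) * icc


def _z_sum(alpha: float, power: float) -> float:
    """z for a two-sided test at level alpha plus z for the target power."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def required_n_per_period(sigma, mde, alpha=0.05, power=0.8, deff=1.0):
    """Matched items needed in each of the pre and post periods."""
    return math.ceil(2 * deff * (_z_sum(alpha, power) * sigma / mde) ** 2)


def mde_for_n(sigma, n_per_period, alpha=0.05, power=0.8, deff=1.0):
    """Smallest detectable change in the mean gap for a given sample size."""
    return _z_sum(alpha, power) * sigma * math.sqrt(2 * deff / n_per_period)


# Hypothetical inputs: gap std dev of 10 points, detect a 1-point change,
# 20 items sampled per merchant with an intra-merchant correlation of 0.05.
deff = design_effect(items_per_merchant=20, icc=0.05)
n = required_n_per_period(sigma=0.10, mde=0.01, deff=deff)
print(n, round(mde_for_n(0.10, n, deff=deff), 4))
```

Because no randomization is involved, this calculation speaks only to statistical precision; a detected pre/post change could still reflect seasonality or promotions rather than the event of interest.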
Deliverables
- A clear, step-by-step plan with key formulas, assumptions, and guardrails.
- Notes on pitfalls (promotions, seasonality, missing data, selection bias) and validation.