Design an end-to-end ETA (Estimated Time of Arrival) system for a maps / ride-hailing / delivery product.
Assume users request an ETA for a trip from an origin to a destination (possibly with waypoints). The system must return an ETA in real time.
Cover the following:
- Product definition & requirements
  - Who are the users (rider/driver/courier/customer)?
  - Latency/throughput targets and how frequently ETA should update.
  - What does “good ETA” mean (accuracy vs stability vs calibration)?
- Data and labeling
  - What raw data sources you would use (GPS pings, road graph, traffic, weather, historical trips, incidents, etc.).
  - How to define the training label (actual travel time) and handle censoring (canceled trips, detours, pauses).
- Modeling approach
  - Baselines and incremental modeling (rules → regression/GBDT → sequence models).
  - Feature design (time-of-day, road segments, traffic states, driver behavior, route choice).
  - How to represent a route (segment-level vs whole-trip).
- Evaluation
  - Offline metrics (e.g., MAE/MAPE, quantiles, calibration, tail errors).
  - Online metrics and guardrails (user trust, cancellation rate, conversion).
  - Slice analysis (rush hour, city centers, long trips, sparse areas).
- Serving & system design
  - Real-time feature computation, caching, and fallbacks.
  - Model updates, monitoring, drift detection, and alerting.
- Key pitfalls
  - Data leakage, feedback loops (ETA affects route choice), selection bias (only completed trips), and non-stationarity.
Provide a concrete proposal and justify tradeoffs.
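
For the offline metrics item, here is a minimal sketch of how MAE, MAPE, a quantile (pinball) loss, and a simple calibration check might be computed. It assumes travel times in seconds and a model that emits both a point ETA and an upper-quantile ETA. The function names and sample numbers are hypothetical and not part of the prompt.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the same unit as the inputs (seconds here)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction; assumes every actual > 0."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def pinball_loss(actual: Sequence[float], predicted_q: Sequence[float], q: float) -> float:
    """Average quantile loss for predictions that target quantile q (0 < q < 1)."""
    total = 0.0
    for a, p in zip(actual, predicted_q):
        diff = a - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actual)


def coverage(actual: Sequence[float], predicted_q: Sequence[float]) -> float:
    """Fraction of trips that finished at or before the predicted quantile.

    For a well-calibrated q-quantile ETA this should be close to q.
    """
    return sum(a <= p for a, p in zip(actual, predicted_q)) / len(actual)


# Illustrative (made-up) trips: actual durations vs. point and p80 predictions.
actual_s = [600, 900, 1200, 300]
point_eta_s = [660, 840, 1500, 300]
p80_eta_s = [700, 950, 1100, 310]

print("MAE (s):", mae(actual_s, point_eta_s))
print("MAPE:", mape(actual_s, point_eta_s))
print("Pinball loss @0.8:", pinball_loss(actual_s, p80_eta_s, 0.8))
print("Coverage of p80:", coverage(actual_s, p80_eta_s))
```

To support the slice analysis item, the same functions can be run per slice (rush hour, long trips, sparse areas). Tail errors can be tracked by taking a high percentile of the absolute errors instead of the mean.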