Real-Time Grid-ETA System Design
You are tasked with designing a real-time system that maintains the remaining ETA for every driver currently located within each cell of a city grid. You must use only the following data sources:
- GPS pings from driver devices
- Trip events (e.g., trip_start, trip_end, destination provided at trip start)
No external maps, POI data, or manual features are allowed. Assume:
- The city is pre-partitioned into a fixed grid (e.g., S2/H3 or a rectangular grid) with a known cell_id function latlon → cell.
- For an active trip, the destination (lat, lon) is known at trip_start.
- GPS pings arrive at ~1 Hz with fields including timestamp and optional accuracy.
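To make the assumptions above concrete, here is a minimal sketch of the two input records and a rectangular-grid cell_id function. The field names (driver_id, ts, accuracy_m, dest_lat, dest_lon), the CELL_DEG cell size, and the "row:col" key format are illustrative assumptions, not part of the problem statement; an S2 or H3 index could stand in for cell_id instead.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GpsPing:
    # Hypothetical schema: only timestamp and optional accuracy are given
    # by the problem statement; the other field names are assumed.
    driver_id: str
    ts: float                           # epoch seconds
    lat: float
    lon: float
    accuracy_m: Optional[float] = None  # reported horizontal accuracy, if any


@dataclass(frozen=True)
class TripEvent:
    # Hypothetical schema for trip_start / trip_end events.
    driver_id: str
    trip_id: str
    event_type: str                     # "trip_start" or "trip_end"
    ts: float
    dest_lat: Optional[float] = None    # present on trip_start
    dest_lon: Optional[float] = None


CELL_DEG = 0.005  # assumed cell size in degrees (roughly 500 m in latitude)


def cell_id(lat: float, lon: float, cell_deg: float = CELL_DEG) -> str:
    """Map (lat, lon) to a rectangular grid cell key of the form 'row:col'."""
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    return f"{row}:{col}"
```

A key of this shape can double as the partition or state-store key for per-cell aggregates, while driver_id would key the per-driver state.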
Design the system and describe, step-by-step, how you will:
- Ingest and partition GPS pings and trip events.
- Denoise GPS signals and correct drift to robustly localize drivers within grid cells.
- Compute and continuously update each driver's remaining ETA.
- Select and train a model suitable for online serving.
- Perform online feature computation strictly from the approved data.
- Meet low-latency, high-throughput SLAs.
- Monitor, alert, and rollback models.
- Evaluate offline and online.
Be specific about:
- Algorithms for GPS error reduction (e.g., filtering, map matching)
- Data schemas
- State stores and keys
- Windowing strategy and update frequency
- Failure handling
- Validation methodology
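
As one possible starting point for the denoising and ETA steps, the sketch below keeps per-driver state, rejects pings with poor reported accuracy or physically implausible jumps, smooths speed with an exponentially weighted moving average, and divides straight-line distance to the destination by that speed. It is a simple baseline under stated assumptions, not a full solution: the class name DriverState, the thresholds (50 m accuracy, 50 m/s speed cap, alpha of 0.2), and the 1 m/s speed floor are all hypothetical, and straight-line distance will tend to underestimate the true remaining travel distance. A Kalman filter and a learned ETA model would be natural replacements for the two halves.

```python
import math

EARTH_R_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_R_M * math.asin(math.sqrt(a))


class DriverState:
    """Per-driver state, keyed by driver_id in the state store (assumed)."""

    def __init__(self, max_accuracy_m=50.0, max_speed_mps=50.0, alpha=0.2):
        self.max_accuracy_m = max_accuracy_m  # drop pings less accurate than this
        self.max_speed_mps = max_speed_mps    # drop jumps faster than this
        self.alpha = alpha                    # EWMA weight for the newest speed
        self.last = None                      # (ts, lat, lon) of last accepted ping
        self.speed_mps = None                 # smoothed speed estimate

    def update(self, ts, lat, lon, accuracy_m=None):
        """Apply one ping; return True if accepted, False if rejected."""
        if accuracy_m is not None and accuracy_m > self.max_accuracy_m:
            return False
        if self.last is None:
            self.last = (ts, lat, lon)
            return True
        dt = ts - self.last[0]
        if dt <= 0:
            return False  # duplicate or out-of-order ping
        inst = haversine_m(self.last[1], self.last[2], lat, lon) / dt
        if inst > self.max_speed_mps:
            return False  # implausible jump, likely GPS drift
        if self.speed_mps is None:
            self.speed_mps = inst
        else:
            self.speed_mps = self.alpha * inst + (1 - self.alpha) * self.speed_mps
        self.last = (ts, lat, lon)
        return True

    def remaining_eta_s(self, dest_lat, dest_lon, min_speed_mps=1.0):
        """Baseline ETA in seconds: straight-line distance over smoothed speed."""
        if self.last is None or self.speed_mps is None:
            return None
        dist = haversine_m(self.last[1], self.last[2], dest_lat, dest_lon)
        return dist / max(self.speed_mps, min_speed_mps)
```

The speed floor keeps the estimate finite while a driver is stopped at a light; a production design would more likely fall back to a per-cell historical speed in that case.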