You work on evaluating Waymo’s driving simulation.
You have:
- Real-world (logged) driving data collected on-road.
- Simulated driving data generated by the simulator for similar scenarios.
The simulator will be used to evaluate autonomy performance (e.g., collision risk, comfort, rule compliance), so you must determine whether the simulation is realistic enough to be trusted for performance evaluation.
Task
Design a data-driven validation framework to answer:
- Is the simulated data distribution close to real-world data?
- Does realism hold across important scenario slices (e.g., intersections, merges, pedestrians, weather, rare/long-tail events)?
- What metrics and statistical tests would you use to quantify realism?
- How would you decide pass/fail thresholds and handle the fact that the real world contains rare but critical events?
- If the simulation is not realistic, how do you diagnose and prioritize fixes?
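One possible starting point for the rare-event question is to compare event rates with confidence intervals rather than point estimates, since a handful of near-misses in either dataset gives a very uncertain rate. The sketch below computes a Wilson score interval for a binomial proportion. The function names, the per-run event counts, and the overlap check are illustrative assumptions, not part of the problem statement, and overlapping intervals are only a coarse screen, not a formal equivalence test.

```python
import math


def wilson_interval(events, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    events: number of runs with the labeled event (e.g., near-miss).
    trials: total number of runs in the slice.
    z: normal quantile; 1.96 corresponds to roughly 95% coverage.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = events / trials
    denom = 1 + z ** 2 / trials
    center = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def intervals_overlap(a, b):
    """True if two closed intervals (lo, hi) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical counts, for illustration only.
real_ci = wilson_interval(events=3, trials=10_000)
sim_ci = wilson_interval(events=12, trials=10_000)
print(real_ci, sim_ci, intervals_overlap(real_ci, sim_ci))
```

The Wilson interval is used here because it stays well behaved when the observed count is zero or very small, which is the typical regime for collisions and near-misses.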
Assumptions (you may make reasonable ones)
- Both datasets include time series for ego + nearby agents (positions, velocities, headings), map context, and event labels (e.g., near-miss, collision) where available.
- Real and simulated runs can be matched by scenario type but are not necessarily one-to-one identical.
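Because runs are matched by scenario type rather than one-to-one, an unpaired two-sample comparison within each scenario slice is a natural fit. The following is a minimal sketch, assuming each run has already been reduced to one scalar metric (for example, minimum time-to-collision or peak deceleration) and grouped by slice name. It uses the two-sample Kolmogorov-Smirnov statistic with a permutation p-value. The function names and the dictionary layout are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # The supremum of the ECDF gap is attained at one of the sample points.
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def permutation_pvalue(a, b, n_perm=500, seed=0):
    """Share of label shuffles whose KS statistic is >= the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # The +1 correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def compare_by_slice(real, sim, n_perm=500):
    """real, sim: dict mapping slice name -> list of scalar metric values.

    Returns {slice: (ks_statistic, p_value)} for slices present in both.
    """
    results = {}
    for name in sorted(set(real) & set(sim)):
        results[name] = (
            ks_statistic(real[name], sim[name]),
            permutation_pvalue(real[name], sim[name], n_perm=n_perm),
        )
    return results
```

With many slices and metrics, the resulting p-values would need a multiple-comparison correction, and with large samples the KS statistic itself (an effect size) is likely more informative for thresholds than the p-value.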