You have built an autonomous-driving evaluation system using a large amount of labeled data from Beijing. Now the company wants to operate in Guangzhou. You do not want to rebuild the entire evaluation pipeline from scratch, and you can only collect a small amount of Guangzhou data.
How would you evaluate whether the autonomous-driving system is likely to perform well in Guangzhou under this limited-data setting? Discuss:
- how to assess whether the Beijing-based evaluation system transfers to Guangzhou,
- what kinds of distribution shift you would look for,
- how to combine large Beijing data with small Guangzhou data,
- how to quantify uncertainty and decide whether the evidence is sufficient.
Follow-up: if Guangzhou contains important scenarios that are rare or absent in Beijing—for example different road topology, scooter density, weather, driving behavior, or map quality—how should that change your data-collection strategy? Be explicit about what to sample and how to prioritize edge cases.