This question evaluates competency in domain adaptation, distribution-shift detection, sample-efficient evaluation design, uncertainty quantification, and risk-based go/no-go decision making within machine learning applied to autonomous driving.
You have abundant labeled autonomous-driving data from Beijing and have already built an evaluation system there. Now the company wants to assess performance in Guangzhou, but does not want to rebuild the evaluation framework from scratch. You are allowed to collect only a small amount of Guangzhou data.
How would you evaluate whether the autonomous-driving system is likely to perform well in Guangzhou?
Your answer should address:
Follow-up: suppose Guangzhou contains scenario types that differ materially from those in Beijing, for example in road topology, weather, traffic-agent mix, signage, or local driving behavior. How should that change your data-collection strategy?
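One possible way to make the small-sample and scenario-mix concerns concrete is sketched below. It is a minimal illustration, not a prescribed answer: it assumes each evaluated drive segment gets a binary pass/fail label, that segments are grouped into scenario strata, and that the Guangzhou scenario mix is known or estimated separately. The function names, stratum names, counts, and mix shares are all hypothetical. The sketch weights per-stratum pass rates by the target mix and attaches a Wilson score interval to each stratum, so strata with very few Guangzhou samples show visibly wide intervals.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials == 0:
        return (0.0, 1.0)  # no data: the interval is uninformative
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))

def stratified_pass_rate(results, mix):
    """Weight per-stratum pass rates by the target city's scenario mix.

    results: {stratum: (passes, trials)}, every stratum needs trials > 0
    mix:     {stratum: share}, shares are assumed to sum to 1
    """
    return sum(mix[s] * passes / trials for s, (passes, trials) in results.items())

# Hypothetical Guangzhou sample: (passes, trials) per scenario stratum.
guangzhou_results = {
    "urban_intersection": (45, 50),
    "heavy_rain": (12, 20),
    "dense_two_wheelers": (21, 30),
}
# Hypothetical share of each stratum in real Guangzhou driving.
guangzhou_mix = {
    "urban_intersection": 0.5,
    "heavy_rain": 0.2,
    "dense_two_wheelers": 0.3,
}

overall = stratified_pass_rate(guangzhou_results, guangzhou_mix)
intervals = {s: wilson_interval(p, n) for s, (p, n) in guangzhou_results.items()}

print(f"mix-weighted pass rate: {overall:.3f}")
for stratum, (lo, hi) in intervals.items():
    print(f"{stratum}: 95% interval [{lo:.3f}, {hi:.3f}]")
```

Under these assumptions, a stratum whose interval is wide or whose lower bound falls below the acceptance threshold is a natural candidate for additional targeted collection, which is one way to connect the follow-up on data-collection strategy to the go/no-go decision.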