This question evaluates competency in ML system design and production engineering: model deployment and versioning; safe rollouts and rollbacks; monitoring of service health, data quality and drift, model performance, and business metrics; and latency optimization for low-latency inference.

You have trained a fraud detection model and need to productionize it.
The model is deployed behind an online API, but it is not meeting a strict latency requirement: p99 latency < 50 ms.
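To make the requirement concrete, here is a minimal sketch of how the p99 check could be expressed over a window of request latencies. It assumes latencies are collected in milliseconds and uses the nearest-rank percentile definition; the names `percentile`, `meets_p99_budget`, and `P99_BUDGET_MS` are hypothetical, and a production service would more likely read this value from its metrics system than compute it by hand.

```python
P99_BUDGET_MS = 50.0  # latency budget taken from the problem statement


def percentile(samples, q):
    """Nearest-rank percentile for an integer q in 1..100.

    Returns the smallest sample such that at least q% of all samples
    are less than or equal to it.
    """
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    # Integer ceiling of q * n / 100, which avoids floating-point rounding.
    rank = (q * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_p99_budget(latencies_ms, budget_ms=P99_BUDGET_MS):
    """True when the p99 of the window is strictly under the budget."""
    return percentile(latencies_ms, 99) < budget_ms
```

With 100 samples, a single slow request still passes, while two slow requests push the p99 over the budget. That is why tail latency, rather than the average, is the figure to optimize here.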