Design a low-latency ML inference API | Anthropic Interview Question