Design a ChatGPT-like serving system
Company: Microsoft
Role: Software Engineer
Category: System Design
Interview Round: Technical Screen
Quick Answer: This question evaluates expertise in designing scalable machine-learning inference systems: chat-completion architecture, GPU capacity planning for large transformers, and stateful KV-cache design, including its memory layout, latency impact, and consistency.
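
The GPU capacity planning and KV-cache sizing mentioned above usually start from a back-of-the-envelope estimate. The sketch below is one way to do that arithmetic, assuming a standard decoder-only transformer that caches one key and one value vector per layer, per KV head, per token. The function names and the model numbers in the usage note (32 layers, 32 KV heads, head dimension 128, 16-bit values, an 80 GiB GPU, roughly 14 GiB of weights, 8 GiB reserved) are illustrative assumptions, not figures from the question.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV-cache bytes needed to hold one token of context.

    The leading factor of 2 covers the separate key and value tensors
    that each layer stores. bytes_per_value=2 assumes fp16/bf16 storage.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


def max_cached_tokens(gpu_mem_bytes, weight_bytes, per_token_bytes, reserve_bytes=0):
    """Estimate how many tokens of KV cache fit on one GPU.

    reserve_bytes is a hypothetical allowance for activations, CUDA
    context, and fragmentation; real systems need to measure this.
    """
    usable = gpu_mem_bytes - weight_bytes - reserve_bytes
    return max(0, usable // per_token_bytes)


GIB = 2**30

# Hypothetical 7B-class model configuration (assumed for illustration).
per_token = kv_cache_bytes_per_token(num_layers=32, num_kv_heads=32, head_dim=128)
token_budget = max_cached_tokens(
    gpu_mem_bytes=80 * GIB,
    weight_bytes=14 * GIB,
    per_token_bytes=per_token,
    reserve_bytes=8 * GIB,
)
print(per_token, token_budget)
```

Under these assumptions each token costs 0.5 MiB of cache, so the token budget, divided by the expected context length per conversation, gives a rough upper bound on concurrent sessions per GPU. Grouped-query attention (fewer KV heads) or a quantized cache would shrink the per-token cost proportionally.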