LLM Serving, Inference Scaling, KV Cache, and Latency-Cost Tradeoffs — Tech Interview Concept | PracHub