Low-Latency/Batch Inference and GPU Resource Management — Tech Interview Concept | PracHub