Explain KV cache in Transformer inference | OpenAI Interview Question