Diagnose Transformer training and inference bugs | OpenAI Interview Question