Diagnose Transformer training and inference bugs | OpenAI Coding Question