Implement a decoder-only GPT-style transformer | Amazon Interview Question