This question evaluates a candidate's understanding of integrating large language models into interactive agents, including prompt design, game-state summarization and serialization, constrained action parsing, retry and fallback strategies, context-window management, logging and experiment tracing, testing, and debugging of failure modes.
You are given a terminal-based text game and a several-hundred-line Python baseline that uses hand-written heuristic rules to choose actions. Replace the heuristic agent with an agent that queries an LLM to decide the next action.
Describe how you would implement this in Python so that the system is robust and debuggable. Your answer should cover: