{"blocks": [{"key": "4d260c8e", "text": "Scenario", "type": "header-two", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "81b892e8", "text": "Assessing the probability that a chatbot/LLM produces good responses.", "type": "unstyled", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "c657353d", "text": "Question", "type": "header-two", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "88913ee7", "text": "1. A bot produces a good response with probability x. Given that the first three responses were good, what is the probability that the fourth will be good? 2. One LLM produces good responses 70% of the time and another 80% of the time. Perform a hypothesis test to determine whether the difference is statistically significant, and interpret the result.", "type": "unordered-list-item", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "ace2a206", "text": "Hints", "type": "header-two", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}, {"key": "1a18c235", "text": "For Q1, assume the responses are independent. For Q2, use a two-proportion z-test and report the test statistic and p-value.", "type": "unstyled", "depth": 0, "inlineStyleRanges": [], "entityRanges": [], "data": {}}], "entityMap": {}}