This question evaluates a data scientist's competency in defining product success metrics, selecting guardrail metrics, and designing A/B experiments for a customer-service chatbot, covering metric definition, statistical power, randomization, and monitoring.

An e-commerce company has deployed a customer-service chatbot ("euro-chat") to handle B2C support inquiries in its web and in-app chat. The bot can answer questions and escalate to human agents when needed.
Define how you would measure success for euro-chat, what guardrail metrics you would track, and how you would design an experiment to test whether the chatbot improves customer experience.
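
Because the question covers statistical power, an answer will usually need a rough sample-size estimate for the experiment. The sketch below is one possible way to size a two-arm test on a binary success metric, using the standard normal-approximation formula for comparing two proportions. The metric (a resolution rate), the function name `sample_size_per_arm`, and the baseline and target rates in the usage lines are illustrative assumptions, not figures from the prompt.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    p_control and p_treatment are the assumed success rates (for example,
    the share of chats resolved) in the control and treatment arms.
    """
    if p_control == p_treatment:
        raise ValueError("The two rates must differ to define an effect size.")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power

    p_bar = (p_control + p_treatment) / 2          # pooled rate under the null
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = math.sqrt(
        p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    )

    numerator = (z_alpha * null_sd + z_beta * alt_sd) ** 2
    return math.ceil(numerator / (p_treatment - p_control) ** 2)


# Hypothetical inputs: 70% baseline resolution rate, 72% target.
n_per_arm = sample_size_per_arm(0.70, 0.72)
print(f"Users needed per arm: {n_per_arm}")
```

Halving the detectable lift roughly quadruples the required sample, so the minimum effect worth detecting should be agreed on before the test starts. If randomization is by user but the metric is computed per conversation, this estimate would need a correction for repeated conversations from the same user.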