Design a chatbot over structured and unstructured data
Company: Google
Role: Machine Learning Engineer
Category: ML System Design
Difficulty: Medium
Interview Round: Onsite
Design a chatbot that can answer user questions using both:
- **Structured data** (e.g., relational tables such as orders, products, pricing, user accounts), and
- **Unstructured data** (e.g., documents, knowledge base articles, FAQs, PDFs, internal wikis).
The chatbot should:
- Provide accurate, grounded answers
- Support follow-up questions (multi-turn)
- Cite sources when possible
- Respect permissions (users can only access authorized data)
Discuss:
1. High-level architecture and main components
2. Data ingestion and indexing strategy for both data types
3. Retrieval strategy (including when to query SQL vs retrieve documents)
4. LLM prompting / tool-calling approach
5. Safety, privacy, and access control
6. Evaluation (offline + online) and monitoring
7. Latency/cost considerations and scalability
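As a starting point for items 3 and 5, the sketch below shows one possible shape for the routing and permission logic. It is only an illustration under stated assumptions: the `orders` table, the `Doc` type, the role names, the cue-word list, and every function name are hypothetical, and the keyword router and term-overlap ranking stand in for the learned classifier or LLM tool-calling and vector search a real design would likely use.

```python
import re
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    allowed_roles: frozenset  # hypothetical ACL attached at ingestion time


# Hypothetical cue phrases hinting that the answer lives in a table.
# A naive substring check; a real router would be a classifier or an LLM tool call.
STRUCTURED_CUES = ("order", "price", "total", "how many")


def tokenize(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(question):
    q = question.lower()
    return "sql" if any(cue in q for cue in STRUCTURED_CUES) else "docs"


def query_orders(conn, user_id):
    # Parameterized query, scoped to the caller's own rows (row-level access control).
    return conn.execute(
        "SELECT order_id, total FROM orders WHERE user_id = ? ORDER BY order_id",
        (user_id,),
    ).fetchall()


def retrieve_docs(question, docs, role, k=2):
    q_terms = tokenize(question)
    # Apply the permission filter BEFORE ranking so unauthorized text never
    # reaches the prompt.
    visible = [d for d in docs if role in d.allowed_roles]
    scored = [(len(q_terms & tokenize(d.text)), d) for d in visible]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1].doc_id))
    return [d for _, d in scored[:k]]


def answer(question, conn, docs, user_id, role):
    # Returns the grounding context and citations an LLM prompt would be built from.
    if route(question) == "sql":
        rows = query_orders(conn, user_id)
        return {"route": "sql", "context": rows, "citations": ["table:orders"]}
    hits = retrieve_docs(question, docs, role)
    return {
        "route": "docs",
        "context": [d.text for d in hits],
        "citations": [d.doc_id for d in hits],
    }


def build_demo():
    # Tiny in-memory fixtures, invented for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, user_id TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "alice", 20.0), (2, "bob", 35.5), (3, "alice", 5.0)],
    )
    docs = [
        Doc("kb-1", "Refund policy: refunds are issued within 5 days",
            frozenset({"customer", "agent"})),
        Doc("wiki-9", "Internal refund escalation policy for agents",
            frozenset({"agent"})),
    ]
    return conn, docs
```

With these fixtures, `answer("How many orders did I place?", conn, docs, "alice", "customer")` takes the SQL route and sees only Alice's rows, while a refund-policy question takes the document route and cites only the articles the caller's role is allowed to read. Enforcing access in the retrieval layer, rather than asking the model to withhold content, is the design choice worth defending in the interview.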
Quick Answer: This question tests whether a machine learning engineer can design an end-to-end system that answers questions over both structured and unstructured data. It covers data architecture, information retrieval, LLM orchestration, access control, privacy, and monitoring.