System Design: Scenario-Based Speaking Practice for a Multilingual Learning App
Goal
Design a scenario-based speaking practice feature where users select real-life scenarios (e.g., ordering food, job interviews) and practice spoken dialogs. Address both product and engineering aspects for a production-scale system.
Requirements
- User flows: scenario discovery, selection, in-session experience, feedback, and review.
- Content modeling: scenarios, roles (user/agent), dialog turns, prompts, evaluation rubrics, and metadata.
- Session orchestration: turn-taking, real-time feedback, error recovery, and session state.
- Audio pipeline: capture, streaming, ASR/NLU, TTS/LLM, latency targets, barge-in, and offline fallback.
- Personalization: level selection, adaptive difficulty, and content recommendations.
- Progress tracking: learning metrics, mastery signals, and reporting.
- Content localization: multi-language support, regional variants, and voice selection.
- Architecture: services, storage choices, APIs, rate limiting, and cost-aware scaling to millions of sessions.
- Privacy and security: data handling, compliance, abuse prevention.
- Experimentation and rollout: adding new scenarios safely and measuring impact.
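
As a starting point for the content modeling requirement, the entities it names (scenarios, roles, dialog turns, prompts, evaluation rubrics, metadata) could be sketched as plain data classes. This is only one possible shape, not a prescribed schema: every class name, field name, and the validate helper below are hypothetical, and the locale and level fields assume BCP 47 tags and CEFR-style levels, neither of which the prompt specifies.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    """The two dialog roles named in the requirements."""
    USER = "user"
    AGENT = "agent"


@dataclass(frozen=True)
class RubricCriterion:
    name: str      # hypothetical criterion, e.g. "pronunciation"
    weight: float  # assumption: weights within one rubric sum to 1.0


@dataclass(frozen=True)
class DialogTurn:
    role: Role
    prompt: str    # agent line, or the hint shown to the learner


@dataclass
class Scenario:
    scenario_id: str
    title: str
    locale: str    # assumption: BCP 47 tag such as "es-MX"
    level: str     # assumption: CEFR-style level such as "A2"
    turns: list[DialogTurn]
    rubric: list[RubricCriterion]
    metadata: dict[str, str] = field(default_factory=dict)


def validate(scenario: Scenario) -> list[str]:
    """Return a list of human-readable problems; empty means publishable."""
    problems: list[str] = []
    if not scenario.turns:
        problems.append("scenario has no turns")
    elif Role.USER not in {t.role for t in scenario.turns}:
        problems.append("scenario has no user turn to practice")
    total = sum(c.weight for c in scenario.rubric)
    if abs(total - 1.0) > 1e-6:
        problems.append("rubric weights must sum to 1.0")
    return problems


# Usage: a two-turn "ordering food" scenario with a two-criterion rubric.
ordering_food = Scenario(
    scenario_id="ordering-food-001",
    title="Ordering food",
    locale="es-MX",
    level="A2",
    turns=[
        DialogTurn(Role.AGENT, "Buenas tardes, ¿qué le gustaría ordenar?"),
        DialogTurn(Role.USER, "Order a main dish and a drink."),
    ],
    rubric=[
        RubricCriterion("pronunciation", 0.5),
        RubricCriterion("task_completion", 0.5),
    ],
    metadata={"topic": "restaurant"},
)
```

Running a check like validate before a scenario is published is one way to support the "adding new scenarios safely" requirement: malformed content is rejected at authoring time rather than discovered in a live session.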