System Design: Ranking Candidate Text Responses to Maximize User Satisfaction
You are designing an end-to-end machine learning system that, given a user query (possibly with multi-turn conversational context), ranks multiple candidate text responses and selects the best one to maximize user satisfaction.
Specify and justify the following:
- Problem formulation and objective
  - Define the prediction task and training objective (a pairwise loss sketch follows this list).
  - Identify labels or proxies for user satisfaction.
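
One common formulation treats satisfaction as a learned scalar score trained on preference pairs. Below is a minimal sketch of a Bradley-Terry-style pairwise logistic loss; the function and variable names are illustrative, and the scores are assumed to come from whatever scoring model the system uses.

```python
import numpy as np

def pairwise_logistic_loss(score_preferred: np.ndarray,
                           score_rejected: np.ndarray) -> float:
    """Bradley-Terry pairwise loss: the model learns
    P(preferred beats rejected) = sigmoid(s_preferred - s_rejected)."""
    margin = score_preferred - score_rejected
    # -log sigmoid(margin), computed stably as logaddexp(0, -margin)
    return float(np.logaddexp(0.0, -margin).mean())

# Toy batch of three preference pairs; loss falls as preferred scores win.
print(pairwise_logistic_loss(np.array([2.0, 0.5, 1.0]),
                             np.array([1.0, 1.5, 0.0])))
```
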
- Data sources and labeling strategy
  - Implicit feedback (e.g., clicks, dwell, conversation continuation).
  - Explicit human ratings or preference labels.
  - How to handle bias and noise in logs (see the propensity-weighting sketch below).
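
Click logs over-represent responses shown in favorable positions. A standard correction is inverse propensity scoring; here is a minimal sketch, assuming position-based examination propensities have already been estimated separately (e.g., via result randomization).

```python
import numpy as np

def ips_weighted_ctr(clicks: np.ndarray, propensities: np.ndarray,
                     clip: float = 10.0) -> float:
    """Debiased click-through estimate via inverse propensity scoring.

    clicks: 0/1 click outcomes from the log.
    propensities: P(response examined | position shown), estimated offline.
    Weights are clipped to bound the variance small propensities cause.
    """
    weights = np.minimum(1.0 / propensities, clip)
    return float(np.mean(clicks * weights))

# Toy log: items shown lower on the page (small propensity) get upweighted.
print(ips_weighted_ctr(np.array([1, 0, 1, 0]),
                       np.array([0.9, 0.9, 0.3, 0.2])))
```
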
- Model choice
  - Pairwise vs. listwise ranking, reward modeling, and/or RL from feedback (a listwise sketch follows this list).
  - How to combine safety and helpfulness objectives.
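
To contrast with the pairwise loss above, a listwise objective scores the whole candidate slate at once. The sketch below shows a ListNet-style softmax cross-entropy, plus one illustrative way to fold safety into the ranking score; the additive λ penalty is an assumption for the example, not a fixed recipe.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def listwise_loss(scores: np.ndarray, relevance: np.ndarray) -> float:
    """ListNet-style loss: cross-entropy between the softmax over model
    scores and the softmax over graded relevance labels for one query."""
    p_model = softmax(scores)
    p_label = softmax(relevance)
    return float(-(p_label * np.log(p_model + 1e-12)).sum())

def combined_score(helpfulness: np.ndarray, unsafe_prob: np.ndarray,
                   lam: float = 5.0) -> np.ndarray:
    """Penalize candidates in proportion to their predicted unsafety."""
    return helpfulness - lam * unsafe_prob

scores = combined_score(np.array([1.2, 0.3, 2.0]), np.array([0.0, 0.1, 0.6]))
print(listwise_loss(scores, relevance=np.array([2.0, 1.0, 0.0])))
```
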
- Offline training pipeline and feature/embedding generation
  - Data processing, feature sets, and embedding strategies.
  - Negative sampling and hard-negative mining (sketched below).
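
Hard negatives — wrong answers the current model still scores highly — are typically mined from the model's own rankings rather than sampled at random. A minimal sketch, assuming dot-product scoring over embeddings (any scorer works):

```python
import numpy as np

def mine_hard_negatives(query_emb: np.ndarray, cand_embs: np.ndarray,
                        positive_idx: int, k: int = 2) -> list[int]:
    """Pick the k highest-scoring non-positive candidates as hard negatives.

    Random negatives are mostly trivially easy; negatives the model almost
    prefers carry far more gradient signal.
    """
    scores = cand_embs @ query_emb
    order = np.argsort(-scores)                    # best first
    return [int(i) for i in order if i != positive_idx][:k]

rng = np.random.default_rng(0)
q, cands = rng.normal(size=8), rng.normal(size=(6, 8))
print(mine_hard_negatives(q, cands, positive_idx=3))
```
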
- Evaluation metrics
  - Ranking metrics (e.g., NDCG, MRR, pairwise accuracy; sketched after this list).
  - Calibration and safety metrics.
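
For concreteness, here is a self-contained sketch of the two ranking metrics named above, using the standard gain and discount definitions:

```python
import numpy as np

def dcg(relevances: np.ndarray) -> float:
    """Discounted cumulative gain for graded relevances in ranked order."""
    discounts = 1.0 / np.log2(np.arange(2, len(relevances) + 2))
    return float(((2.0 ** relevances - 1) * discounts).sum())

def ndcg(relevances_in_ranked_order: np.ndarray) -> float:
    """DCG normalized by the ideal (perfectly sorted) ordering."""
    ideal = dcg(np.sort(relevances_in_ranked_order)[::-1])
    return dcg(relevances_in_ranked_order) / ideal if ideal > 0 else 0.0

def mrr(first_relevant_ranks: list[int]) -> float:
    """Mean reciprocal rank; inputs are 1-indexed positions of the first
    relevant response for each query."""
    return float(np.mean([1.0 / r for r in first_relevant_ranks]))

print(ndcg(np.array([3.0, 2.0, 0.0, 1.0])))  # one query's ranked labels
print(mrr([1, 3, 2]))                        # three queries
```
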
- Online inference architecture
  - Latency budgets, caching, and candidate generation.
  - Two-stage ranking (coarse-to-fine) and failover behavior, as sketched below.
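
Below is a sketch of coarse-to-fine serving with a latency failover: a cheap scorer prunes the slate, a heavier reranker refines the survivors, and the system falls back to the coarse order if the reranker misses its budget. Names, the budget, and the stand-in scorers are all illustrative.

```python
import time

def rank_two_stage(candidates, cheap_score, expensive_score,
                   shortlist_k=10, budget_s=0.05):
    """Stage 1 prunes with a cheap scorer; stage 2 reranks with a heavy
    one until the latency budget runs out, then fails over gracefully."""
    coarse = sorted(candidates, key=cheap_score, reverse=True)[:shortlist_k]

    deadline = time.monotonic() + budget_s
    fine = {}
    for c in coarse:
        if time.monotonic() > deadline:   # budget blown: stop reranking
            break
        fine[c] = expensive_score(c)

    if not fine:                          # total failover to stage-1 order
        return coarse
    # Reranked candidates first (by fine score); any unscored stragglers
    # keep their coarse order behind them, since sorted() is stable.
    return sorted(coarse, key=lambda c: fine.get(c, float("-inf")),
                  reverse=True)

# Toy usage with string candidates and stand-in scorers.
print(rank_two_stage(["aaa", "b", "cc", "dddd"],
                     cheap_score=len,
                     expensive_score=lambda c: -ord(c[0])))
```
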
- Experimentation plan
  - A/B testing, interleaving, and counterfactual evaluation (an off-policy estimate is sketched below).
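
Counterfactual evaluation lets a new ranker be scored on logs collected under the old one, before any live traffic. A minimal inverse-propensity sketch, assuming the logging policy recorded the probability of each response it actually showed (all names illustrative):

```python
import numpy as np

def ips_policy_value(rewards, logged_probs, new_probs, clip=10.0):
    """Estimate the new policy's expected reward from logged interactions.

    rewards: observed satisfaction signal for the response actually shown.
    logged_probs: probability the logging policy assigned to that response.
    new_probs: probability the candidate policy would assign to it.
    """
    w = np.minimum(np.asarray(new_probs) / np.asarray(logged_probs), clip)
    return float(np.mean(w * np.asarray(rewards)))

# If the new policy concentrates mass on responses that earned reward,
# its estimated value exceeds the logged average (here, 0.5).
print(ips_policy_value(rewards=[1, 0, 1, 0],
                       logged_probs=[0.25, 0.25, 0.25, 0.25],
                       new_probs=[0.6, 0.1, 0.5, 0.1]))
```
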
- Safety and alignment measures
  - Toxicity filters, guardrails, and policy enforcement (a gating sketch follows).
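
Guardrails usually sit as a hard gate in front of the ranker rather than inside the loss. A minimal sketch, where `toxicity_score` is a hypothetical stand-in for whatever classifier the stack uses and the threshold is a policy choice:

```python
def enforce_safety(ranked_candidates, toxicity_score,
                   threshold=0.2, fallback="I can't help with that."):
    """Drop candidates above a toxicity threshold; if nothing survives,
    return a canned safe fallback instead of the least-bad candidate."""
    safe = [c for c in ranked_candidates if toxicity_score(c) < threshold]
    return safe[0] if safe else fallback

print(enforce_safety(["sure, here is how...", "go away"],
                     toxicity_score=lambda c: 0.9 if "go away" in c else 0.05))
```
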
- Bias and privacy controls
  - Fairness metrics, data minimization, and privacy-preserving training (a parity-gap sketch follows).
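
One concrete fairness check is whether chosen responses satisfy all user cohorts at comparable rates. A sketch of a simple satisfaction parity gap; cohort definitions and the alerting threshold are policy decisions assumed to exist elsewhere:

```python
from collections import defaultdict

def satisfaction_parity_gap(records):
    """records: iterable of (group_id, satisfied: bool) pairs.

    Returns the spread between the best- and worst-served groups'
    satisfaction rates; alert when the gap exceeds a policy threshold.
    """
    totals, hits = defaultdict(int), defaultdict(int)
    for group, satisfied in records:
        totals[group] += 1
        hits[group] += int(satisfied)
    rates = {g: hits[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates

gap, rates = satisfaction_parity_gap(
    [("a", True), ("a", True), ("b", True), ("b", False)])
print(gap, rates)
```
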
- Monitoring and alerting
  - Quality, reliability, and drift detection (see the PSI sketch below).
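
Feature or score drift can be watched with the population stability index between a reference window and current traffic. A minimal sketch:

```python
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population stability index between two score distributions.

    Common rule of thumb: < 0.1 stable, 0.1-0.25 watch, > 0.25 alert.
    """
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf          # cover the full range
    ref_frac = np.histogram(reference, edges)[0] / len(reference)
    cur_frac = np.histogram(current, edges)[0] / len(current)
    ref_frac = np.clip(ref_frac, 1e-6, None)       # avoid log(0)
    cur_frac = np.clip(cur_frac, 1e-6, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

rng = np.random.default_rng(1)
print(psi(rng.normal(0, 1, 5000), rng.normal(0.3, 1.2, 5000)))
```
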
- Retraining cadence
  - Data refresh, active learning, and governance (an uncertainty-sampling sketch follows).
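
For active learning between retrains, the simplest selector routes the model's least-confident comparisons to human raters. A sketch, assuming a pairwise model whose predicted win probability is available:

```python
import numpy as np

def select_for_labeling(pair_ids, win_probs, budget=100):
    """Uncertainty sampling: send the preference pairs whose predicted
    win probability is closest to 0.5 (model most unsure) to raters."""
    uncertainty = -np.abs(np.asarray(win_probs) - 0.5)  # higher = less sure
    order = np.argsort(uncertainty)[::-1]
    return [pair_ids[i] for i in order[:budget]]

print(select_for_labeling(["p1", "p2", "p3"], [0.93, 0.51, 0.18], budget=1))
```
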
- Cost and reliability trade-offs
  - Model size, serving hardware, and graceful degradation (a tiering sketch follows).
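
Cost and reliability trade-offs often reduce to a load-shedding policy: serve the large model while headroom allows, step down a tier otherwise. A sketch of such a rule; the tier names and thresholds are illustrative assumptions:

```python
def choose_serving_tier(gpu_utilization: float, queue_depth: int) -> str:
    """Pick a model tier so the system degrades gracefully under load
    instead of timing out: quality drops one notch, availability holds."""
    if gpu_utilization < 0.7 and queue_depth < 100:
        return "full_reranker"        # large cross-encoder, best quality
    if gpu_utilization < 0.9:
        return "distilled_reranker"   # smaller student model
    return "coarse_only"              # skip stage 2, serve stage-1 order

print(choose_serving_tier(0.95, 250))
```
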
Provide a concise, high-level architecture description in words that ties the components together.