Scenario
You’re building an ML platform component that serves a model to predict the likelihood that a user will comment on a given post.
The interviewer says you can treat the model as a black box (you don’t need to pick a specific architecture); focus on ML infrastructure: feature pipelines, feature store, training/serving, inference, and production concerns.
Goals
Design an end-to-end system that supports:
- Offline training data generation and model training
- Online inference (real-time scoring) for product surfaces (e.g., feed ranking, notification candidate scoring)
- A feature store strategy (offline + online)
- Monitoring, logging, and iteration
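One way to anchor the "offline + online" feature store goal in the interview is to show that both paths share a single feature definition, so training and serving compute the same values. The sketch below is a hypothetical minimal interface (the in-memory `online_kv` dict stands in for a real key-value store such as Redis or DynamoDB; the feature name and formula are illustrative):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical sketch: one feature *definition* registered once and used by
# both the offline (batch/training) path and the online (streaming/serving)
# path — a common way to avoid training/serving skew.
@dataclass
class FeatureStore:
    definitions: Dict[str, Callable[[dict], float]] = field(default_factory=dict)
    online_kv: Dict[Tuple, Dict[str, float]] = field(default_factory=dict)  # stand-in for Redis/DynamoDB

    def register(self, name: str, fn: Callable[[dict], float]) -> None:
        self.definitions[name] = fn

    def materialize_offline(self, rows: List[dict]) -> List[dict]:
        # Batch path: compute features over historical rows to build training data.
        return [{name: fn(row) for name, fn in self.definitions.items()} for row in rows]

    def write_online(self, key: Tuple, row: dict) -> None:
        # Streaming path: the *same* definitions, pushed to the online KV store.
        self.online_kv[key] = {name: fn(row) for name, fn in self.definitions.items()}

    def read_online(self, key: Tuple) -> Dict[str, float]:
        return self.online_kv.get(key, {})

store = FeatureStore()
# Illustrative feature: a user's 7-day comment rate (comments per impression).
store.register("user_comment_rate_7d",
               lambda r: r["comments_7d"] / max(r["impressions_7d"], 1))
row = {"comments_7d": 3, "impressions_7d": 30}
store.write_online(("user", 42), row)
print(store.read_online(("user", 42)))  # identical value to what training would see
```

Because `materialize_offline` and `write_online` run the same registered function, the serving-time value matches the training-time value by construction rather than by convention.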
Requirements & constraints (you may make reasonable assumptions)
- High-QPS online scoring (potentially tens of thousands of requests per second)
- P95 latency budget for scoring: e.g., 50–150 ms end-to-end (state your assumption)
- Freshness: some features need near-real-time updates (seconds to minutes)
- Avoid training/serving feature skew
- Handle cold start (new users/posts)
- Support A/B testing and safe rollout
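The latency, caching, and cold-start constraints above often come together in the request path: check a short-TTL score cache first, and when a new user or post has no features yet, serve a cheap prior instead of failing. This is a hypothetical sketch (the cache, TTL, prior value, and `model_score` stand-in are all assumptions, not a prescribed design):

```python
import time
from typing import Optional

GLOBAL_COMMENT_RATE = 0.02  # assumed prior: overall comment rate, used for cold start

_cache: dict = {}   # (user_id, post_id) -> (score, expiry); stand-in for a real cache tier
CACHE_TTL_S = 60.0  # short TTL keeps scores reasonably fresh

def model_score(features: dict) -> float:
    # Stand-in for the black-box model call.
    return min(1.0, features["user_comment_rate_7d"] * 2)

def score(user_id: int, post_id: int, features: Optional[dict]) -> float:
    key = (user_id, post_id)
    hit = _cache.get(key)
    if hit and hit[1] > time.monotonic():
        return hit[0]               # cache hit: skip the model entirely
    if not features:
        return GLOBAL_COMMENT_RATE  # cold start: fall back to the prior
    s = model_score(features)
    _cache[key] = (s, time.monotonic() + CACHE_TTL_S)
    return s

print(score(1, 7, None))                           # cold-start fallback
print(score(1, 8, {"user_comment_rate_7d": 0.1}))  # fresh score, now cached
```

The same fallback branch doubles as a reliability mechanism: if the online feature store times out, the request degrades to the prior rather than blowing the P95 budget.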
Deliverables
Explain:
- Data sources and event logging
- Feature engineering and feature store design
- Training pipeline and dataset versioning
- Online inference architecture (batch vs. real-time, caching, fallbacks)
- Monitoring (data + model), retraining triggers, and reliability considerations
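For the monitoring and retraining-trigger deliverable, one concrete mechanism worth sketching is distribution drift on a feature between the training baseline and live serving traffic. Below is a minimal Population Stability Index (PSI) check; the bin proportions are illustrative, and the ~0.2 threshold is a commonly cited rule of thumb, not a universal constant:

```python
import math

def psi(expected, actual):
    # Population Stability Index between two binned distributions.
    # `expected` is the feature's bin proportions at training time,
    # `actual` the proportions observed in live traffic; each sums to 1.
    eps = 1e-6  # guard against empty bins
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

baseline = [0.25, 0.25, 0.25, 0.25]  # distribution when the model was trained
live_ok  = [0.24, 0.26, 0.25, 0.25]  # mild shift: keep serving
live_bad = [0.05, 0.10, 0.25, 0.60]  # heavy shift: candidate retraining trigger

print(psi(baseline, live_ok) < 0.2)   # True: within tolerance
print(psi(baseline, live_bad) > 0.2)  # True: fire the retraining trigger
```

In production this check would run per feature on a schedule (e.g., hourly), with alerts and an automated retraining kick-off when the threshold is breached, alongside model-level metrics such as calibration and online AUC.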