Design a feature store with CI/CD and reliability
Company: Reddit
Role: Software Engineer
Category: ML System Design
Difficulty: hard
Interview Round: Technical Screen
Design a feature store to serve ML models for both offline training and low-latency online inference. Specify the requirements, then cover feature ingestion (batch and streaming), computation and storage layers, feature serving APIs, point-in-time correctness, offline–online consistency, freshness and backfills, lineage/metadata, and governance and access control. Detail the deployment architecture and environments, CI/CD for feature definitions and pipelines, testing strategies (unit/integration/e2e, data validation), caching layers and eviction policies, reliability/SLOs and fault tolerance, monitoring/alerting and rollback, and cost/rate-limiting considerations. Provide component diagrams and justify technology choices.
Quick Answer: This question evaluates a candidate's ability to design a scalable, low-latency feature store and the end-to-end ML infrastructure around it, covering feature ingestion, point-in-time correctness, serving APIs, CI/CD for feature pipelines, metadata/lineage, and reliability engineering.
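
Of the topics listed in the prompt, point-in-time correctness is the one most easily shown in code. The sketch below is one possible illustration, not a reference design: `FeatureLog`, `as_of`, and `build_training_rows` are hypothetical names, timestamps are assumed to be integer epoch seconds, and an in-memory dict stands in for a real offline store. The idea is that each training row only sees the latest feature value whose event time is at or before the label's timestamp, so no future data leaks into training.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FeatureValue:
    event_ts: int  # when the value became true (epoch seconds, assumed)
    value: float


class FeatureLog:
    """Per-entity history of one feature (hypothetical in-memory stand-in)."""

    def __init__(self) -> None:
        self._history: Dict[str, List[FeatureValue]] = {}

    def write(self, entity_id: str, event_ts: int, value: float) -> None:
        rows = self._history.setdefault(entity_id, [])
        rows.append(FeatureValue(event_ts, value))
        # Keep each entity's history ordered by event time so late-arriving
        # writes still land in the right position.
        rows.sort(key=lambda r: r.event_ts)

    def as_of(self, entity_id: str, ts: int) -> Optional[float]:
        """Return the latest value with event_ts <= ts, or None if none exists."""
        rows = self._history.get(entity_id, [])
        idx = bisect_right([r.event_ts for r in rows], ts)
        return rows[idx - 1].value if idx > 0 else None


def build_training_rows(
    log: FeatureLog, labels: List[Tuple[str, int, int]]
) -> List[Tuple[str, int, Optional[float], int]]:
    """Join (entity_id, label_ts, label) rows to the feature value as of label_ts."""
    return [(entity, ts, log.as_of(entity, ts), y) for entity, ts, y in labels]


log = FeatureLog()
log.write("user_1", 100, 0.2)
log.write("user_1", 200, 0.9)

labels = [("user_1", 150, 1), ("user_1", 250, 0), ("user_2", 150, 1)]
rows = build_training_rows(log, labels)
```

Here the label at time 150 is joined to the value written at 100 rather than the later value written at 200, and an entity with no history yields `None`. In a real system the same as-of join would typically run in the offline store's query engine, and the online store would serve only the latest value per entity.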