System Design: Feature Store for Offline Training and Low‑Latency Online Inference
Context
You are designing a feature store to support machine learning models that require:
- Offline training on historical data.
- Low-latency online inference at high QPS.
Assume an internet-scale social platform with entities such as user, item/content, and community. Models include recommendations, feed ranking, and abuse/spam detection.
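To make the entity model concrete, one way to address keys in such a store is a composite of entity type, entity ID, and feature name. The names below (`FeatureKey`, `storage_key`, the `ctr_7d` feature) are illustrative assumptions, not part of the stated requirements — a minimal sketch:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureKey:
    """Addresses one feature value in the store.

    entity_type: one of the platform's entities, e.g. "user",
    "item", or "community"; entity_id and feature_name complete
    the composite key.
    """
    entity_type: str
    entity_id: str
    feature_name: str

    def storage_key(self) -> str:
        # Flattened composite key, e.g. for a key-value online store.
        return f"{self.entity_type}:{self.entity_id}:{self.feature_name}"

# Example: a hypothetical 7-day click-through-rate feature for user 42.
key = FeatureKey("user", "42", "ctr_7d")
```

A frozen dataclass keeps keys hashable, so they can also serve directly as dictionary or cache keys.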
Requirements
Specify and design the following:
- Functional and non-functional requirements (SLOs, scale assumptions).
- Feature ingestion for batch and streaming data.
- Computation and storage layers (offline/online) with point-in-time correctness.
- Feature serving APIs for training data generation and online inference.
- Offline–online consistency strategy.
- Freshness SLAs, backfills, and bootstrap flows.
- Lineage/metadata, governance, and access control.
- Deployment architecture and environments.
- CI/CD for feature definitions and pipelines.
- Testing strategies (unit, integration, end-to-end, data validation).
- Caching layers and eviction policies.
- Reliability/SLOs and fault tolerance.
- Monitoring/alerting and rollback processes.
- Cost and rate-limiting considerations.
- Component diagrams and justified technology choices.
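The point-in-time correctness requirement above is the subtle one: when generating training data, each label must be joined only with feature values that were already observed at the label's timestamp, or the model trains on leaked future information. A minimal sketch of such an as-of join (function and data shapes are assumptions for illustration, not a prescribed API):

```python
from bisect import bisect_right
from collections import defaultdict

def point_in_time_join(feature_log, label_events):
    """Attach to each label event the most recent feature value
    observed at or before the label timestamp (no future leakage).

    feature_log:  iterable of (entity_id, ts, value) observations.
    label_events: iterable of (entity_id, ts) label occurrences.
    Returns a list of (entity_id, ts, value_or_None) training rows.
    """
    # Build a per-entity feature history sorted by timestamp.
    history = defaultdict(list)
    for entity_id, ts, value in sorted(feature_log, key=lambda r: r[1]):
        history[entity_id].append((ts, value))

    rows = []
    for entity_id, label_ts in label_events:
        series = history.get(entity_id, [])
        # Index of the last snapshot with ts <= label_ts.
        idx = bisect_right([ts for ts, _ in series], label_ts) - 1
        value = series[idx][1] if idx >= 0 else None
        rows.append((entity_id, label_ts, value))
    return rows
```

For example, a label at t=15 sees the feature snapshot from t=10, not a later one from t=20; an entity with no prior observations gets `None`, which the training pipeline must then impute or drop. At production scale this join is typically done in a batch engine over partitioned logs, but the semantics are the same.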