System Design: Secure, Ethical, Multi‑Tenant ML Data and Inference Platform
Context
Design a cloud-based ML platform used by multiple internal product teams. The platform must cover data ingestion, storage, training, and online/offline inference, while meeting strict security, privacy, and ethical standards. Assume:
- Multiple tenants (teams) share infrastructure but require strong isolation.
- A mix of structured and unstructured data, including PII and sensitive content.
- Both batch (training/offline scoring) and real-time (online inference) workloads.
Requirements
- Multi-tenant isolation, data classification, and PII handling
  - Isolation across compute, storage, and network.
  - Data classification taxonomy and enforcement.
  - PII handling: tokenization/de-identification, data minimization, and retention/deletion.
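One way to make the tokenization requirement concrete is deterministic, keyed tokenization at the ingestion boundary: a minimal sketch using HMAC-SHA256, assuming a hypothetical per-tenant key fetched from a secret manager (the key name below is illustrative, not part of the design).

```python
import hmac
import hashlib

def tokenize_pii(value: str, key: bytes) -> str:
    """Deterministically tokenize a PII value with keyed HMAC-SHA256.

    The same (key, value) pair always maps to the same token, so joins
    across datasets still work, while the raw value never needs to leave
    the ingestion boundary. Without the key, tokens are not reversible,
    and rotating the key invalidates all previously issued tokens.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical per-tenant key; in practice this comes from a KMS/vault.
tenant_key = b"per-tenant-secret-from-kms"
token = tokenize_pii("alice@example.com", tenant_key)
```

Deterministic tokens preserve join keys for training pipelines; fully random tokens with a lookup vault are the alternative when re-identification must be possible under controlled access.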
- Secrets, network, and access controls
  - Secret management with automated key rotation.
  - Network segmentation and egress controls.
  - Least-privilege access (RBAC/ABAC) with short-lived credentials.
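The short-lived-credential requirement can be sketched as a signed, expiring capability token, in the spirit of a JWT. This is an illustrative toy (the signing key and 15-minute TTL are assumptions); a production design would use an established token service rather than hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key; rotated automatically by the secret manager.
SIGNING_KEY = b"rotated-by-secret-manager"

def issue_credential(subject: str, role: str, ttl_s: int = 900) -> str:
    """Issue a short-lived token binding a subject to a role."""
    claims = {"sub": subject, "role": role, "exp": int(time.time()) + ttl_s}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    return body + "." + sig

def verify_credential(token: str):
    """Return the claims if the signature is valid and unexpired, else None."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["exp"] < time.time():
        return None  # expired: forces re-authorization, limiting blast radius
    return claims
```

Short expiries mean a leaked credential is only useful briefly, which pairs with key rotation: even long-lived leaks die when the signing key rolls.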
- Model governance
  - Approval gates in CI/CD, model registry, and lineage.
  - Red‑teaming and bias/safety/abuse audits.
  - Rollback and kill‑switch plans.
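The approval-gate idea reduces to a simple invariant: a model version cannot reach production until every required gate has signed off. A minimal sketch, assuming hypothetical gate names (`bias_audit`, `red_team`, `security_review` are illustrative):

```python
from dataclasses import dataclass, field

# Hypothetical gate set; in practice this is policy-driven per tenant/model tier.
REQUIRED_GATES = {"bias_audit", "red_team", "security_review"}

@dataclass
class ModelVersion:
    name: str
    version: int
    passed_gates: set = field(default_factory=set)
    stage: str = "staging"

def promote(model: ModelVersion) -> bool:
    """Promote to production only if every required approval gate passed."""
    if REQUIRED_GATES <= model.passed_gates:  # subset check: all gates cleared
        model.stage = "production"
        return True
    return False

def kill_switch(model: ModelVersion) -> None:
    """Immediately pull a model from serving, independent of any gate state."""
    model.stage = "disabled"
```

Keeping the gate check in the registry (rather than scattered across pipelines) gives one auditable enforcement point, and the kill switch deliberately bypasses the gate logic so incident response is never blocked on process.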
- Compliance and audit logging
  - High-level alignment with SOC 2 and GDPR/CCPA.
  - Tamper‑evident audit logging and retention.
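"Tamper-evident" is usually achieved by hash-chaining log entries so any after-the-fact edit breaks every subsequent hash. A self-contained sketch (in-memory list standing in for append-only storage):

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor

def append_entry(log: list, event: dict) -> None:
    """Append an event, chaining its hash to the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    payload = json.dumps(event, sort_keys=True)  # canonical serialization
    entry_hash = hashlib.sha256((prev + payload).encode()).hexdigest()
    log.append({"event": event, "prev": prev, "hash": entry_hash})

def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

Periodically anchoring the latest hash in external write-once storage (or a signing service) extends this from tamper-evident to tamper-resistant, since an attacker would also need to alter the anchor.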
- Reliability, cost, and monitoring
  - SLOs for training and serving.
  - Cost controls and quotas.
  - Monitoring for data/model drift and misuse.
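For the drift-monitoring requirement, a common baseline metric is the Population Stability Index (PSI), comparing live feature distributions against the training snapshot. A minimal sketch (the 0.1/0.25 thresholds are the usual rule of thumb, not a platform mandate):

```python
import math

def psi(expected: list, actual: list, bins: int = 10) -> float:
    """Population Stability Index between a baseline sample and live traffic.

    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 significant.
    Bins are derived from the baseline's observed range.
    """
    lo, hi = min(expected), max(expected)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features

    def bin_fractions(sample: list) -> list:
        counts = [0] * bins
        for x in sample:
            # clamp so out-of-range live values land in the edge bins
            i = min(max(int((x - lo) / span * bins), 0), bins - 1)
            counts[i] += 1
        # floor at a small epsilon so empty bins don't blow up the log term
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Running this per feature on a schedule, and alerting per tenant when the index crosses a threshold, gives a cheap first line of defense before heavier model-quality checks.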
- Architecture and trade-offs
  - Provide an end-to-end architecture diagram.
  - Discuss the key trade-offs of the design.