PracHub

Design batch and streaming ETL architecture

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data engineer's ability to design scalable batch and streaming ETL architectures, including competencies in ingestion patterns, messaging and storage layers, schema modeling, partitioning and clustering, deduplication and late-event handling, aggregation strategies, and data quality and SLA enforcement.

  • hard
  • Meta
  • System Design
  • Data Engineer


Company: Meta

Role: Data Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

Design an end-to-end data platform that supports both daily batch processing and near-real-time streaming for product analytics. Specify the ingestion sources and formats; schema design for raw, staging, and modeled layers; table partitioning and clustering; strategies for idempotency, deduplication, and handling late/out-of-order events; update patterns (append-only vs upsert/merge), slowly changing dimensions (SCD1/SCD2), and backfills; orchestration, dependency management, and failure recovery; aggregation strategies for daily/hourly/rolling-window metrics; data quality checks and SLAs; and trade-offs between latency, cost, and complexity.


Meta · Jul 15, 2025

System Design: End-to-End Data Platform for Product Analytics (Batch + Near-Real-Time)

Context

Design a scalable data platform for a large consumer product with web and mobile clients. The platform must power daily product analytics (e.g., DAU/MAU, retention, funnels, cohorts, experiments) and near-real-time dashboards (<5 minutes end-to-end) while supporting backfills and rigorous data quality.

Assume tens to hundreds of millions of daily events and multiple upstream systems (client telemetry, backend logs, relational OLTP for user/account, and third-party data). You may reference common technologies (e.g., Kafka, Flink/Spark, object store + lakehouse table format, a cloud data warehouse), but focus on design choices and trade-offs.

Requirements

  1. Ingestion sources and formats
  • Identify sources (client events, backend logs, CDC from OLTP, third-party feeds) and wire formats (JSON/Protobuf/Avro on the wire; Parquet/Delta/Hudi/Iceberg in storage).
  2. Storage and compute architecture
  • Describe the messaging/streaming layer, raw landing, staging, and modeled layers, and the batch/streaming compute engines.
  3. Schema design by layer
  • Define schemas for raw ("bronze"), deduped/cleaned ("silver"), and modeled analytics ("gold"). Include a canonical event envelope, dimensions (users/devices/products/experiments), and fact tables (events, sessions, conversions).
  4. Table partitioning and clustering
  • Propose partitioning and clustering/sorting for each major table to optimize scan cost and latency.
  5. Idempotency, deduplication, and late/out-of-order events
  • Specify unique keys, event-time vs ingestion-time, watermarking, allowed lateness, and how to reconcile late data into aggregates.
  6. Update patterns and history
  • State which layers are append-only vs upsert/merge. Explain SCD1 vs SCD2 for dimensions, identity resolution (anonymous → logged-in), and how you will run backfills safely.
  7. Orchestration, dependencies, and failure recovery
  • Describe scheduling, dependency management, retries, checkpointing, and exactly-once/at-least-once guarantees.
  8. Aggregations for daily/hourly/rolling metrics
  • Define how to compute daily/hourly windows and rolling windows (e.g., 7/28-day active, retention, funnel steps), both in streaming and batch.
  9. Data quality and SLAs
  • Outline schema enforcement, validation tests, anomaly detection, freshness/completeness SLAs, and alerting.
  10. Trade-offs
  • Discuss latency vs cost vs complexity; lambda vs kappa patterns; when to pre-aggregate vs compute on read; and real-time store choices.
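A canonical event envelope (requirement 3) is the contract every source must satisfy before landing in the bronze layer. The field names below are illustrative assumptions, not a fixed standard; the essential ideas are a stable `event_id` for dedup, separate event-time and ingest-time columns, and a schema version for safe evolution:

```python
from dataclasses import dataclass, field, asdict
from typing import Optional
import uuid

@dataclass
class EventEnvelope:
    """Illustrative canonical envelope for every product event (bronze layer).

    A stable event_id enables idempotent dedup downstream; event_time vs
    ingest_time separates business time from arrival time; schema_version
    lets readers handle evolution without breaking.
    """
    event_id: str            # globally unique, assigned at the client/edge
    event_name: str          # e.g. "page_view", "purchase"
    event_time: str          # ISO-8601, client/business time
    ingest_time: str         # ISO-8601, stamped at the ingestion edge
    user_id: Optional[str]   # resolved identity; None while anonymous
    device_id: str           # stable client identifier for identity resolution
    schema_version: int
    payload: dict = field(default_factory=dict)  # event-specific fields

ev = EventEnvelope(
    event_id=str(uuid.uuid4()),
    event_name="page_view",
    event_time="2025-07-15T10:00:00Z",
    ingest_time="2025-07-15T10:00:02Z",
    user_id=None,
    device_id="d-42",
    schema_version=1,
)
```

In practice this envelope would be defined once in Protobuf/Avro and enforced at the ingestion edge, so silver-layer jobs can assume it holds.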
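For requirement 5, the core mechanics of dedup and late-event handling can be sketched without a streaming engine. This is a minimal single-process sketch, assuming each event carries an `event_id` and an event-time; in Flink/Spark the watermark and state would be managed by the framework, and late events would go to a side output for batch reconciliation:

```python
from datetime import datetime, timedelta

def dedupe_with_watermark(events, allowed_lateness=timedelta(hours=1)):
    """Deduplicate by event_id and separate late arrivals from on-time ones.

    The watermark trails the max event_time seen by allowed_lateness.
    Duplicates (redelivered events) are dropped for idempotency; events
    behind the watermark are routed to a 'late' output to be merged into
    aggregates by a batch reconciliation job, not silently discarded.
    """
    seen = set()
    watermark = datetime.min
    accepted, late = [], []
    for ev in events:
        watermark = max(watermark, ev["event_time"] - allowed_lateness)
        if ev["event_id"] in seen:
            continue  # duplicate delivery: safe to drop
        seen.add(ev["event_id"])
        if ev["event_time"] < watermark:
            late.append(ev)  # out-of-order beyond allowed lateness
        else:
            accepted.append(ev)
    return accepted, late
```

The key design choice is that lateness is bounded but nonzero: too small a window loses real events, too large a window delays window finalization and inflates state.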
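Requirement 6's SCD2 pattern (close the current row, open a new version) is usually written as a warehouse MERGE statement; the sketch below expresses the same logic in plain Python over dict rows, with field names chosen for illustration. Note it is idempotent: re-running the same update is a no-op, which is what makes retries and backfills safe:

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # conventional open-ended valid_to

def scd2_merge(dim_rows, updates, as_of):
    """Apply SCD2 updates to a dimension keyed by user_id.

    Rows carry 'plan', 'valid_from', 'valid_to', 'is_current'. When a
    tracked attribute changes, the current row is closed at as_of and a
    new current version is opened; unchanged updates do nothing, so
    re-running the merge is idempotent.
    """
    current = {r["user_id"]: r for r in dim_rows if r["is_current"]}
    out = [r for r in dim_rows if not r["is_current"]]
    for u in updates:
        old = current.get(u["user_id"])
        if old and old["plan"] == u["plan"]:
            continue  # no change: keep existing current row
        if old:
            out.append(dict(old, valid_to=as_of, is_current=False))
        current[u["user_id"]] = {
            "user_id": u["user_id"], "plan": u["plan"],
            "valid_from": as_of, "valid_to": HIGH_DATE,
            "is_current": True,
        }
    return out + list(current.values())
```

SCD1 is the degenerate case (overwrite in place, no history); choose it for attributes where point-in-time correctness is not required, since SCD2 multiplies dimension row counts.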
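For requirement 8, rolling-window actives (7/28-day) compose cleanly from daily partitions: the N-day active count is the union of the daily active-user sets in the window. A minimal sketch, assuming one user-id set per daily partition (in a warehouse this is typically `COUNT(DISTINCT user_id)` over a date-range scan, or a sketch type like HyperLogLog at scale):

```python
from datetime import date, timedelta

def rolling_active(daily_actives, end_day, window=7):
    """Count N-day active users for the window ending at end_day.

    daily_actives maps date -> set of user_ids (one set per daily
    partition). Because the metric is a union over partitions, it can
    be rebuilt for any historical day in batch, and maintained
    incrementally in streaming by updating only end_day's partition.
    """
    users = set()
    for i in range(window):
        users |= daily_actives.get(end_day - timedelta(days=i), set())
    return len(users)
```

This partition-union structure is also why date partitioning of the fact/aggregate tables matters: each window query touches exactly `window` partitions rather than a full scan.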
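Requirement 9's freshness and completeness SLAs reduce to per-partition assertions that run after each load and feed the alerting layer. The thresholds below are illustrative assumptions (a real completeness floor might be a trailing 7-day average minus three standard deviations); tools like Great Expectations or dbt tests express the same checks declaratively:

```python
from datetime import datetime, timedelta, timezone

def check_partition(partition_time, row_count, expected_min_rows,
                    max_staleness=timedelta(hours=2),
                    now=None):
    """Minimal freshness + completeness check for one table partition.

    Freshness: the latest partition must be newer than the staleness SLA.
    Completeness: the partition's row count must clear a floor derived
    from historical volume. Returns violations for the alerting layer
    instead of raising, so all checks for a table can be collected.
    """
    now = now or datetime.now(timezone.utc)
    violations = []
    if now - partition_time > max_staleness:
        violations.append("freshness: latest partition older than SLA")
    if row_count < expected_min_rows:
        violations.append("completeness: row count below expected floor")
    return violations
```

Blocking checks (schema, null keys, dedup rate) should gate promotion from silver to gold; statistical checks (volume anomalies) usually alert without blocking, to avoid freshness SLA misses on false positives.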

