This question evaluates a candidate's ability to architect scalable, reliable job scheduling and ETL orchestration systems, testing competencies in distributed systems, scheduling, fault tolerance, observability, multi-tenancy, and operational metadata management.
Design a Job Scheduler + ETL pipeline system.
The system should allow users (or internal services) to:
Explain the key components (API, scheduler, queue, workers, metadata DB), the data model, the scaling strategy, and how you would use and load-balance caches and queues. Call out the major tradeoffs (exactly-once vs. at-least-once execution, latency vs. throughput).
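To make the exactly-once vs. at-least-once tradeoff concrete, here is a minimal sketch of one common approach: accept at-least-once delivery from the queue and deduplicate on the worker with an idempotency key. All names (`JobRun`, `Worker`, `idempotency_key`, `completed`) are hypothetical and chosen for illustration. The in-memory set stands in for a uniqueness check against the metadata DB.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass(frozen=True)
class JobRun:
    """One scheduled execution of a job (hypothetical model)."""
    job_id: str
    scheduled_for: str  # ISO timestamp of the scheduled slot

    @property
    def idempotency_key(self) -> str:
        # The same job and slot always map to the same key, so a
        # redelivered message can be recognized as a duplicate.
        return f"{self.job_id}:{self.scheduled_for}"


@dataclass
class Worker:
    handler: Callable[[JobRun], None]
    # Assumption: in a real system this would be a unique-keyed table
    # in the metadata DB, not process-local memory.
    completed: Set[str] = field(default_factory=set)

    def process(self, run: JobRun) -> bool:
        """Return True if the handler ran, False if the run was a duplicate."""
        if run.idempotency_key in self.completed:
            return False
        self.handler(run)
        self.completed.add(run.idempotency_key)
        return True


executed: List[str] = []
worker = Worker(handler=lambda run: executed.append(run.idempotency_key))
run = JobRun(job_id="daily_etl", scheduled_for="2024-01-01T00:00:00Z")

# Simulate the queue delivering the same message twice.
results = [worker.process(run), worker.process(run)]
```

In this sketch the second delivery is skipped, so the handler runs once. A worker that crashes after the handler finishes but before the key is recorded would still run the job again on redelivery, which is why the handler itself (for example, an ETL write that upserts by key) usually needs to be idempotent as well.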