Design a CI/CD system with live log streaming
Company: OpenAI
Role: Software Engineer
Category: System Design
Difficulty: hard
Interview Round: Technical Screen
## System Design Prompt: CI/CD Platform (shell-script jobs)
Design a CI/CD system that can:
1. Allow users to define **pipelines** consisting of multiple **jobs**, where each job is just a **shell script**.
2. Trigger pipeline runs via **Git events** (push or pull request) as well as manually.
3. Schedule jobs onto a fleet of workers, respecting job dependencies (DAG) and basic resource constraints (e.g., concurrency limits).
4. Persist run history: pipeline status, job status, timestamps, exit codes.
5. Store and retrieve job logs.
6. **Stream job output logs in near real-time** to a “status service”/UI while the job is running.
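As one illustration of requirement 3, the sketch below groups jobs into dispatch "waves" that respect both the dependency DAG and a global concurrency limit. The function name `schedule_waves`, the `deps` mapping, and the wave-based model are assumptions made for this example. A production scheduler would more likely dispatch each job as soon as its dependencies finish rather than waiting for a whole wave.

```python
def schedule_waves(deps, max_concurrency):
    """Return a list of waves; each wave is a list of jobs that may run together.

    deps: hypothetical mapping of job name -> set of job names it depends on.
    max_concurrency: assumed global limit on jobs running at the same time.
    """
    remaining = {job: set(d) for job, d in deps.items()}
    done = set()
    waves = []
    while remaining:
        # A job is ready once every dependency has completed.
        ready = sorted(job for job, d in remaining.items() if d <= done)
        if not ready:
            # Nothing can make progress: a cycle or a reference to an unknown job.
            raise ValueError("cycle or missing dependency among: %s" % sorted(remaining))
        wave = ready[:max_concurrency]  # respect the concurrency limit
        waves.append(wave)
        for job in wave:
            del remaining[job]
        done.update(wave)
    return waves


if __name__ == "__main__":
    pipeline = {
        "build": set(),
        "lint": set(),
        "test": {"build"},
        "deploy": {"test", "lint"},
    }
    for number, wave in enumerate(schedule_waves(pipeline, max_concurrency=2), start=1):
        print(number, wave)
```

Sorting the ready set keeps the output deterministic, which makes the scheduling decision easy to test and to replay after a scheduler restart.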
### Out of scope
- Do not assume containers/images (no container registry, no Docker/K8s specifics).
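Because containers are out of scope, a worker can be pictured as running each job's script as a plain child process. The sketch below assumes a POSIX host with `/bin/sh` available. The helper name `run_script` and the `on_line` callback are hypothetical. It merges stderr into stdout, hands each output line to the callback as it is produced, and returns the exit code that the run history would record.

```python
import subprocess


def run_script(script, on_line, shell="/bin/sh"):
    """Run a shell script as a child process and return its exit code.

    on_line: hypothetical callback invoked with each output line as it arrives,
    e.g. a function that forwards the line to a log ingestion service.
    """
    proc = subprocess.Popen(
        [shell, "-c", script],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout in one stream
        text=True,
        bufsize=1,  # line-buffered reads on the parent side
    )
    for line in proc.stdout:
        on_line(line.rstrip("\n"))
    return proc.wait()


if __name__ == "__main__":
    exit_code = run_script("echo compiling; echo done", print)
    print("exit code:", exit_code)
```

This sketch provides no isolation between jobs. Sandboxing, per-job users, timeouts, and secret injection would have to be layered on top and are not shown here.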
### What to cover
- High-level architecture and core services.
- Data model for pipelines/runs/jobs.
- Execution model (how scripts run on workers), retries, idempotency.
- Log ingestion, storage, and real-time streaming to users.
- Scalability, reliability, and security considerations (secrets, isolation).
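For the log ingestion and idempotency points, one possible shape is an append-only per-job log addressed by sequence number. The class `JobLog` and its methods are hypothetical, and the in-memory list stands in for whatever durable store a real design would choose. Workers attach a sequence number to every line so that retried uploads are ignored, and readers poll with an offset so that a UI can resume streaming after a disconnect.

```python
class JobLog:
    """In-memory stand-in for a per-job, append-only log (illustrative only)."""

    def __init__(self):
        self._lines = []

    def append(self, seq, line):
        """Append idempotently; return False if this seq was already stored."""
        if seq < len(self._lines):
            return False  # duplicate delivery from a worker retry
        if seq > len(self._lines):
            raise ValueError("gap in log: expected seq %d, got %d" % (len(self._lines), seq))
        self._lines.append(line)
        return True

    def read_from(self, offset, limit=100):
        """Return (lines, next_offset) so a client can resume where it left off."""
        chunk = self._lines[offset:offset + limit]
        return chunk, offset + len(chunk)


if __name__ == "__main__":
    log = JobLog()
    log.append(0, "cloning repository")
    log.append(1, "running tests")
    log.append(1, "running tests")  # retried upload, ignored
    lines, cursor = log.read_from(0)
    print(lines, cursor)
```

The same offset cursor works whether the UI polls or the status service pushes updates over a long-lived connection, since the client only needs to remember the last offset it received.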
Quick Answer: This question evaluates a candidate's ability to design a scalable CI/CD architecture. It covers shell-script job orchestration, DAG-aware scheduling, persistence of run history and logs, near-real-time log ingestion and streaming, and operational concerns such as retries, idempotency, resource constraints, secrets, and isolation.