How do I approach System Design interview questions?

System Design questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master system design interviews.

What difficulty level is this interview question?

This is a medium difficulty System Design question, commonly asked during Technical Screen rounds at Snowflake.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Snowflake during technical interviews.

Design a Cron Job Scheduler | Snowflake Interview Question

Q: Design a Cron Job Scheduler

This question tests the design of reliable distributed schedulers: durable state modeling, reasoning about concurrency and race conditions, idempotency, and fault-tolerant handoff semantics. Interviewers use it to assess whether a candidate can guarantee correctness and coordination when multiple replicas manage recurring job triggers. It falls under system design and distributed systems, overlaps with data modeling and consistency, and emphasizes practical architectural and operational trade-offs over purely theoretical concepts.

Design a distributed cron job scheduler — a service that triggers user-defined jobs on recurring, cron-style schedules. Assume worker (execution) capacity is effectively unlimited, so the design should focus on correct, reliable scheduling and state management rather than on autoscaling the compute that runs the jobs.

The system must support:

Recurring jobs defined by cron-style schedules (e.g. 0 */6 * * * = every 6 hours).
pause(job_id) — stop scheduling new runs for that job, but let any already-running executions finish normally.
resume(job_id) — allow future scheduled runs to resume.
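
To make the cron-style schedule concrete, here is a minimal sketch of computing the next fire time for a five-field expression such as 0 */6 * * *. The function names (parse_field, next_run) are hypothetical, times are assumed to be naive UTC datetimes, and the matcher is deliberately simplified: it requires day-of-month and day-of-week to both match, whereas traditional cron treats them as alternatives when both are restricted.

```python
from datetime import datetime, timedelta

# Allowed value range for each of the five cron fields, in order:
# minute, hour, day-of-month, month, day-of-week (0 = Sunday).
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def parse_field(expr: str, lo: int, hi: int) -> set[int]:
    """Expand one cron field ('*', '*/n', 'a', 'a-b', 'a-b/n', comma lists)."""
    values: set[int] = set()
    for part in expr.split(","):
        base, _, step = part.partition("/")
        stride = int(step) if step else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            a, b = base.split("-")
            start, end = int(a), int(b)
        else:
            start = int(base)
            end = hi if step else start
        values.update(range(start, end + 1, stride))
    return values


def next_run(cron: str, after: datetime) -> datetime:
    """Return the first whole minute strictly after `after` matching `cron`."""
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError("expected 5 cron fields")
    minute, hour, dom, month, dow = (
        parse_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    t = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    # Brute-force minute stepping: simple, and bounded to about four years.
    for _ in range(4 * 366 * 24 * 60):
        cron_dow = (t.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
        if (t.minute in minute and t.hour in hour and t.day in dom
                and t.month in month and cron_dow in dow):
            return t
        t += timedelta(minutes=1)
    raise ValueError("no matching time found")
```

For example, next_run("0 */6 * * *", datetime(2024, 1, 1, 1, 30)) yields 06:00 on the same day. A production scheduler would more likely use a tested cron library and jump field by field rather than scanning minutes; the scan is used here only because it is easy to verify.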

Produce an end-to-end design covering: (a) the public API, (b) the data model, (c) the core scheduler loop that finds and fires due jobs, (d) correct pause/resume semantics including the race against in-flight scheduling, (e) safe operation with multiple scheduler instances, and (f) crash recovery and reliability (no lost or silently-dropped triggers).

Constraints & Assumptions

Worker capacity is effectively unlimited — never gate the design on "not enough workers." The hard problems live in scheduling, not execution.
Scale (assume, state your own if different): on the order of $10^6$ job definitions; tens of thousands of due jobs in the busiest minute. Schedules are mostly minute-granularity cron expressions.
Correctness bar: no lost triggers (a due, active job must run), and at-least-once delivery with strong effort to minimize duplicates. Jobs are not assumed idempotent by default, so call out where you rely on idempotency.
Availability: the scheduler must keep firing through single-instance crashes; multiple scheduler replicas run for HA.
Treat the actual job body as an opaque payload (e.g. an HTTP call or a queued task); you are designing the scheduler, not the job logic.

Clarifying Questions to Ask

What schedule granularity must we support — minute-level cron only, or down to seconds? Are arbitrary timezones / DST in scope, or is UTC-only acceptable?
On resume after a long pause (or after downtime), should we backfill the missed runs, or just resume from the next future occurrence?
Is at-least-once acceptable (rely on idempotent jobs / dedup) or do we need at-most-once for some job classes?
What's the expected mix: many jobs each firing rarely, or a few jobs firing very frequently? What is the busiest-minute fan-out?
What does a "run" hand off to — an internal task queue, an HTTP webhook, a Kubernetes Job? What retry and timeout policy is expected on failure?
How long must run history / audit data be retained?
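
The backfill question above changes what resume has to compute. The sketch below illustrates both policies under loud assumptions: a fixed interval in seconds stands in for a parsed cron expression, times are epoch seconds, and ticks_on_resume is a hypothetical helper rather than part of any stated API.

```python
def ticks_on_resume(next_run_at: int, now: int, interval_s: int,
                    backfill: bool) -> tuple[list[int], int]:
    """Return (runs to create now, new next_run_at) when a job is resumed."""
    if next_run_at > now:
        return [], next_run_at  # nothing was missed while paused
    # Every scheduled tick from the stored next_run_at up to and including now.
    missed = list(range(next_run_at, now + 1, interval_s))
    new_next = missed[-1] + interval_s  # first tick strictly in the future
    return (missed if backfill else []), new_next
```

With backfill enabled, a job paused across three ticks creates three runs on resume; with it disabled, the same call creates none and only moves next_run_at forward. Either way the new next_run_at is identical, so the policy can be a per-job flag without changing the rest of the loop.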

What a Strong Answer Covers

Clean plane separation: API → metadata store → scheduler/dispatcher → durable queue → workers, and why a queue still helps even with unlimited workers (decoupling, buffering, reliability).
Data model with the fields that actually drive correctness: per-job status, next_run_at, last_scheduled_at, a concurrency token (version or lease), and a separate run-history table with a unique idempotency key.
A concrete scheduler loop: query due jobs → claim safely → emit one durable run record → advance next_run_at — and the indexing that makes the due-scan cheap at $10^6$ jobs.
Correct multi-instance behavior: the exact mechanism that prevents two replicas from double-firing the same tick.
Precise pause/resume semantics, including an explicit treatment of the pause-vs-claim race and the backfill-on-resume decision.
Crash recovery & no-lost-trigger reasoning: a durable hand-off that survives a crash between the DB write and the enqueue, stale-RUNNING detection via heartbeat/lease timeout, retries, and where at-least-once forces worker idempotency.
Scaling levers and bottlenecks given unlimited workers (due-scan efficiency, DB write throughput, queue publish rate; partitioning / time-bucketing).
Observability: scheduler lag, runs created/min, duplicate-fire count, failure rate, queue depth, and audit logs for pause/resume.
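
The data model, scheduler loop, and multi-instance points above can be sketched together. This is one possible shape, not a reference solution: it assumes SQLite as a stand-in for the metadata store, a fixed interval_s column in place of a parsed cron expression, epoch-second timestamps, and hypothetical names (tick, set_status). The claim is an optimistic compare-and-set on version, so a replica that loses the race, or a job paused after the scan, produces no run.

```python
import sqlite3

SCHEMA = """
CREATE TABLE jobs (
    job_id      TEXT PRIMARY KEY,
    status      TEXT NOT NULL,      -- 'ACTIVE' or 'PAUSED'
    interval_s  INTEGER NOT NULL,   -- stand-in for a parsed cron expression
    next_run_at INTEGER NOT NULL,   -- epoch seconds of the next tick
    version     INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX jobs_due ON jobs (status, next_run_at);
CREATE TABLE runs (
    idempotency_key TEXT PRIMARY KEY,  -- job_id plus scheduled time
    job_id          TEXT NOT NULL,
    scheduled_for   INTEGER NOT NULL,
    state           TEXT NOT NULL DEFAULT 'PENDING'
);
"""


def tick(db: sqlite3.Connection, now: int, batch: int = 100) -> int:
    """Fire each due, active job once; return the number of runs created."""
    due = db.execute(
        "SELECT job_id, interval_s, next_run_at, version FROM jobs "
        "WHERE status = 'ACTIVE' AND next_run_at <= ? "
        "ORDER BY next_run_at LIMIT ?",
        (now, batch),
    ).fetchall()
    created = 0
    for job_id, interval_s, scheduled_for, version in due:
        with db:  # one transaction: the claim and the run record commit together
            claimed = db.execute(
                "UPDATE jobs SET next_run_at = ?, version = version + 1 "
                "WHERE job_id = ? AND version = ? AND status = 'ACTIVE'",
                (scheduled_for + interval_s, job_id, version),
            ).rowcount
            if claimed == 0:
                continue  # another replica won, or the job was paused meanwhile
            db.execute(
                "INSERT OR IGNORE INTO runs "
                "(idempotency_key, job_id, scheduled_for) VALUES (?, ?, ?)",
                (f"{job_id}:{scheduled_for}", job_id, scheduled_for),
            )
            created += 1
    return created


def set_status(db: sqlite3.Connection, job_id: str, status: str) -> None:
    """pause/resume: bump version so any in-flight claim's compare-and-set fails."""
    with db:
        db.execute(
            "UPDATE jobs SET status = ?, version = version + 1 WHERE job_id = ?",
            (status, job_id),
        )
```

Because the status check sits inside the same UPDATE as the version check, a pause that commits before the claim always wins, and a pause that commits after it leaves exactly one already-created run, which matches the "let in-flight work finish" semantics. The unique idempotency_key is a second line of defense against double-firing the same tick. On a store that supports it, a row lock with skip-locked semantics is a common alternative to the version column.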

Follow-up Questions

Walk through exactly what happens if a scheduler instance crashes after inserting the run record but before the run reaches the queue. How does your design guarantee the trigger is neither lost nor double-delivered?
A single job is configured * * * * * (every minute) but each run takes ~5 minutes. How do you handle overlap — skip, queue, or allow concurrent runs — and how does the model express that policy?
The metadata store becomes the throughput bottleneck at $10^6$ jobs. How do you partition or shard scheduling so replicas don't contend, while preserving the no-double-fire guarantee?
How would you support per-job timezones and survive a daylight-saving transition without dropping or doubling the 2 a.m. run?
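
For the first follow-up (a crash after the run record is inserted but before it reaches the queue), one common answer is an outbox-style relay: the run row is the durable source of truth, and a separate loop publishes anything still PENDING. The sketch below assumes SQLite, a trimmed-down runs table, and hypothetical names (relay_pending, DedupingWorker). It gives at-least-once delivery; suppression of duplicates relies on the consumer checking the idempotency key.

```python
import sqlite3
from typing import Callable

RUNS_SCHEMA = """
CREATE TABLE runs (
    idempotency_key TEXT PRIMARY KEY,
    scheduled_for   INTEGER NOT NULL,
    state           TEXT NOT NULL DEFAULT 'PENDING'
);
"""


def relay_pending(db: sqlite3.Connection,
                  publish: Callable[[str], None]) -> int:
    """Publish every PENDING run, then mark it ENQUEUED. Safe to re-run."""
    rows = db.execute(
        "SELECT idempotency_key FROM runs WHERE state = 'PENDING' "
        "ORDER BY scheduled_for"
    ).fetchall()
    sent = 0
    for (key,) in rows:
        publish(key)  # a crash right after this line causes a re-publish later
        with db:
            db.execute(
                "UPDATE runs SET state = 'ENQUEUED' "
                "WHERE idempotency_key = ? AND state = 'PENDING'",
                (key,),
            )
        sent += 1
    return sent


class DedupingWorker:
    """Hypothetical consumer that drops keys it has already processed."""

    def __init__(self) -> None:
        self.seen: set[str] = set()
        self.executed: list[str] = []

    def handle(self, key: str) -> None:
        if key in self.seen:
            return  # duplicate delivery from a relay retry
        self.seen.add(key)
        self.executed.append(key)
```

The trigger cannot be lost because the row stays PENDING until the publish has succeeded, and any replica can run the relay after a crash. It can be delivered twice if the crash lands between the publish and the state update, which is exactly where the design leans on idempotency. In a real deployment the worker's seen-set would need to be durable as well; the in-memory set here is only for illustration.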
