PracHub

Design multi-core service startup scheduler

Last updated: Mar 29, 2026

Quick Overview

This question evaluates scheduling and resource-management skills, including DAG dependency handling, concurrency bounding, timeout/retry strategies, observability, and correctness reasoning for service startup orchestration.

Company: Snowflake

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

Services must start on a host with M CPU cores, and each service may depend on others (a DAG). Design a scheduler that minimizes total startup time while respecting dependencies. Discuss: detecting ready services, maximizing parallelism within core limits, prioritization, bounding concurrency, handling timeouts/retries/failures, backoff, resource constraints (CPU, memory, ports), and observability. Describe data structures (e.g., in-degree tracking, ready queues), correctness properties, and how you would extend it across multiple machines.
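One way to make "minimizes total startup time" concrete is to compare a schedule against two standard lower bounds on the makespan. The notation below is an assumption introduced for illustration: d_i is the estimated startup duration of service i, V is the set of services, G is the dependency DAG, and T* is the optimal makespan when each start occupies one core.

```latex
% Lower bound: no schedule beats the longest dependency chain
% or the total work spread evenly over M cores.
T^{*} \;\ge\; \max\!\left(
    \max_{P \,\in\, \mathrm{paths}(G)} \sum_{i \in P} d_i ,\;\;
    \frac{1}{M} \sum_{i \in V} d_i
\right)

% Graham's bound for greedy list scheduling (a core is never left idle
% while some service is ready), for identical cores and unit CPU per start:
T_{\mathrm{list}} \;\le\; \left( 2 - \frac{1}{M} \right) T^{*}
```

Both statements ignore memory and port limits, retries, and estimation error, so they are best read as a sanity check on a design rather than a guarantee for a real host.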

Sep 6, 2025, 12:00 AM

Service Startup Scheduler on a Host with M CPU Cores

Context

You are given a directed acyclic graph (DAG) of services where an edge u → v means service v depends on service u successfully starting and becoming healthy. All services must start on a single host that has M CPU cores. Each service may also consume additional resources (e.g., memory, ports) during startup. The goal is to minimize total startup time (makespan) while respecting dependencies and resource limits.
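The edge convention above maps naturally onto in-degree tracking: each service carries a count of dependencies that are not yet healthy, and it becomes ready when that count reaches zero. The following is a minimal sketch under that reading; the names `ReadyTracker`, `mark_healthy`, and `unreleased_services` are hypothetical and not part of the question.

```python
from collections import defaultdict, deque


class ReadyTracker:
    """Tracks unmet dependencies per service (in-degree) and reports ready ones."""

    def __init__(self, services, edges):
        # Each edge (u, v) means "v depends on u". Assumes every endpoint
        # appears in `services`.
        self.dependents = defaultdict(list)
        self.in_degree = {s: 0 for s in services}
        for u, v in edges:
            self.dependents[u].append(v)
            self.in_degree[v] += 1
        self.healthy = set()

    def initially_ready(self):
        """Services with no dependencies can start immediately."""
        return sorted(s for s, d in self.in_degree.items() if d == 0)

    def mark_healthy(self, service):
        """Record a passed health check; return services that just became ready."""
        if service in self.healthy:
            return []  # idempotent: a duplicate report unlocks nothing
        self.healthy.add(service)
        newly_ready = []
        for v in self.dependents[service]:
            self.in_degree[v] -= 1
            if self.in_degree[v] == 0:
                newly_ready.append(v)
        return newly_ready


def unreleased_services(services, edges):
    """Kahn-style dry run: anything never released sits on or behind a cycle."""
    tracker = ReadyTracker(services, edges)
    queue = deque(tracker.initially_ready())
    while queue:
        queue.extend(tracker.mark_healthy(queue.popleft()))
    return sorted(s for s in services if s not in tracker.healthy)
```

Construction costs O(V + E), and over a whole run `mark_healthy` touches each edge once, so readiness detection stays O(V + E) in total. The dry run is an optional validation step for inputs that are only claimed to be acyclic.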

Task

Design a scheduler that:

  1. Detects which services are ready to start (dependency-respecting).
  2. Maximizes parallelism subject to the M-core CPU limit and other resource constraints (memory, ports).
  3. Prioritizes work to minimize the overall makespan.
  4. Bounds concurrency globally and per resource class.
  5. Handles timeouts, retries, backoff, and failures.
  6. Provides strong observability (metrics, logs, traces) and states the correctness properties it guarantees.
  7. Uses clear data structures (e.g., in-degree tracking, ready queues) and describes algorithmic complexity.
  8. Explains how to extend the design across multiple machines.
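Items 1 to 4 can be explored with a small simulation before any real process is launched. The sketch below assumes unit CPU per start and known duration estimates, and uses one common heuristic: greedy list scheduling that prefers the ready service with the longest remaining dependency chain. The function names and the example services are hypothetical, and the heuristic is not guaranteed to be optimal.

```python
import heapq


def critical_path_lengths(durations, edges):
    """Longest estimated time from the start of each service to the end of the DAG."""
    dependents = {s: [] for s in durations}
    for u, v in edges:
        dependents[u].append(v)
    memo = {}

    def longest(s):
        # Recursive for brevity; very deep chains would need an iterative version.
        if s not in memo:
            memo[s] = durations[s] + max(
                (longest(v) for v in dependents[s]), default=0
            )
        return memo[s]

    return {s: longest(s) for s in durations}


def simulate(durations, edges, cores):
    """Simulate greedy list scheduling on `cores` >= 1; return (makespan, start_times)."""
    rank = critical_path_lengths(durations, edges)
    dependents = {s: [] for s in durations}
    in_degree = {s: 0 for s in durations}
    for u, v in edges:
        dependents[u].append(v)
        in_degree[v] += 1

    # Max-heap on critical path length (negated); the name breaks ties.
    ready = [(-rank[s], s) for s in durations if in_degree[s] == 0]
    heapq.heapify(ready)
    running = []  # min-heap of (finish_time, service)
    now, start_times = 0, {}

    while ready or running:
        # Fill free cores with the highest-priority ready services.
        while ready and len(running) < cores:
            _, s = heapq.heappop(ready)
            start_times[s] = now
            heapq.heappush(running, (now + durations[s], s))
        # Advance time to the next completion and release its dependents.
        now, done = heapq.heappop(running)
        for v in dependents[done]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                heapq.heappush(ready, (-rank[v], v))
    return now, start_times


durations = {"db": 3, "cache": 1, "api": 2, "web": 1, "metrics": 4}
edges = [("db", "api"), ("cache", "api"), ("api", "web")]
makespan, starts = simulate(durations, edges, cores=2)
```

With these made-up durations and two cores the simulated makespan is 6, which equals the length of the db, api, web chain, so no schedule could do better here. Each heap operation is O(log V), giving O((V + E) log V) overall. Memory and port limits could be added as extra admission checks beside the `len(running) < cores` test.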

Assume each service reports healthy only after its health check passes. If the question does not specify per-service resource demands or durations, assume each start uses one CPU core and that durations are unknown in advance but can be estimated from historical data.
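These assumptions suggest one way to handle timeouts and retries: derive a per-attempt timeout from the historical estimate, treat an attempt as successful only once the health check passes, and wait a capped, jittered exponential delay before trying again. The sketch below is illustrative only; `launch`, `is_healthy`, `timeout_factor`, and the default values are hypothetical, and a real scheduler would also stop the failed process before relaunching.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Capped exponential backoff with full jitter; `attempt` is 0-based."""
    return rng() * min(cap, base * (2 ** attempt))


def start_with_retries(launch, is_healthy, estimate_s, max_attempts=3,
                       timeout_factor=3.0, poll_s=0.1,
                       clock=time.monotonic, sleep=time.sleep):
    """Return the number of attempts used; raise RuntimeError if all attempts fail.

    `launch` starts the service and `is_healthy` runs its health check
    (both are caller-supplied callables, assumed for this sketch).
    """
    # Timeout is a multiple of the historical duration estimate (an assumption).
    timeout_s = timeout_factor * estimate_s
    for attempt in range(max_attempts):
        launch()
        deadline = clock() + timeout_s
        # Poll the health check until it passes or the attempt times out.
        while clock() < deadline:
            if is_healthy():
                return attempt + 1
            sleep(poll_s)
        # Back off before the next attempt, but not after the final one.
        if attempt + 1 < max_attempts:
            sleep(backoff_delay(attempt))
    raise RuntimeError(f"service not healthy after {max_attempts} attempts")
```

Injecting `clock` and `sleep` keeps the function testable with a fake clock. When the final attempt fails, the scheduler can mark the service as failed and decide whether its dependents are skipped or the whole startup is aborted.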
