Design a GPU credit system and scheduler

Last updated: May 5, 2026

Quick Overview

This question evaluates system design, distributed systems, and resource-accounting skills focused on concurrency control, idempotent APIs, billing/credit models, and scheduler design for heterogeneous GPUs in multi-tenant ML platforms.

  • hard
  • OpenAI
  • ML System Design
  • Software Engineer

Interview Round: Technical Screen

Design a GPU credit accounting and scheduling service for an ML platform. Users purchase credits, submit training/inference jobs, and consume credits while jobs run. Requirements: credit issuance, balance queries, reservation at submission, metered consumption during execution, partial refunds on preemption/failure, expiration and promotional credits, per-user and per-project budgets, and audit trails. The API must be idempotent and concurrency-safe, with rate limits and protection against double-spend under races. The scheduler should place jobs on heterogeneous GPUs (e.g., A100/H100) based on resource requirements and available quota, supporting fairness across users/teams and preemption policies. Describe schemas and data structures, consistency choices (strong vs. eventual), handling of clock skew, sharding and scaling strategies, and observability. Outline a test plan that captures edge cases and uncovers unspecified requirements.

Aug 13, 2025, 12:00 AM
Design a GPU Credit Accounting and Scheduling Service (Technical Screen)

You are designing a backend service for an ML platform that runs training and inference jobs on heterogeneous GPUs (e.g., A100, H100). Users and teams purchase credits and consume them while their jobs run. Design the system end to end: the credit ledger, the reservation/metering flow, and the scheduler that places jobs on GPUs.

The system is multi-tenant, multi-project, and multi-region, and must:

  • Prevent double-spend under concurrency, retries, and races.
  • Schedule fairly across users and teams.
  • Handle preemption and failures with correct partial refunds.

Assumptions

  • GPU pricing is per GPU-hour and differs by GPU type.
  • Jobs specify resource requirements: GPU-type preferences, GPU count, and memory.
  • Jobs may be preempted according to policy.
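
To make the pricing assumption concrete, the sketch below computes an upper-bound hold for a job from its GPU type, GPU count, and maximum runtime. The price table, the function name `estimate_hold`, and the choice to round up to whole credits are all illustrative assumptions; the question gives no actual prices or rounding rule.

```python
# Hypothetical prices in whole credits per GPU-hour (not given in the question).
PRICE_PER_GPU_HOUR = {"A100": 100, "H100": 250}


def estimate_hold(gpu_type: str, gpu_count: int, max_runtime_seconds: int) -> int:
    """Return the worst-case cost in whole credits for a job.

    Integer arithmetic avoids float drift in money-like values. The result
    is rounded up (assumption) so a hold never undercounts the real cost.
    """
    rate = PRICE_PER_GPU_HOUR[gpu_type]
    credit_seconds = rate * gpu_count * max_runtime_seconds
    return -(-credit_seconds // 3600)  # ceiling division by seconds per hour
```

For example, `estimate_hold("A100", 8, 7200)` covers eight A100s for two hours at the assumed rate.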

Functional Requirements

1. Credit lifecycle

  • Issuance (purchases, grants, promotions) and expiration.
  • Balance queries with a breakdown (promotional vs. paid, upcoming expirations).
  • Spend ordering across credit buckets (e.g., earliest-expiring first).
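
Spend ordering across buckets can be illustrated with a small planning function. This is a minimal sketch, not a prescribed design: the `Bucket` fields, the name `plan_spend`, and the tie-break (promotional before paid when two buckets expire at the same time) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    bucket_id: str
    kind: str        # "paid" or "promo" (assumed labels)
    remaining: int   # whole credits left in this bucket
    expires_at: int  # epoch seconds


def plan_spend(buckets, amount, now):
    """Return [(bucket_id, credits)] drawing from earliest-expiring buckets first.

    Expired or empty buckets are skipped. Raises ValueError if the live
    buckets cannot cover the amount. The input buckets are not modified.
    """
    live = [b for b in buckets if b.remaining > 0 and b.expires_at > now]
    # Assumed tie-break: at equal expiry, promo is spent before paid.
    live.sort(key=lambda b: (b.expires_at, b.kind != "promo", b.bucket_id))
    plan, needed = [], amount
    for b in live:
        if needed == 0:
            break
        take = min(b.remaining, needed)
        plan.append((b.bucket_id, take))
        needed -= take
    if needed > 0:
        raise ValueError("insufficient credits")
    return plan
```

Returning a plan instead of mutating balances keeps the ordering rule separate from the ledger write, so the same function can serve balance breakdowns and actual debits.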

2. Reservation and metering

  • Idempotent reservation at job submission that checks budgets and quotas.
  • Metered consumption while a job runs: commit actual usage, and partially refund the unused hold on completion, preemption, or failure.
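
The reserve-then-settle flow above can be sketched with an in-memory ledger. Everything here is a simplified assumption: the class and method names are hypothetical, idempotency keys are supplied by the client, and a real service would perform each step inside a database transaction and not on Python dictionaries. How to bill usage that exceeds the hold is unspecified in the question; this sketch caps the charge at the held amount.

```python
class InsufficientCredits(Exception):
    pass


class Ledger:
    """Single-process illustration of hold, commit, and partial refund."""

    def __init__(self, balance: int):
        self.available = balance
        self.holds = {}  # job_id -> credits currently held
        self.seen = {}   # idempotency key -> result of the first call

    def reserve(self, idem_key: str, job_id: str, amount: int) -> dict:
        if idem_key in self.seen:  # retry: replay the stored result
            return self.seen[idem_key]
        if amount > self.available:
            raise InsufficientCredits(job_id)
        self.available -= amount
        self.holds[job_id] = amount
        result = {"job_id": job_id, "held": amount}
        self.seen[idem_key] = result
        return result

    def settle(self, idem_key: str, job_id: str, used: int) -> dict:
        """Commit actual usage and refund the unused part of the hold."""
        if idem_key in self.seen:
            return self.seen[idem_key]
        held = self.holds.pop(job_id)
        charged = min(used, held)  # assumption: never charge past the hold
        refunded = held - charged
        self.available += refunded
        result = {"job_id": job_id, "charged": charged, "refunded": refunded}
        self.seen[idem_key] = result
        return result
```

The same `settle` path serves completion, preemption, and failure; only the `used` value differs.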

3. Budgets and quotas

  • Per-user and per-project budgets, with hierarchical limits (team/org → project → user).
  • Promotional credits with separate policies and expiration.
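
A hierarchical limit check might look like the following sketch, which walks from the outermost scope to the user and reports the first scope the request would exceed. The scope naming (`"org:acme"` and so on) and the function name `first_violated_scope` are assumptions for illustration only.

```python
def first_violated_scope(limits, spent, path, amount):
    """Return the first scope in `path` whose limit `amount` would exceed.

    `path` lists scopes from outermost to innermost, for example
    ["org:acme", "project:vision", "user:ana"]. A scope with no entry in
    `limits` is treated as unlimited (assumption). Returns None if the
    request fits at every level.
    """
    for scope in path:
        limit = limits.get(scope)
        if limit is not None and spent.get(scope, 0) + amount > limit:
            return scope
    return None
```

Reporting which level rejected the request gives the caller an actionable error and gives the audit trail a precise reason.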

4. Scheduling

  • Place jobs on heterogeneous GPUs based on their requirements and available quota/credits.
  • Fairness across users/teams, with support for weights/priority classes and preemption.
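
One possible reading of these scheduling requirements is weighted fair sharing plus preference-ordered placement, sketched below. This is one of several reasonable policies, not the expected answer; the job dictionary keys, the function names, and the default weight of 1 are assumptions.

```python
def pick_next_job(queue, usage, weights):
    """Pick the queued job whose tenant has the lowest usage-to-weight ratio.

    `usage` maps tenant -> GPU-seconds consumed so far and `weights` maps
    tenant -> fair-share weight. Ties go to the earliest submission.
    """
    def rank(job):
        tenant = job["tenant"]
        return (usage.get(tenant, 0) / weights.get(tenant, 1), job["submitted_at"])

    return min(queue, key=rank)


def place(job, free_gpus):
    """Return the first preferred GPU type with enough free devices, else None."""
    for gpu_type in job["gpu_types"]:  # ordered by preference
        if free_gpus.get(gpu_type, 0) >= job["gpu_count"]:
            return gpu_type
    return None
```

A `None` result from `place` is the point at which a preemption policy would be consulted, for example evicting lower-priority work to free the requested GPUs.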

5. Audit and observability

  • An immutable audit trail for all credit and scheduling decisions.
  • Metrics, logs, and traces for SLOs and debugging.
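
An immutable audit trail is often approximated with an append-only log in which each entry commits to the previous one by hash, so later edits become detectable. The sketch below shows that idea with Python's standard `hashlib` and `json` modules; the entry layout and function names are assumptions, and a production system would also need durable storage and access controls.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)  # stable serialization
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log: list) -> bool:
    """Recompute the chain and report whether any entry was altered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```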

Non-Functional Requirements

  • APIs must be idempotent and concurrency-safe, with rate limits.
  • Protect against double-spend under races and retries.
  • State your consistency choices explicitly (strong vs. eventual) and handle clock skew.
  • Describe sharding/scaling strategies for high throughput.
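
Double-spend protection under races is commonly handled with optimistic concurrency: a debit succeeds only if the balance version the writer read is still current. The sketch below imitates that compare-and-set with a lock in a single process. The class and method names are hypothetical, and in a real deployment the check would be a conditional update in the ledger's database.

```python
import threading


class VersionedBalance:
    """In-process stand-in for a row with a balance and a version column."""

    def __init__(self, balance: int):
        self._lock = threading.Lock()
        self.balance = balance
        self.version = 0

    def read(self):
        with self._lock:
            return self.balance, self.version

    def compare_and_debit(self, expected_version: int, amount: int) -> bool:
        """Debit only if nobody else wrote since `expected_version` was read."""
        with self._lock:
            if self.version != expected_version or amount > self.balance:
                return False  # stale read or insufficient funds: caller re-reads
            self.balance -= amount
            self.version += 1
            return True
```

Two writers that read the same version cannot both succeed, so the loser must re-read the balance and decide again against fresh data.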

Deliverables

Address each of the following:

  1. Architecture overview — components and data flow.
  2. Data schemas and key data structures.
  3. API design and idempotency model.
  4. Scheduling algorithm and preemption policies.
  5. Consistency model and concurrency control — including double-spend protection and clock-skew handling.
  6. Sharding and scaling strategy.
  7. Observability plan.
  8. A test plan that exercises edge cases and surfaces unspecified requirements.
