Design a GPU-Efficient Video Service
Company: OpenAI
Role: Software Engineer
Category: ML System Design
Difficulty: medium
Interview Round: Technical Screen
Design a text-to-video generation platform similar to a modern generative video product. Treat the actual model inference on GPUs as a black box: a job enters a GPU worker and eventually produces a video.
Focus on the serving platform rather than model internals. The main requirements are:
- Users submit prompts and generation parameters, receive a job ID, and later fetch the result.
- GPU capacity is fixed or slow to scale, so the system cannot rely on instant autoscaling.
- Traffic is bursty.
- The system should maximize GPU utilization while still providing a predictable user experience.
- The design should cover queueing, scheduling, admission control, prioritization, storage, failure handling, and observability.
Explain how you would design the APIs, control plane, worker architecture, scheduling strategy, and overload behavior.
Quick Answer: This question evaluates your ability to design a GPU-constrained, production-grade ML serving platform, with emphasis on resource management, job scheduling and prioritization, admission control, durable result storage, failure handling, and observability in a distributed system.