Design a Dockerized GPU test pipeline

This is a System Design interview question from NVIDIA for Software Engineer roles.

Question

Design a Docker-Based Environment for Automated Graphics Tests on NVIDIA/AMD GPUs

Context

You need to design a reproducible, secure, and debuggable CI environment that runs automated graphics tests (e.g., Vulkan/OpenGL/EGL) in Docker on Linux hosts equipped with NVIDIA and/or AMD GPUs. The system should work headlessly and scale across CI agents.

Requirements

Describe a concrete approach covering:

Base images to use for NVIDIA and AMD, including dev vs. runtime variants.
Driver and runtime integration (e.g., NVIDIA Container Toolkit, ROCm/DRM), device exposure, and ICD/loader handling.
Headless rendering strategy (EGL/Vulkan vs. Xvfb) and test harness basics.
Image layering and caching strategy to speed builds.
Reproducibility: version pinning, driver/toolchain alignment, and environment capture.
Security: least-privilege containers, capabilities, device nodes, secrets management.
Debugging inside containers: tools, logging, profiling, core dumps, validation layers.
Handling flaky graphics tests: stabilization techniques and retry/quarantine policies.
Measuring and reducing CI runtime: metrics to track and optimizations to apply.
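
To make the retry/quarantine requirement concrete, here is a minimal sketch of one possible policy. It assumes a test is a callable returning True on pass. The function names, the three-attempt limit, the 20-run window, and the 10% flaky-rate threshold are all hypothetical choices for illustration, not part of the question.

```python
from collections import Counter
from typing import Callable


def run_with_retries(test: Callable[[], bool], max_attempts: int = 3) -> list[bool]:
    """Run the test until it passes or attempts run out; return every outcome."""
    outcomes = []
    for _ in range(max_attempts):
        passed = test()
        outcomes.append(passed)
        if passed:
            break
    return outcomes


def classify(outcomes: list[bool]) -> str:
    """Label one CI run of a test from its attempt outcomes."""
    if outcomes and outcomes[0]:
        return "pass"
    if any(outcomes):
        return "flaky"  # failed at least once, then passed on a retry
    return "fail"


def should_quarantine(history: list[str], window: int = 20,
                      max_flaky_rate: float = 0.1) -> bool:
    """Quarantine when the flaky share of the most recent runs exceeds the threshold."""
    recent = history[-window:]
    if not recent:
        return False
    return Counter(recent)["flaky"] / len(recent) > max_flaky_rate
```

Recording a retried pass as "flaky" rather than "pass" keeps the instability visible in the metrics, so a test that only passes on retry still moves toward quarantine instead of silently lengthening the run.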

Deliverables

High-level architecture (host vs. container responsibilities; per-vendor specifics).
Example Dockerfiles (builder vs. runner), run flags, and minimal CI runner configuration.
A checklist of metrics and concrete actions to reduce runtime while keeping determinism.
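
As an illustration of the run-flags deliverable, the sketch below assembles a `docker run` command per vendor. It assumes the NVIDIA Container Toolkit is installed on NVIDIA hosts and that AMD hosts expose `/dev/dri` (and `/dev/kfd`, which is only needed for ROCm compute). The helper name `docker_run_args` and the image tag are hypothetical, and the group that owns the device nodes (`video` here, `render` on some distributions) depends on the host.

```python
# Flags applied to every test container, regardless of GPU vendor.
COMMON_FLAGS = [
    "--rm",                              # discard the container after the run
    "--cap-drop=ALL",                    # start from zero Linux capabilities
    "--security-opt=no-new-privileges",  # block privilege escalation via setuid
]

# Vendor-specific device exposure (assumptions about the host are noted above).
VENDOR_FLAGS = {
    "nvidia": [
        "--gpus", "all",
        "-e", "NVIDIA_DRIVER_CAPABILITIES=graphics,utility",
    ],
    "amd": [
        "--device=/dev/dri",
        "--device=/dev/kfd",
        "--group-add", "video",
    ],
}


def docker_run_args(vendor: str, image: str, command: list[str]) -> list[str]:
    """Build the argument list for running `command` in `image` on a GPU host."""
    if vendor not in VENDOR_FLAGS:
        raise ValueError(f"unsupported GPU vendor: {vendor}")
    return ["docker", "run", *COMMON_FLAGS, *VENDOR_FLAGS[vendor], image, *command]


if __name__ == "__main__":
    # "gfx-tests:nvidia" is a placeholder image name.
    print(" ".join(docker_run_args("nvidia", "gfx-tests:nvidia", ["pytest", "-q"])))
```

Passing explicit device nodes and a supplementary group, rather than `--privileged`, is one way to meet the least-privilege requirement. For reproducibility, the image would normally be referenced by digest rather than by a mutable tag.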
