Answer the following systems/performance fundamentals questions (as in a GPU/ML infra interview). Assume a modern NVIDIA-like GPU architecture unless otherwise stated.
- **Amdahl’s law**: What is it, what does it imply about parallel speedup, and how do you use it to reason about optimizations?
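  For reference, one standard statement of the law (p is the parallelizable fraction, N the number of processors; the p = 0.95 figure below is only an illustrative number):

  ```latex
  % Speedup of a workload whose parallelizable fraction is p, run on N processors:
  S(N) = \frac{1}{(1 - p) + p/N},
  \qquad
  \lim_{N \to \infty} S(N) = \frac{1}{1 - p}
  % e.g. p = 0.95 caps the achievable speedup at 1/0.05 = 20x no matter how many
  % processors are added; shrinking the serial fraction is often the
  % higher-leverage optimization.
  ```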
- **GPU memory hierarchy**: Compare **registers**, **shared memory / SRAM**, **L1/L2 cache**, and **HBM/global memory**. What are typical latency/bandwidth trade-offs, and what code patterns map well to each level?
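  To make the levels concrete, a minimal sketch (the kernel name and the 256-thread block size are assumptions for illustration; `*out` is assumed zero-initialized) annotating which level each access touches:

  ```cuda
  __global__ void blockSum(const float* __restrict__ in, float* out, int n) {
      __shared__ float tile[256];          // shared memory: on-chip SRAM, low latency,
                                           // very high bandwidth, visible block-wide
      int gid = blockIdx.x * blockDim.x + threadIdx.x;

      float acc = 0.0f;                    // register: fastest storage, private per thread
      for (int i = gid; i < n; i += gridDim.x * blockDim.x)
          acc += in[i];                    // global memory (HBM): highest latency; these
                                           // coalesced loads are cached in L2 (often L1)
      tile[threadIdx.x] = acc;
      __syncthreads();

      // Tree reduction in shared memory: cheap block-wide data exchange.
      for (int s = blockDim.x / 2; s > 0; s >>= 1) {
          if (threadIdx.x < s) tile[threadIdx.x] += tile[threadIdx.x + s];
          __syncthreads();
      }
      if (threadIdx.x == 0) atomicAdd(out, tile[0]);  // one global write per block
  }
  // launch with 256 threads per block to match the tile size, e.g.:
  // blockSum<<<numBlocks, 256>>>(d_in, d_out, n);
  ```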
- **Threading limits**:
  - What is a **warp/wavefront**?
  - What limits the maximum number of concurrent threads (per block and per SM), and how does register and shared-memory usage affect **occupancy**?
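  One way to make these limits concrete is to ask the hardware and the runtime directly. A minimal host-side sketch, assuming the CUDA runtime API on device 0; `someKernel` is a placeholder name:

  ```cuda
  #include <cstdio>

  __global__ void someKernel(float* x) {   // stand-in kernel whose register and
      x[blockIdx.x * blockDim.x + threadIdx.x] *= 2.0f;  // shared-memory footprint
  }                                                      // the occupancy query uses

  int main() {
      cudaDeviceProp prop;
      cudaGetDeviceProperties(&prop, 0);
      std::printf("warp size %d | max threads/block %d | max threads/SM %d\n",
                  prop.warpSize, prop.maxThreadsPerBlock, prop.maxThreadsPerMultiProcessor);
      std::printf("regs/SM %d | shared mem/SM %zu bytes\n",
                  prop.regsPerMultiprocessor, prop.sharedMemPerMultiprocessor);

      // How many 256-thread blocks fit on one SM, given this kernel's
      // actual register and shared-memory usage?
      int blocksPerSM = 0;
      cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, someKernel, 256, 0);
      std::printf("occupancy at 256 threads/block: %.0f%%\n",
                  100.0f * blocksPerSM * 256 / prop.maxThreadsPerMultiProcessor);
  }
  ```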
- **Matrix multiplication (matmul)**:
  - What is the time complexity of multiplying an **m×k** matrix by a **k×n** matrix?
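    For reference, the standard counting argument:

    ```latex
    % The naive algorithm computes m*n output entries, each a length-k dot product:
    \text{time} = \Theta(m \cdot k \cdot n) \approx 2\,mkn \ \text{flops}
    % (one multiply and one add per term; asymptotically faster algorithms such
    % as Strassen's exist but are rarely used in GPU practice)
    ```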
  - How does a tiled GPU implementation work conceptually (what is “tiling/blocking” and why does it help)?
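    Conceptually, each block computes one TILE×TILE patch of the output, staging tiles of the inputs through shared memory so that every value loaded from HBM is reused TILE times. A hedged sketch (row-major float matrices; M, N, K assumed to be multiples of TILE to keep it short):

    ```cuda
    #define TILE 16

    __global__ void matmulTiled(const float* A, const float* B, float* C,
                                int M, int N, int K) {  // C (MxN) = A (MxK) * B (KxN)
        __shared__ float As[TILE][TILE];   // input tiles staged in on-chip memory
        __shared__ float Bs[TILE][TILE];

        int row = blockIdx.y * TILE + threadIdx.y;
        int col = blockIdx.x * TILE + threadIdx.x;
        float acc = 0.0f;                  // accumulate in a register

        for (int t = 0; t < K / TILE; ++t) {
            // Each thread loads one element of each tile; the loads are coalesced.
            As[threadIdx.y][threadIdx.x] = A[row * K + t * TILE + threadIdx.x];
            Bs[threadIdx.y][threadIdx.x] = B[(t * TILE + threadIdx.y) * N + col];
            __syncthreads();               // whole tile resident before use

            for (int i = 0; i < TILE; ++i)
                acc += As[threadIdx.y][i] * Bs[i][threadIdx.x];
            __syncthreads();               // done with the tile before overwriting it
        }
        C[row * N + col] = acc;
    }
    // launch: dim3 block(TILE, TILE), grid(N / TILE, M / TILE)
    ```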
- **CPU vs GPU matmul**: Why are high-performance implementations different on CPU vs GPU? Discuss SIMD, cache behavior, memory bandwidth, and parallelism.
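  For contrast, a hedged CPU-side sketch: cache blocking keeps sub-matrices resident in L1/L2, and the unit-stride inner loop is the shape compilers auto-vectorize with SIMD. The block size 64 is a placeholder (real libraries tune it per cache level and register file), and C is assumed zero-initialized:

  ```cpp
  #include <algorithm>
  #include <vector>

  constexpr int BS = 64;   // placeholder block size

  void matmulBlocked(const std::vector<float>& A, const std::vector<float>& B,
                     std::vector<float>& C, int M, int N, int K) {
      for (int i0 = 0; i0 < M; i0 += BS)
          for (int k0 = 0; k0 < K; k0 += BS)
              for (int j0 = 0; j0 < N; j0 += BS)          // loop over cache-sized blocks
                  for (int i = i0; i < std::min(i0 + BS, M); ++i)
                      for (int k = k0; k < std::min(k0 + BS, K); ++k) {
                          float a = A[i * K + k];          // scalar kept in a register
                          for (int j = j0; j < std::min(j0 + BS, N); ++j)
                              C[i * N + j] += a * B[k * N + j];  // unit stride: SIMD-friendly
                      }
  }
  ```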
- **C++ fundamentals**:
  - What is a **virtual function**, and what runtime cost does it introduce?
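    A minimal sketch (the type names are made up for illustration) of what virtual dispatch costs: a vtable-pointer load, a function-pointer load, and an indirect call, which usually also blocks inlining unless the compiler can devirtualize:

    ```cpp
    #include <cstdio>

    struct Shape {
        virtual ~Shape() = default;
        virtual float area() const = 0;   // dispatched through the object's vtable
    };

    struct Circle : Shape {
        float r;
        explicit Circle(float r) : r(r) {}
        float area() const override { return 3.14159f * r * r; }
    };

    void printArea(const Shape& s) {
        // Indirect call: load the vptr, load the slot, branch. The optimizer can
        // inline this only if it can prove the dynamic type (devirtualization).
        std::printf("%f\n", s.area());
    }
    ```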
  - What does **inline** mean in C++? When is inlining likely, unsafe, or unhelpful?
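    A short sketch of the distinction worth drawing in an answer: the `inline` keyword is primarily a linkage/ODR promise, while actual inlining is an optimizer decision:

    ```cpp
    // `inline` lets this definition appear in every translation unit that
    // includes the header, without a multiple-definition link error.
    inline int square(int x) { return x * x; }   // small and visible: likely inlined

    int sumOfSquares(int a, int b) {
        // The optimizer inlines on its own heuristics. Large bodies, recursion,
        // calls through function pointers, and virtual calls with an unknown
        // dynamic type are the usual cases where inlining is blocked or
        // counterproductive (code bloat, instruction-cache pressure).
        return square(a) + square(b);
    }
    ```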