C++ Fundamentals: Memory, Compilation, and Containers
Company: Hudson
Role: Software Engineer
Category: Software Engineering Fundamentals
Difficulty: hard
Interview Round: Technical Screen
## C++ Fundamentals: Memory, Compilation, and Containers
This is a rapid-fire C++ fundamentals interview, of the kind used at low-latency / high-frequency trading firms to probe how deeply a candidate understands what the language and the machine are actually doing underneath their code. You will be asked about the `inline` keyword, dynamic allocation with `new`, the trade-offs between templates and inheritance, the differences between an ordered map and a hash map, and what a segmentation fault really is. Throughout, the interviewer cares far more about *mechanism* and *cost* than about textbook definitions: be ready to talk about the call stack, code size, cache behavior, the heap allocator, system calls, and undefined behavior.
### Constraints & Assumptions
- Target language is modern C++ (C++17 or later) compiled ahead-of-time with an optimizing compiler (e.g. GCC/Clang at `-O2`) for a 64-bit Linux/x86-64 host.
- "Map" refers to `std::map` and "hash map" to `std::unordered_map` from the C++ standard library.
- Performance discussion is in the context of a single-threaded, latency-sensitive process; rough orders of magnitude matter more than exact cycle counts, but you should know the *relative* costs.
- You may assume a standard glibc `malloc` (ptmalloc2) when discussing the heap, while noting that allocator behavior is implementation-defined.
### Clarifying Questions to Ask
- Is the target an optimized release build, or an unoptimized debug build? (`inline` semantics and whether functions are actually inlined differ greatly between the two.)
- Which C++ standard and which compiler/platform should I assume? (Allocator internals, ABI, and library guarantees vary.)
- When you say "performance," do you mean latency of a single call, total throughput, instruction-cache pressure, or memory footprint? They pull in different directions.
- For the container questions, should I assume the default allocator and default hash, or are custom allocators / custom hash functions in play?
- Are we optimizing for the common path, or do you also care about worst-case / tail latency (which matters a lot in trading systems)?
### Part 1 — The `inline` keyword
Explain what the `inline` keyword means in C++. The interviewer will push on three things in turn: (a) what `inline` actually guarantees versus what it merely hints; (b) you mentioned the call stack and a *bigger binary size* — walk through exactly why inlining affects both; and (c) the performance trade-off — is inlining always faster, or can a *non-inlined* call sometimes give better latency?
```hint What inline really guarantees
Separate two things that are easy to conflate: the language-level meaning of `inline` (a linkage/ODR rule about allowing multiple identical definitions across translation units) versus the optimization of *inline expansion* (substituting the function body at the call site). The compiler decides the latter largely on its own.
```
```hint Why it can be slower
Inlining duplicates the function body at every call site. Think about what that does to the size of the emitted machine code, and what a larger hot code footprint does to the instruction cache (I-cache) and the CPU front-end.
```
#### What This Part Should Cover
```premium-lock What This Part Should Cover
```
### Part 2 — Dynamic allocation: `new`, `malloc`, the OS, and heap vs. stack
What does the `new` keyword do? The interviewer drills down: describe `malloc` in detail; does the operating system get involved in this process; how would you *avoid* involving the OS on the hot path; and you claimed the heap is slower than the stack — quantify that, roughly.
```hint Two steps, not one
`new` is not a single primitive. Separate the *allocation* of raw storage (which is where `malloc` / `operator new` lives) from the *construction* (running the object's constructor in that storage). Free uses the mirror image: destructor, then deallocation.
```
```hint Where the OS does and does not appear
A user-space allocator like glibc `malloc` does not call into the kernel on every allocation. It grabs large regions from the OS occasionally (via `brk`/`sbrk` for the main arena or `mmap` for big requests) and then hands out small pieces from those regions itself. Think about which allocations are "fast path" (no syscall) versus which force a syscall.
```
```hint Heap-vs-stack cost and how to avoid the allocator
Stack allocation is essentially adjusting a register (the stack pointer) — a handful of cycles, and the memory is almost always hot in cache. A heap allocation runs allocator bookkeeping (free-list/bin search, possible locking, possible syscall). To avoid touching the allocator on the hot path, think about reusing memory you already own.
```
#### Clarifying Questions for this Part
- Should I describe the *language* contract of `new` (allocate + construct, throwing on failure) or also dig into a specific allocator implementation such as glibc ptmalloc?
- Is this a multi-threaded process? (Allocator lock contention and per-thread arenas change the cost story significantly.)
#### What This Part Should Cover
```premium-lock What This Part Should Cover
```
### Part 3 — Templates vs. inheritance
Compare templates and inheritance as mechanisms for writing reusable / polymorphic code. What are the trade-offs of each, and when would you reach for one over the other?
```hint Two kinds of polymorphism
Frame it as compile-time (static) polymorphism versus run-time (dynamic) polymorphism. Templates resolve types at compile time and dispatch is statically known; inheritance with `virtual` resolves the call target at run time through a vtable pointer.
```
#### What This Part Should Cover
```premium-lock What This Part Should Cover
```
### Part 4 — `std::map` vs. `std::unordered_map`
Compare `std::map` and `std::unordered_map`. The interviewer follows up on (a) the worst case for each, and (b) concretely, how does a hash map find the right bucket for a key?
```hint Different data structures underneath
These are not two flavors of the same thing. One is a balanced binary search tree (ordered, comparison-based); the other is a hash table (unordered, hash-based). That difference drives ordering, complexity, iteration order, and memory layout.
```
```hint Worst case for the hash map
Average $O(1)$ lookup assumes keys spread across buckets. Ask what happens when many keys collide into the same bucket — what does a lookup degrade to, and what real situations (bad hash, adversarial input) cause it?
```
#### What This Part Should Cover
```premium-lock What This Part Should Cover
```
### Part 5 — Segmentation faults (and a `nullptr` reference)
Explain what a segmentation fault is. Then a pointed follow-up: suppose you have a *reference* and you try to make it refer to a dereferenced null pointer — can you walk through what's going on there?
```hint What a segfault actually is
A segfault is the hardware + OS catching an invalid memory access: the CPU's MMU raises a fault when a process touches a virtual page it has no right to (unmapped, or wrong permission — e.g. writing read-only). The kernel delivers `SIGSEGV`. Tie it to *how* you'd produce one in C++.
```
```hint The reference-to-null trap
Binding a reference such as `int& r = *p;` where `p` is null does **not** copy anything — it's just an alias. The dereference of a null pointer is undefined behavior; the crash, if any, happens when the reference is actually *used*, not necessarily at the binding. Distinguish "what the standard says" (UB) from "what typically happens" (segfault on access).
```
#### What This Part Should Cover
```premium-lock What This Part Should Cover
```
### What a Strong Answer Covers
```premium-lock What a Strong Answer Covers
```
### Follow-up Questions
- For Part 1: when would you use `__attribute__((always_inline))` / `[[gnu::always_inline]]` or, conversely, `[[gnu::noinline]]`, and how would you *measure* whether inlining actually helped?
- For Part 2: how do per-thread arenas and lock contention in a multi-threaded `malloc` change your "avoid the allocator on the hot path" advice? Sketch how a simple fixed-size object pool works.
- For Part 4: how does `std::unordered_map` protect (or fail to protect) against hash collisions, and how would you mitigate an adversary who can choose keys to force worst-case behavior?
- For Part 5: how do tools like AddressSanitizer or a guard-page allocator turn *silent* undefined behavior (which may not crash) into a deterministic, debuggable fault?
Quick Answer: This question tests deep knowledge of C++ internals, covering memory management, compilation mechanics, and standard container trade-offs. It evaluates practical understanding of how language constructs map to machine behavior — including heap allocation, inlining, template vs. inheritance design, and undefined behavior — skills central to software engineering roles in performance-sensitive environments.