Trees and Binary Search Trees: Taxonomy, Invariant, and Traversal/Search
Company: Sig
Role: Software Engineer
Category: Software Engineering Fundamentals
Difficulty: medium
Interview Round: Onsite
This is a fundamentals discussion with a small coding component. Walk an interviewer through trees as a data-structure family, then zoom into binary search trees:
1. **Survey the common kinds of trees** and, for each, give a realistic use case (where it actually shows up in systems).
2. **Define a binary search tree (BST) precisely** — state its ordering invariant exactly, not just "left is smaller."
3. **Implement two operations on a BST**: (a) an in-order traversal that prints keys in sorted order, and (b) a search that returns whether a key is present. State the time and space complexity of each, and how complexity depends on the tree's shape.
```hint Organize the zoo
Group trees by their defining property rather than listing them randomly: general vs. binary by branching factor; search-ordered (BST, AVL, red-black); heap-ordered (binary heap); multi-way on-disk (B-tree / B+-tree); and prefix/string (trie). Anchor each to one concrete use.
```
```hint State the BST invariant exactly
The invariant is recursive and applies to whole subtrees, not just immediate children: for every node `x`, every key in `x`'s left subtree is `< x.key` and every key in `x`'s right subtree is `> x.key`. That single property is what makes search and ordered iteration work in `O(height)` / sorted order.
```
```hint Traversal and search
In-order = visit left subtree, then the node, then the right subtree; for a BST that emits keys ascending. Search compares the target to the current node and walks left or right. A full traversal is `O(n)`; a search costs `O(height)`.
```
### Constraints & Assumptions
- Focus on conceptual clarity plus correct, idiomatic code (C++ or clear pseudocode is fine).
- Treat the BST as unbalanced unless balancing is explicitly requested, but be ready to discuss balancing as a tradeoff.
- Keys are comparable and, unless stated otherwise, unique.
### Clarifying Questions to Ask
- Should I assume the BST is balanced, or do you want me to address the worst-case skew (degenerate-to-a-list) and how balancing fixes it?
- Are duplicate keys allowed? If so, which subtree convention should they follow (e.g. duplicates go right)?
- For the traversal, do you want recursion, or an iterative/`O(1)`-extra-space version (explicit stack / Morris traversal)?
- Is the tree read-only for this question, or are insert/delete also in scope?
### What a Strong Answer Covers
```premium-lock What a Strong Answer Covers
```
### Follow-up Questions
- How do AVL trees and red-black trees keep operations `O(log n)`, and what is the cost/benefit difference between them (rotation frequency vs. lookup speed)?
- Why do databases and filesystems favor B-trees / B+-trees over binary BSTs for on-disk indexes?
- How would you verify in `O(n)` that an arbitrary binary tree is a valid BST? (What's the subtle bug in checking only parent-vs-child?)
- When is a hash table a strictly better choice than a BST, and when is the BST's ordering property essential (so a hash table cannot replace it)?
Quick Answer: This question evaluates a candidate's grasp of tree data structures and the binary search tree ordering invariant, along with the ability to implement traversal and search operations. It is a common software engineering fundamentals topic used to check whether someone can precisely define an invariant, write correct code, and reason about time and space complexity relative to tree shape.