Implement IPv4 iterators and CIDR expansion
Company: OpenAI
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
## Problem: IPv4 Iterators and CIDR Expansion
You are implementing utilities to iterate over IPv4 addresses.
An IPv4 address is in dotted-decimal form: `A.B.C.D`, where each octet is an integer in `[0, 255]`. Conceptually, an IP can be treated as an unsigned 32-bit integer:
\[
(A \ll 24) + (B \ll 16) + (C \ll 8) + D
\]
### Part 1 — Forward Iterator
Implement a Python class `IPv4Iterator(start_ip: str)` that iterates forward from `start_ip` (inclusive) up to `255.255.255.255` (inclusive).
- Must implement the Python iterator protocol: `__init__`, `__iter__`, `__next__`.
- Each `__next__()` returns the next IP as a dotted string.
- When the iteration is finished, raise `StopIteration`.
Example:
- Input: `start_ip = "192.168.0.1"`
- Output sequence: `"192.168.0.1"`, `"192.168.0.2"`, ... , `"255.255.255.255"`
### Part 2 — Reverse Iterator
Implement a reverse iterator `IPv4ReverseIterator(end_ip: str)` that iterates backward from `end_ip` (inclusive) down to `0.0.0.0` (inclusive).
Example:
- Input: `end_ip = "192.168.0.255"`
- Output sequence: `"192.168.0.255"`, `"192.168.0.254"`, ... , `"0.0.0.0"`
### Part 3 — CIDR Range Expansion
Given a CIDR string like `"192.168.1.0/24"`, compute the **start IP** and **end IP** of the CIDR range using bitwise operations, then iterate through and return all IPs in that range (inclusive).
CIDR format: `base_ip/prefix_len`
- `base_ip` is a dotted IPv4 string.
- `prefix_len` is an integer `n` in `[0, 32]` meaning the first `n` bits are fixed (network bits) and the remaining `32-n` bits vary (host bits).
Requirements:
- Use bitwise masking to compute:
- `start_ip` (network address)
- `end_ip` (broadcast/highest address)
- Then iterate all addresses from `start_ip` to `end_ip` inclusive.
Example:
- Input: `"192.168.1.0/24"`
- Start: `"192.168.1.0"`
- End: `"192.168.1.255"`
### Part 4 — Optimization Discussion
Discuss how you would optimize time and space complexity for the above tasks, especially for large CIDR ranges (e.g., `/8` or `/0`).
### Assumptions / Clarifications
- You may assume inputs are valid IPv4/CIDR strings unless you choose to add validation.
- Clearly define behavior at boundaries (e.g., starting at `255.255.255.255`, ending at `0.0.0.0`).
Quick Answer: English summary: This question evaluates proficiency with IP addressing, bitwise operations, and iterator implementation in code, testing competencies in binary arithmetic, CIDR range calculation, and memory- and time-aware iteration.
Part 1: Forward IPv4 Iterator Slice
Generate the first `n` IPv4 addresses produced by iterating **forward** from a starting address, treating each address as a 32-bit counter that increments by one.
## Task
Implement:
```
solution(start_ip, n)
```
- **`start_ip`** — a string holding a valid IPv4 address in dotted-decimal form (four octets `0`–`255` separated by `.`, e.g. `"192.168.0.1"`).
- **`n`** — an integer count of addresses to produce.
Return a **list of strings**: the address sequence beginning at `start_ip` (inclusive) and stepping forward one address at a time.
## How forward iteration works
Interpret the four octets as a single 32-bit number (`a·2²⁴ + b·2¹⁶ + c·2⁸ + d`). The **next** address is that number plus one, decoded back into dotted-decimal. This means an octet rolls over into the next when it passes `255`:
- `192.168.0.255` → `192.168.1.0`
So the returned list is `start_ip`, `start_ip + 1`, `start_ip + 2`, …, up to `n` addresses total.
## Rules
- **Inclusive start:** the first element of the result is always `start_ip` (when `n` produces at least one address).
- **Stop at the ceiling:** `255.255.255.255` is the largest valid address. If the forward sequence reaches it before `n` addresses have been produced, **stop there** — never produce an address greater than `255.255.255.255`. The result may therefore contain **fewer than `n`** entries.
- **Non-positive `n`:** if `n <= 0` (in particular `n == 0`), return an empty list `[]`.
## Examples
| `start_ip` | `n` | Output |
|---|---|---|
| `"192.168.0.1"` | `3` | `["192.168.0.1", "192.168.0.2", "192.168.0.3"]` |
| `"192.168.0.255"` | `3` | `["192.168.0.255", "192.168.1.0", "192.168.1.1"]` |
| `"255.255.255.255"` | `3` | `["255.255.255.255"]` |
| `"0.0.0.0"` | `0` | `[]` |
## Constraints
- `start_ip` is a valid IPv4 address.
- `0 <= n <= 100000`
- Do not produce any address greater than `255.255.255.255`.
Constraints
- start_ip is a valid IPv4 address.
- 0 <= n <= 100000
- Do not produce any address greater than 255.255.255.255.
Examples
Input: ('192.168.0.1', 3)
Expected Output: ['192.168.0.1', '192.168.0.2', '192.168.0.3']
Explanation: A basic forward walk from a normal starting address.
Input: ('192.168.0.255', 3)
Expected Output: ['192.168.0.255', '192.168.1.0', '192.168.1.1']
Explanation: This crosses an octet boundary.
Input: ('255.255.255.255', 3)
Expected Output: ['255.255.255.255']
Explanation: Edge case: the iterator starts at the final IPv4 address.
Input: ('0.0.0.0', 0)
Expected Output: []
Explanation: Edge case: requesting zero values should return an empty list.
Hints
- Convert the IPv4 string into a 32-bit integer so incrementing becomes simple.
- The largest possible IPv4 value is 2^32 - 1.
Part 2: Reverse IPv4 Iterator Slice
Return the first **n** IPv4 addresses produced by iterating *backward* from a given ending address.
An IPv4 address can be treated as a single unsigned 32-bit integer formed from its four octets (`a.b.c.d` → `(a << 24) | (b << 16) | (c << 8) | d`). Iterating backward means repeatedly moving to the **previous** address, i.e. decrementing that integer by one. Borrows cross octet boundaries automatically — for example, the address before `10.0.0.0` is `9.255.255.255`.
## What to implement
```python
def solution(end_ip, n):
...
```
Given:
- **`end_ip`** — a string holding a valid IPv4 address (the address to start from).
- **`n`** — an integer count of addresses to produce.
Return a **list of IPv4 address strings**, starting with `end_ip` itself and continuing downward, one address lower each step.
## Rules
- The result is **inclusive of `end_ip`**: the first element of the list is `end_ip`.
- Each subsequent element is the next-lower IPv4 address (the 32-bit integer value minus one).
- Produce up to **`n`** addresses in total.
- **Floor at `0.0.0.0`:** never go below `0.0.0.0`. If the descending sequence reaches `0.0.0.0` before `n` addresses have been produced, stop there. In that case the returned list contains fewer than `n` elements (it ends with `0.0.0.0`).
- If **`n <= 0`**, return an empty list.
## Examples
- `solution("192.168.0.255", 3)` → `["192.168.0.255", "192.168.0.254", "192.168.0.253"]`
- `solution("10.0.0.0", 2)` → `["10.0.0.0", "9.255.255.255"]` (borrow across octets)
- `solution("1.0.0.0", 3)` → `["1.0.0.0", "0.255.255.255", "0.255.255.254"]`
- `solution("0.0.0.0", 5)` → `["0.0.0.0"]` (already at the floor, so only one address is returned)
## Constraints
- `end_ip` is a valid IPv4 address.
- `0 <= n <= 100000`
- Do not produce any address smaller than `0.0.0.0`.
Constraints
- end_ip is a valid IPv4 address.
- 0 <= n <= 100000
- Do not produce any address smaller than 0.0.0.0.
Examples
Input: ('192.168.0.255', 3)
Expected Output: ['192.168.0.255', '192.168.0.254', '192.168.0.253']
Explanation: A normal reverse walk.
Input: ('10.0.0.0', 2)
Expected Output: ['10.0.0.0', '9.255.255.255']
Explanation: This crosses an octet boundary while moving backward.
Input: ('0.0.0.0', 5)
Expected Output: ['0.0.0.0']
Explanation: Edge case: the iterator starts at the minimum IPv4 address.
Input: ('1.0.0.0', 3)
Expected Output: ['1.0.0.0', '0.255.255.255', '0.255.255.254']
Explanation: This crosses the highest octet boundary.
Hints
- Treat the IPv4 address as an unsigned 32-bit integer and subtract 1 at each step.
- The smallest possible IPv4 value is 0.
Part 3: Expand a CIDR Block
Expand a **CIDR block** into the full list of IPv4 addresses it contains.
You are given a single string `cidr` in the form `base_ip/prefix_len` (for example, `"192.168.1.0/30"`). Treat each IPv4 address as a **32-bit unsigned integer** and use **bitwise masking** to compute the block's boundaries, then enumerate every address in the inclusive range.
## What to implement
```
def solution(cidr):
...
```
Given the CIDR string, do the following:
1. **Parse** `cidr` into the base IP and the integer prefix length.
2. **Compute the network start address** by clearing the host bits of the base IP (mask off the lowest `32 - prefix_len` bits). The base IP may **not** already be aligned to the network boundary — if it isn't, snap it *down* to the network address.
3. **Compute the highest address** in the block by setting all of those host bits to `1`.
4. **Enumerate** every IPv4 address from the start address to the highest address, **inclusive**, in ascending order.
## Input
- `cidr` — a string `"A.B.C.D/prefix_len"`, where `A.B.C.D` is a dotted-quad IPv4 address and `prefix_len` is an integer.
## Output
Return a **tuple of three values** `(start_ip, end_ip, all_addresses)`:
- **`start_ip`** — the network start address, as a dotted-quad string.
- **`end_ip`** — the highest address in the block, as a dotted-quad string.
- **`all_addresses`** — a **list** of every IPv4 address in the inclusive range `[start_ip, end_ip]`, each as a dotted-quad string, in ascending order.
All host addresses are included — do **not** exclude the network or broadcast addresses.
## Examples
| Input | Output |
|-------|--------|
| `"192.168.1.0/30"` | `("192.168.1.0", "192.168.1.3", ["192.168.1.0", "192.168.1.1", "192.168.1.2", "192.168.1.3"])` |
| `"172.16.5.9/29"` | `("172.16.5.8", "172.16.5.15", ["172.16.5.8", "172.16.5.9", ..., "172.16.5.15"])` |
| `"10.0.0.5/32"` | `("10.0.0.5", "10.0.0.5", ["10.0.0.5"])` |
| `"0.0.0.0/31"` | `("0.0.0.0", "0.0.0.1", ["0.0.0.0", "0.0.0.1"])` |
Note in the second example that the base IP `172.16.5.9` is not aligned to the `/29` boundary, so the start address snaps down to `172.16.5.8`.
## Constraints
- `cidr` is valid and has a prefix length in the range **0 to 32**.
- For this problem, the expanded range size will be at most **4096 addresses**.
- Use **bitwise masking** to compute the network and end addresses.
Constraints
- cidr is valid and has a prefix length in the range 0 to 32.
- For this problem, the expanded range size will be at most 4096 addresses.
- Use bitwise masking to compute the network and end addresses.
Examples
Input: '192.168.1.0/30'
Expected Output: ('192.168.1.0', '192.168.1.3', ['192.168.1.0', '192.168.1.1', '192.168.1.2', '192.168.1.3'])
Explanation: A /30 block contains 4 addresses.
Input: '172.16.5.9/29'
Expected Output: ('172.16.5.8', '172.16.5.15', ['172.16.5.8', '172.16.5.9', '172.16.5.10', '172.16.5.11', '172.16.5.12', '172.16.5.13', '172.16.5.14', '172.16.5.15'])
Explanation: The input IP is not aligned; masking moves the start to 172.16.5.8.
Input: '10.0.0.5/32'
Expected Output: ('10.0.0.5', '10.0.0.5', ['10.0.0.5'])
Explanation: Edge case: a /32 contains exactly one address.
Input: '0.0.0.0/31'
Expected Output: ('0.0.0.0', '0.0.0.1', ['0.0.0.0', '0.0.0.1'])
Explanation: Edge case: a /31 contains two addresses.
Hints
- To get the network address, keep the first prefix_len bits and zero out the rest.
- To get the highest address in the CIDR block, set all host bits to 1.
Part 4: Optimized CIDR Summary Without Full Expansion
Summarize a CIDR block without expanding every address it contains.
Some CIDR blocks (for example `/8` or `/0`) cover billions of addresses, so materializing the whole range is not an option. Your task is to produce a **compact summary** of the block using only its endpoints and a small sample of addresses from each end.
## Function
```
solution(cidr, k)
```
- **`cidr`** — a string in `"A.B.C.D/prefix"` form (a dotted-quad IPv4 address followed by a prefix length), e.g. `"192.168.1.0/30"`.
- **`k`** — an integer controlling how many addresses to sample from each end (`0 <= k <= 1000`).
## What to compute
A CIDR block of prefix length `p` always describes a **contiguous** range of `2^(32 - p)` IPv4 addresses. Derive the following:
1. **count** — the total number of addresses in the block.
2. **start IP** — the network (lowest) address. **Clear the host bits** of the given IP first: any host bits set in the input are masked off, so `"1.2.3.4/24"` has start `"1.2.3.0"`.
3. **end IP** — the highest address in the block.
4. **first k addresses** — the first `k` addresses, in ascending order starting from the start IP.
5. **last k addresses** — the last `k` addresses, in ascending order ending at the end IP.
## Return value
Return a tuple of five values, in this exact order:
```
(count, start_ip, end_ip, first_k, last_k)
```
- **`count`** is an integer.
- **`start_ip`** and **`end_ip`** are dotted-quad strings (e.g. `"192.168.1.0"`).
- **`first_k`** and **`last_k`** are lists of dotted-quad strings.
## Rules and edge cases
- **`k = 0`:** both `first_k` and `last_k` are **empty lists** `[]`. (More generally, any `k <= 0` yields two empty lists.)
- **Block no larger than `k`:** if the block contains at most `k` addresses, then `first_k` and `last_k` are **identical** — each is the **entire block** (every address, in ascending order).
- A `/32` block contains a single address, so its start and end IPs are equal and each sample list (for `k >= 1`) holds just that one address.
- A `/0` block contains all `4294967296` IPv4 addresses; handle it with the same arithmetic — do **not** enumerate the full range.
## Examples
| `cidr` | `k` | Result |
|--------|-----|--------|
| `"192.168.1.0/30"` | `2` | `(4, "192.168.1.0", "192.168.1.3", ["192.168.1.0", "192.168.1.1"], ["192.168.1.2", "192.168.1.3"])` |
| `"10.0.0.5/32"` | `3` | `(1, "10.0.0.5", "10.0.0.5", ["10.0.0.5"], ["10.0.0.5"])` |
| `"0.0.0.0/0"` | `1` | `(4294967296, "0.0.0.0", "255.255.255.255", ["0.0.0.0"], ["255.255.255.255"])` |
| `"1.2.3.4/24"` | `0` | `(256, "1.2.3.0", "1.2.3.255", [], [])` |
## Constraints
- `cidr` is valid and has a prefix length in the range `0` to `32`.
- `0 <= k <= 1000`.
- The block may be as large as `/0`, so expanding the full range is **not allowed** — generate at most about `2k` addresses regardless of block size.
Constraints
- cidr is valid and has a prefix length in the range 0 to 32.
- 0 <= k <= 1000
- The CIDR block may be as large as /0, so expanding the full range is not allowed.
Examples
Input: ('192.168.1.0/30', 2)
Expected Output: (4, '192.168.1.0', '192.168.1.3', ['192.168.1.0', '192.168.1.1'], ['192.168.1.2', '192.168.1.3'])
Explanation: A small block where only part of the range is sampled from each side.
Input: ('10.0.0.5/32', 3)
Expected Output: (1, '10.0.0.5', '10.0.0.5', ['10.0.0.5'], ['10.0.0.5'])
Explanation: Edge case: the block has only one address, so both samples contain the full block.
Input: ('0.0.0.0/0', 1)
Expected Output: (4294967296, '0.0.0.0', '255.255.255.255', ['0.0.0.0'], ['255.255.255.255'])
Explanation: Huge range: the answer must be produced without expanding all addresses.
Input: ('1.2.3.4/24', 0)
Expected Output: (256, '1.2.3.0', '1.2.3.255', [], [])
Explanation: Edge case: when k is zero, the sample lists are empty.
Hints
- A CIDR block with prefix p contains 2^(32 - p) addresses.
- You can compute the first k and last k addresses directly from the integer start and end values.