How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Onsite rounds at Jane Street.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Jane Street during technical interviews.

Merge Two Insertion Diffs in Linear Time | Jane Street Coding Question

Merge Two Insertion Diffs in Linear Time

Company: Jane Street

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

A text-editing system represents changes to a document as **insertion diffs**. An insertion diff is a set of `(position, value)` pairs, where each pair means: "after this diff is applied, the element `value` sits at index `position` of the resulting document." All elements of the pre-diff document keep their relative order and fill the remaining indices from left to right. Formally, applying a diff `D` to a sequence `S` produces a sequence `T` of length `len(S) + len(D)`: - For every pair `(p, v)` in `D`, `T[p] = v`. - The elements of `S`, in their original order, occupy the remaining indices of `T` in increasing index order. For example, applying `{0: "A", 3: "B"}` to `["z", "z", "z"]` yields `["A", "z", "z", "B", "z"]`. Now suppose two diffs are applied one after the other: 1. `diff_one` is applied to the original document. 2. `diff_two` is then applied to the result of step 1 — so its positions refer to indices of the **final** document. Continuing the example with the original document `["z", "z", "z"]`: - `diff_one = {0: "A", 3: "B"}` produces `["A", "z", "z", "B", "z"]`. - `diff_two = {0: "C", 4: "D"}` then produces `["C", "A", "z", "z", "D", "B", "z"]`. Write a function `merge_diffs(diff_one, diff_two)` that returns a **single** diff equivalent to applying `diff_one` followed by `diff_two`. In the example above, the merged diff is `{0: "C", 1: "A", 4: "D", 5: "B"}`: applying it directly to `["z", "z", "z"]` yields `["C", "A", "z", "z", "D", "B", "z"]` in one step. The document itself is never given and may be enormous, so you cannot materialize it. Your function must run in $O(n + m)$ time, where $n$ is the number of insertions in `diff_one` and $m$ is the number of insertions in `diff_two` — not in time proportional to the document length or to the largest position value. **Input format** - Each diff is given as a list of `(position, value)` pairs sorted by increasing `position` (equivalently, a mapping whose keys iterate in increasing order). - Positions within a single diff are distinct non-negative integers. - Values are non-empty strings and should be treated as opaque. - You may assume both diffs are valid for the (unseen) document they are applied to: every position is a valid index of the corresponding post-application document. **Output format** Return the merged diff as a list of `(position, value)` pairs sorted by increasing `position`. Its size is exactly `n + m`. **Examples** Example 1: - Input: `diff_one = [(0, "A"), (3, "B")]`, `diff_two = [(0, "C"), (4, "D")]` - Output: `[(0, "C"), (1, "A"), (4, "D"), (5, "B")]` Example 2 (position collision between the two diffs): - Input: `diff_one = [(1, "P")]`, `diff_two = [(1, "Q")]` - Output: `[(1, "Q"), (2, "P")]` - Explanation: on an original document `["a", "b"]`, `diff_one` produces `["a", "P", "b"]`, and `diff_two` then produces `["a", "Q", "P", "b"]`. `diff_two`'s position is absolute in the final document, so `"Q"` occupies index 1 and `"P"` is pushed to index 2. Example 3 (one empty diff): - Input: `diff_one = []`, `diff_two = [(2, "X")]` - Output: `[(2, "X")]` **Constraints** - $0 \le n, m \le 2 \cdot 10^5$ - $0 \le \text{position} \le 10^9$ - Required time complexity: $O(n + m)$

Quick Answer: This question evaluates algorithm design and sequence-manipulation skills, focusing on position-mapped insertions, order-preserving merging, and attention to time/space complexity.

A text-editing system represents changes to a document as **insertion diffs**. An insertion diff is a set of `(position, value)` pairs, where each pair means: after this diff is applied, the element `value` sits at index `position` of the resulting document. All elements of the pre-diff document keep their relative order and fill the remaining indices from left to right. Formally, applying a diff `D` to a sequence `S` produces a sequence `T` of length `len(S) + len(D)`: - For every pair `(p, v)` in `D`, `T[p] = v`. - The elements of `S`, in their original order, occupy the remaining indices of `T` in increasing index order. For example, applying `{0: "A", 3: "B"}` to `["z", "z", "z"]` yields `["A", "z", "z", "B", "z"]`. Now suppose two diffs are applied one after the other: 1. `diff_one` is applied to the original document. 2. `diff_two` is then applied to the result of step 1 — so its positions refer to indices of the **final** document. Continuing with the original document `["z", "z", "z"]`: `diff_one = {0: "A", 3: "B"}` produces `["A", "z", "z", "B", "z"]`, and `diff_two = {0: "C", 4: "D"}` then produces `["C", "A", "z", "z", "D", "B", "z"]`. Write `merge_diffs(diff_one, diff_two)` returning a **single** diff equivalent to applying `diff_one` followed by `diff_two`. For the example, the merged diff is `[(0, "C"), (1, "A"), (4, "D"), (5, "B")]`: applying it directly to `["z", "z", "z"]` yields the same final document in one step. The document itself is never given and may be enormous, so you cannot materialize it. Your function must run in **O(n + m)** time, where `n = len(diff_one)` and `m = len(diff_two)` — not proportional to the document length or to the largest position value. **Input**: each diff is a list of `(position, value)` pairs sorted by increasing `position`; positions within one diff are distinct non-negative integers; values are opaque non-empty strings. Both diffs are valid for their (unseen) document. **Output**: the merged diff as a list of `(position, value)` pairs sorted by increasing `position`; its size is exactly `n + m`.

Constraints

0 <= n, m <= 2 * 10^5 (n = len(diff_one), m = len(diff_two))
0 <= position <= 10^9
Positions within a single diff are distinct; each diff is sorted by increasing position
Values are opaque non-empty strings
Required time complexity: O(n + m) — you may not materialize the document

Examples

Input: ([(0, 'A'), (3, 'B')], [(0, 'C'), (4, 'D')])

Expected Output: [(0, 'C'), (1, 'A'), (4, 'D'), (5, 'B')]

Explanation: The worked example. C takes final index 0, pushing A (was at T1 index 0) to 1; D takes index 4; B (was at T1 index 3) lands at 3 + 2 = 5.

Input: ([(1, 'P')], [(1, 'Q')])

Expected Output: [(1, 'Q'), (2, 'P')]

Explanation: Position collision. On doc ['a','b'], diff_one gives ['a','P','b']; diff_two's index 1 is absolute, so Q takes index 1 and P is pushed to index 2 (1 + d, d=1).

Input: ([], [(2, 'X')])

Expected Output: [(2, 'X')]

Explanation: diff_one is empty, so the merged diff is just diff_two unchanged.

Input: ([], [])

Expected Output: []

Explanation: Both diffs empty — nothing to insert.

Input: ([(0, 'A'), (3, 'B')], [])

Expected Output: [(0, 'A'), (3, 'B')]

Explanation: diff_two is empty (d stays 0), so every diff_one entry keeps its original position.

Input: ([(1, 'A'), (4, 'B')], [(0, 'C'), (3, 'D'), (7, 'E')])

Expected Output: [(0, 'C'), (2, 'A'), (3, 'D'), (6, 'B'), (7, 'E')]

Explanation: Ground-truth check on a len-4 original document. Intermediate T1 = [z,A,z,z,B,z]; applying diff_two yields final [C,z,A,D,z,z,B,E,z]. So A moves from T1 index 1 to 1+1=2, B moves from T1 index 4 to 4+2=6, and C,D,E sit at 0,3,7.

Hints

Do not build the document. Its length is unknown and can be far larger than n + m.
Think about where each diff_one insertion ends up in the FINAL document. The diff_two entries are the only things that shift it, and each shift is exactly +1.
Track d = how many diff_two entries have been emitted so far. A diff_one entry originally at T1 index p lands at final index p + d.
Merge the two sorted streams: at each step compare the diff_two position q with p + d. If q <= p + d, emit the diff_two entry (and increment d); otherwise emit the diff_one entry at p + d. Handle ties (q == p + d) in favor of diff_two — its slot is reserved.

Quick Overview

This question evaluates algorithm design and sequence-manipulation skills, focusing on position-mapped insertions, order-preserving merging, and attention to time/space complexity.