How do I approach Coding & Algorithms interview questions?

Coding & Algorithms questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master coding & algorithms interviews.

What difficulty level is this interview question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Onsite rounds at Jane Street.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Jane Street during technical interviews.

Merge Two Insertion Diffs in Linear Time | Jane Street Interview Question

Q: Merge Two Insertion Diffs in Linear Time

This is a Coding & Algorithms interview question from Jane Street for Software Engineer roles. View the full question and solution on PracHub.

Merge Two Insertion Diffs in Linear Time

Company: Jane Street

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

A text-editing system represents changes to a document as **insertion diffs**. An insertion diff is a set of `(position, value)` pairs, where each pair means: "after this diff is applied, the element `value` sits at index `position` of the resulting document." All elements of the pre-diff document keep their relative order and fill the remaining indices from left to right. Formally, applying a diff `D` to a sequence `S` produces a sequence `T` of length `len(S) + len(D)`: - For every pair `(p, v)` in `D`, `T[p] = v`. - The elements of `S`, in their original order, occupy the remaining indices of `T` in increasing index order. For example, applying `{0: "A", 3: "B"}` to `["z", "z", "z"]` yields `["A", "z", "z", "B", "z"]`. Now suppose two diffs are applied one after the other: 1. `diff_one` is applied to the original document. 2. `diff_two` is then applied to the result of step 1 — so its positions refer to indices of the **final** document. Continuing the example with the original document `["z", "z", "z"]`: - `diff_one = {0: "A", 3: "B"}` produces `["A", "z", "z", "B", "z"]`. - `diff_two = {0: "C", 4: "D"}` then produces `["C", "A", "z", "z", "D", "B", "z"]`. Write a function `merge_diffs(diff_one, diff_two)` that returns a **single** diff equivalent to applying `diff_one` followed by `diff_two`. In the example above, the merged diff is `{0: "C", 1: "A", 4: "D", 5: "B"}`: applying it directly to `["z", "z", "z"]` yields `["C", "A", "z", "z", "D", "B", "z"]` in one step. The document itself is never given and may be enormous, so you cannot materialize it. Your function must run in $O(n + m)$ time, where $n$ is the number of insertions in `diff_one` and $m$ is the number of insertions in `diff_two` — not in time proportional to the document length or to the largest position value. **Input format** - Each diff is given as a list of `(position, value)` pairs sorted by increasing `position` (equivalently, a mapping whose keys iterate in increasing order). - Positions within a single diff are distinct non-negative integers. - Values are non-empty strings and should be treated as opaque. - You may assume both diffs are valid for the (unseen) document they are applied to: every position is a valid index of the corresponding post-application document. **Output format** Return the merged diff as a list of `(position, value)` pairs sorted by increasing `position`. Its size is exactly `n + m`. **Examples** Example 1: - Input: `diff_one = [(0, "A"), (3, "B")]`, `diff_two = [(0, "C"), (4, "D")]` - Output: `[(0, "C"), (1, "A"), (4, "D"), (5, "B")]` Example 2 (position collision between the two diffs): - Input: `diff_one = [(1, "P")]`, `diff_two = [(1, "Q")]` - Output: `[(1, "Q"), (2, "P")]` - Explanation: on an original document `["a", "b"]`, `diff_one` produces `["a", "P", "b"]`, and `diff_two` then produces `["a", "Q", "P", "b"]`. `diff_two`'s position is absolute in the final document, so `"Q"` occupies index 1 and `"P"` is pushed to index 2. Example 3 (one empty diff): - Input: `diff_one = []`, `diff_two = [(2, "X")]` - Output: `[(2, "X")]` **Constraints** - $0 \le n, m \le 2 \cdot 10^5$ - $0 \le \text{position} \le 10^9$ - Required time complexity: $O(n + m)$

A text-editing system represents changes to a document as insertion diffs. An insertion diff is a set of (position, value) pairs, where each pair means: "after this diff is applied, the element value sits at index position of the resulting document." All elements of the pre-diff document keep their relative order and fill the remaining indices from left to right.

Formally, applying a diff D to a sequence S produces a sequence T of length len(S) + len(D):

For every pair (p, v) in D , T[p] = v .
The elements of S , in their original order, occupy the remaining indices of T in increasing index order.

For example, applying {0: "A", 3: "B"} to ["z", "z", "z"] yields ["A", "z", "z", "B", "z"].

Now suppose two diffs are applied one after the other:

diff_one is applied to the original document.
diff_two is then applied to the result of step 1 — so its positions refer to indices of the final document.

Continuing the example with the original document ["z", "z", "z"]:

diff_one = {0: "A", 3: "B"} produces ["A", "z", "z", "B", "z"] .
diff_two = {0: "C", 4: "D"} then produces ["C", "A", "z", "z", "D", "B", "z"] .

Write a function merge_diffs(diff_one, diff_two) that returns a single diff equivalent to applying diff_one followed by diff_two. In the example above, the merged diff is {0: "C", 1: "A", 4: "D", 5: "B"}: applying it directly to ["z", "z", "z"] yields ["C", "A", "z", "z", "D", "B", "z"] in one step.

The document itself is never given and may be enormous, so you cannot materialize it. Your function must run in $O(n + m)$ time, where $n$ is the number of insertions in diff_one and $m$ is the number of insertions in diff_two — not in time proportional to the document length or to the largest position value.

Input format

Each diff is given as a list of (position, value) pairs sorted by increasing position (equivalently, a mapping whose keys iterate in increasing order).
Positions within a single diff are distinct non-negative integers.
Values are non-empty strings and should be treated as opaque.
You may assume both diffs are valid for the (unseen) document they are applied to: every position is a valid index of the corresponding post-application document.

Output format

Return the merged diff as a list of (position, value) pairs sorted by increasing position. Its size is exactly n + m.

Examples

Example 1:

Input: diff_one = [(0, "A"), (3, "B")] , diff_two = [(0, "C"), (4, "D")]
Output: [(0, "C"), (1, "A"), (4, "D"), (5, "B")]

Example 2 (position collision between the two diffs):

Input: diff_one = [(1, "P")] , diff_two = [(1, "Q")]
Output: [(1, "Q"), (2, "P")]
Explanation: on an original document ["a", "b"] , diff_one produces ["a", "P", "b"] , and diff_two then produces ["a", "Q", "P", "b"] . diff_two 's position is absolute in the final document, so "Q" occupies index 1 and "P" is pushed to index 2.

Example 3 (one empty diff):

Input: diff_one = [] , diff_two = [(2, "X")]
Output: [(2, "X")]

Constraints

$0 \le n, m \le 2 \cdot 10^5$
$0 \le \text{position} \le 10^9$
Required time complexity: $O(n + m)$

Merge Two Insertion Diffs in Linear Time

Company: Jane Street

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

Formally, applying a diff D to a sequence S produces a sequence T of length len(S) + len(D):

For every pair (p, v) in D , T[p] = v .
The elements of S , in their original order, occupy the remaining indices of T in increasing index order.

For example, applying {0: "A", 3: "B"} to ["z", "z", "z"] yields ["A", "z", "z", "B", "z"].

Now suppose two diffs are applied one after the other:

diff_one is applied to the original document.
diff_two is then applied to the result of step 1 — so its positions refer to indices of the final document.

Continuing the example with the original document ["z", "z", "z"]:

diff_one = {0: "A", 3: "B"} produces ["A", "z", "z", "B", "z"] .
diff_two = {0: "C", 4: "D"} then produces ["C", "A", "z", "z", "D", "B", "z"] .

Input format

Each diff is given as a list of (position, value) pairs sorted by increasing position (equivalently, a mapping whose keys iterate in increasing order).
Positions within a single diff are distinct non-negative integers.
Values are non-empty strings and should be treated as opaque.
You may assume both diffs are valid for the (unseen) document they are applied to: every position is a valid index of the corresponding post-application document.

Output format

Return the merged diff as a list of (position, value) pairs sorted by increasing position. Its size is exactly n + m.

Examples

Example 1:

Input: diff_one = [(0, "A"), (3, "B")] , diff_two = [(0, "C"), (4, "D")]
Output: [(0, "C"), (1, "A"), (4, "D"), (5, "B")]

Example 2 (position collision between the two diffs):

Input: diff_one = [(1, "P")] , diff_two = [(1, "Q")]
Output: [(1, "Q"), (2, "P")]
Explanation: on an original document ["a", "b"] , diff_one produces ["a", "P", "b"] , and diff_two then produces ["a", "Q", "P", "b"] . diff_two 's position is absolute in the final document, so "Q" occupies index 1 and "P" is pushed to index 2.

Example 3 (one empty diff):

Input: diff_one = [] , diff_two = [(2, "X")]
Output: [(2, "X")]

Constraints

$0 \le n, m \le 2 \cdot 10^5$
$0 \le \text{position} \le 10^9$
Required time complexity: $O(n + m)$

Merge Two Insertion Diffs in Linear Time

Merge Two Insertion Diffs in Linear Time

Submit Your Answer to Earn 20XP

Merge Two Insertion Diffs in Linear Time

Merge Two Insertion Diffs in Linear Time

Submit Your Answer to Earn 20XP