PracHub
QuestionsCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates algorithm design and sequence-manipulation skills, focusing on position-mapped insertions, order-preserving merging, and attention to time/space complexity.

  • medium
  • Jane Street
  • Coding & Algorithms
  • Software Engineer

Merge Two Insertion Diffs in Linear Time

Company: Jane Street

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

A text-editing system represents changes to a document as **insertion diffs**. An insertion diff is a set of `(position, value)` pairs, where each pair means: "after this diff is applied, the element `value` sits at index `position` of the resulting document." All elements of the pre-diff document keep their relative order and fill the remaining indices from left to right. Formally, applying a diff `D` to a sequence `S` produces a sequence `T` of length `len(S) + len(D)`: - For every pair `(p, v)` in `D`, `T[p] = v`. - The elements of `S`, in their original order, occupy the remaining indices of `T` in increasing index order. For example, applying `{0: "A", 3: "B"}` to `["z", "z", "z"]` yields `["A", "z", "z", "B", "z"]`. Now suppose two diffs are applied one after the other: 1. `diff_one` is applied to the original document. 2. `diff_two` is then applied to the result of step 1 — so its positions refer to indices of the **final** document. Continuing the example with the original document `["z", "z", "z"]`: - `diff_one = {0: "A", 3: "B"}` produces `["A", "z", "z", "B", "z"]`. - `diff_two = {0: "C", 4: "D"}` then produces `["C", "A", "z", "z", "D", "B", "z"]`. Write a function `merge_diffs(diff_one, diff_two)` that returns a **single** diff equivalent to applying `diff_one` followed by `diff_two`. In the example above, the merged diff is `{0: "C", 1: "A", 4: "D", 5: "B"}`: applying it directly to `["z", "z", "z"]` yields `["C", "A", "z", "z", "D", "B", "z"]` in one step. The document itself is never given and may be enormous, so you cannot materialize it. Your function must run in $O(n + m)$ time, where $n$ is the number of insertions in `diff_one` and $m$ is the number of insertions in `diff_two` — not in time proportional to the document length or to the largest position value. **Input format** - Each diff is given as a list of `(position, value)` pairs sorted by increasing `position` (equivalently, a mapping whose keys iterate in increasing order). - Positions within a single diff are distinct non-negative integers. - Values are non-empty strings and should be treated as opaque. - You may assume both diffs are valid for the (unseen) document they are applied to: every position is a valid index of the corresponding post-application document. **Output format** Return the merged diff as a list of `(position, value)` pairs sorted by increasing `position`. Its size is exactly `n + m`. **Examples** Example 1: - Input: `diff_one = [(0, "A"), (3, "B")]`, `diff_two = [(0, "C"), (4, "D")]` - Output: `[(0, "C"), (1, "A"), (4, "D"), (5, "B")]` Example 2 (position collision between the two diffs): - Input: `diff_one = [(1, "P")]`, `diff_two = [(1, "Q")]` - Output: `[(1, "Q"), (2, "P")]` - Explanation: on an original document `["a", "b"]`, `diff_one` produces `["a", "P", "b"]`, and `diff_two` then produces `["a", "Q", "P", "b"]`. `diff_two`'s position is absolute in the final document, so `"Q"` occupies index 1 and `"P"` is pushed to index 2. Example 3 (one empty diff): - Input: `diff_one = []`, `diff_two = [(2, "X")]` - Output: `[(2, "X")]` **Constraints** - $0 \le n, m \le 2 \cdot 10^5$ - $0 \le \text{position} \le 10^9$ - Required time complexity: $O(n + m)$

Quick Answer: This question evaluates algorithm design and sequence-manipulation skills, focusing on position-mapped insertions, order-preserving merging, and attention to time/space complexity.

A text-editing system represents changes to a document as **insertion diffs**. An insertion diff is a set of `(position, value)` pairs, where each pair means: after this diff is applied, the element `value` sits at index `position` of the resulting document. All elements of the pre-diff document keep their relative order and fill the remaining indices from left to right. Formally, applying a diff `D` to a sequence `S` produces a sequence `T` of length `len(S) + len(D)`: - For every pair `(p, v)` in `D`, `T[p] = v`. - The elements of `S`, in their original order, occupy the remaining indices of `T` in increasing index order. For example, applying `{0: "A", 3: "B"}` to `["z", "z", "z"]` yields `["A", "z", "z", "B", "z"]`. Now suppose two diffs are applied one after the other: 1. `diff_one` is applied to the original document. 2. `diff_two` is then applied to the result of step 1 — so its positions refer to indices of the **final** document. Continuing with the original document `["z", "z", "z"]`: `diff_one = {0: "A", 3: "B"}` produces `["A", "z", "z", "B", "z"]`, and `diff_two = {0: "C", 4: "D"}` then produces `["C", "A", "z", "z", "D", "B", "z"]`. Write `merge_diffs(diff_one, diff_two)` returning a **single** diff equivalent to applying `diff_one` followed by `diff_two`. For the example, the merged diff is `[(0, "C"), (1, "A"), (4, "D"), (5, "B")]`: applying it directly to `["z", "z", "z"]` yields the same final document in one step. The document itself is never given and may be enormous, so you cannot materialize it. Your function must run in **O(n + m)** time, where `n = len(diff_one)` and `m = len(diff_two)` — not proportional to the document length or to the largest position value. **Input**: each diff is a list of `(position, value)` pairs sorted by increasing `position`; positions within one diff are distinct non-negative integers; values are opaque non-empty strings. Both diffs are valid for their (unseen) document. **Output**: the merged diff as a list of `(position, value)` pairs sorted by increasing `position`; its size is exactly `n + m`.

Constraints

  • 0 <= n, m <= 2 * 10^5 (n = len(diff_one), m = len(diff_two))
  • 0 <= position <= 10^9
  • Positions within a single diff are distinct; each diff is sorted by increasing position
  • Values are opaque non-empty strings
  • Required time complexity: O(n + m) — you may not materialize the document

Examples

Input: ([(0, 'A'), (3, 'B')], [(0, 'C'), (4, 'D')])

Expected Output: [(0, 'C'), (1, 'A'), (4, 'D'), (5, 'B')]

Explanation: The worked example. C takes final index 0, pushing A (was at T1 index 0) to 1; D takes index 4; B (was at T1 index 3) lands at 3 + 2 = 5.

Input: ([(1, 'P')], [(1, 'Q')])

Expected Output: [(1, 'Q'), (2, 'P')]

Explanation: Position collision. On doc ['a','b'], diff_one gives ['a','P','b']; diff_two's index 1 is absolute, so Q takes index 1 and P is pushed to index 2 (1 + d, d=1).

Input: ([], [(2, 'X')])

Expected Output: [(2, 'X')]

Explanation: diff_one is empty, so the merged diff is just diff_two unchanged.

Input: ([], [])

Expected Output: []

Explanation: Both diffs empty — nothing to insert.

Input: ([(0, 'A'), (3, 'B')], [])

Expected Output: [(0, 'A'), (3, 'B')]

Explanation: diff_two is empty (d stays 0), so every diff_one entry keeps its original position.

Input: ([(1, 'A'), (4, 'B')], [(0, 'C'), (3, 'D'), (7, 'E')])

Expected Output: [(0, 'C'), (2, 'A'), (3, 'D'), (6, 'B'), (7, 'E')]

Explanation: Ground-truth check on a len-4 original document. Intermediate T1 = [z,A,z,z,B,z]; applying diff_two yields final [C,z,A,D,z,z,B,E,z]. So A moves from T1 index 1 to 1+1=2, B moves from T1 index 4 to 4+2=6, and C,D,E sit at 0,3,7.

Hints

  1. Do not build the document. Its length is unknown and can be far larger than n + m.
  2. Think about where each diff_one insertion ends up in the FINAL document. The diff_two entries are the only things that shift it, and each shift is exactly +1.
  3. Track d = how many diff_two entries have been emitted so far. A diff_one entry originally at T1 index p lands at final index p + d.
  4. Merge the two sorted streams: at each step compare the diff_two position q with p + d. If q <= p + d, emit the diff_two entry (and increment d); otherwise emit the diff_one entry at p + d. Handle ties (q == p + d) in favor of diff_two — its slot is reserved.
Last updated: Jul 2, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Collapsible Code Editor: Brace Matching and Toggle - Jane Street (medium)
  • Implement a Circular Buffer - Jane Street (medium)
  • Code Editor with Block Shrink and Expand (Code Folding) - Jane Street (medium)
  • Optimize trade PnL table updates - Jane Street (hard)
  • Transform sparse time-code stream to dense rows - Jane Street (easy)