Merge Two Insertion Diffs in Linear Time
Company: Jane Street
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Company: Jane Street
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
A text-editing system represents changes to a document as insertion diffs. An insertion diff is a set of (position, value) pairs, where each pair means: "after this diff is applied, the element value sits at index position of the resulting document." All elements of the pre-diff document keep their relative order and fill the remaining indices from left to right.
Formally, applying a diff D to a sequence S produces a sequence T of length len(S) + len(D):
(p, v)
in
D
,
T[p] = v
.
S
, in their original order, occupy the remaining indices of
T
in increasing index order.
For example, applying {0: "A", 3: "B"} to ["z", "z", "z"] yields ["A", "z", "z", "B", "z"].
Now suppose two diffs are applied one after the other:
diff_one
is applied to the original document.
diff_two
is then applied to the result of step 1 — so its positions refer to indices of the
final
document.
Continuing the example with the original document ["z", "z", "z"]:
diff_one = {0: "A", 3: "B"}
produces
["A", "z", "z", "B", "z"]
.
diff_two = {0: "C", 4: "D"}
then produces
["C", "A", "z", "z", "D", "B", "z"]
.
Write a function merge_diffs(diff_one, diff_two) that returns a single diff equivalent to applying diff_one followed by diff_two. In the example above, the merged diff is {0: "C", 1: "A", 4: "D", 5: "B"}: applying it directly to ["z", "z", "z"] yields ["C", "A", "z", "z", "D", "B", "z"] in one step.
The document itself is never given and may be enormous, so you cannot materialize it. Your function must run in time, where is the number of insertions in diff_one and is the number of insertions in diff_two — not in time proportional to the document length or to the largest position value.
Input format
(position, value)
pairs sorted by increasing
position
(equivalently, a mapping whose keys iterate in increasing order).
Output format
Return the merged diff as a list of (position, value) pairs sorted by increasing position. Its size is exactly n + m.
Examples
Example 1:
diff_one = [(0, "A"), (3, "B")]
,
diff_two = [(0, "C"), (4, "D")]
[(0, "C"), (1, "A"), (4, "D"), (5, "B")]
Example 2 (position collision between the two diffs):
diff_one = [(1, "P")]
,
diff_two = [(1, "Q")]
[(1, "Q"), (2, "P")]
["a", "b"]
,
diff_one
produces
["a", "P", "b"]
, and
diff_two
then produces
["a", "Q", "P", "b"]
.
diff_two
's position is absolute in the final document, so
"Q"
occupies index 1 and
"P"
is pushed to index 2.
Example 3 (one empty diff):
diff_one = []
,
diff_two = [(2, "X")]
[(2, "X")]
Constraints