Streaming Equivalence of Two Trade Feeds with Per-Company Ordering
Company: Jane Street
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Company: Jane Street
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
A market-data system receives trade events from two independent feeds. Each event is a pair (company, trade_id) identifying one trade for one company.
Two feeds are equivalent when, for every company, the ordered sequence of that company's trade ids is identical in both feeds. The relative interleaving of different companies is allowed to differ.
For example, these two feeds are equivalent:
Feed 1: ("A",1) ("A",2) ("B",1) ("B",2)
Feed 2: ("A",1) ("B",1) ("A",2) ("B",2)
Company A's trade ids appear in the order [1, 2] in both feeds, and company B's in the order [1, 2] in both — only the interleaving differs. In contrast, ("A",1) ("A",2) and ("A",2) ("A",1) are not equivalent, because company A's order differs.
Write a function
are_equivalent(feed1, feed2) -> bool
where feed1 and feed2 are lists of [company, trade_id] pairs in arrival order. Return true if the two feeds are equivalent and false otherwise.
Your solution must treat the inputs as streams and satisfy both of the following:
false
immediately without reading any further events. You may not pre-scan, sort, or fully materialize per-company sequences up front.
If both feeds are exhausted and every event has been matched, return true. If one feed ends while the other still has unmatched events — or the feeds have different lengths — they are not equivalent.
| feed1 | feed2 | result |
|---|---|---|
[["A",1],["A",2],["B",1],["B",2]] | [["A",1],["B",1],["A",2],["B",2]] | true |
[["A",1],["A",2]] | [["A",2],["A",1]] | false |
[["A",1],["B",7]] | [["B",7],["A",1]] | true |
[["A",1]] | [["A",1],["A",2]] | false |
[["A",1],["B",2]] | [["A",1],["C",2]] | false |
[] | [] | true |
0 <= len(feed1), len(feed2) <= 2 * 10^5
company
is a non-empty string of at most 10 uppercase letters
trade_id
is an integer in
[1, 10^9]
O(n)
time overall and
O(d)
auxiliary space, where
d
is the maximum number of in-flight (not-yet-matched) events at any point during processing