Optimize trade PnL table updates
Company: Jane Street
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Onsite
Quick Answer: This question evaluates understanding of incremental aggregation and state management, including handling overlapping predicates, caching expensive or asynchronous computations, and algorithmic complexity considerations in time-series PnL updates.
Part 1: Maintain a Filtered PnL Table Under Trade Updates
Constraints
- 0 <= len(row_filters) <= 50
- 1 <= times_count <= 30
- 0 <= len(operations) <= 20000
- row_filters contains distinct integers
- For every ADD, tags contains distinct integers
- For every ADD, len(pnl) == times_count
- PnL values may be negative
Examples
Input: ([1, 2, 3], 2, [('ADD', [1, 2], [10, 20]), ('ADD', [3], [5, 7]), ('VIEW',), ('SET_GLOBAL', 2), ('VIEW',), ('ADD', [1, 2, 3], [1, 1]), ('VIEW',), ('SET_GLOBAL', None), ('VIEW',)])
Expected Output: [[[10, 20], [10, 20], [5, 7]], [[10, 20], [10, 20], [0, 0]], [[11, 21], [11, 21], [1, 1]], [[11, 21], [11, 21], [6, 8]]]
Explanation: The third trade overlaps all three rows. When the global filter is 2, only trades containing tag 2 contribute.
Input: ([1, 2], 3, [('VIEW',), ('SET_GLOBAL', 9), ('VIEW',), ('ADD', [9], [4, 4, 4]), ('VIEW',)])
Expected Output: [[[0, 0, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 0]]]
Explanation: Edge case: no trade ever matches row filter 1 or 2, so every view is all zeros.
Input: ([1, 2], 1, [('ADD', [1, 2], [5]), ('ADD', [2], [-3]), ('SET_GLOBAL', 1), ('VIEW',), ('SET_GLOBAL', None), ('VIEW',)])
Expected Output: [[[5], [5]], [[5], [2]]]
Explanation: With global filter 1, the second trade is excluded because it lacks tag 1.
Input: ([], 2, [('VIEW',), ('ADD', [1], [3, 4]), ('VIEW',)])
Expected Output: [[], []]
Explanation: Edge case: there are no row filters, so each snapshot is an empty matrix.
Hints
- A trade with tags S contributes to every row whose tag is in S, not just one row.
- Maintain one aggregate matrix for no global filter and additional aggregate matrices keyed by global tag. Then SET_GLOBAL only needs to switch which matrix is active.
Part 2: Process Asynchronous PnL Callbacks with Caching
Constraints
- 0 <= len(row_filters) <= 50
- 1 <= times_count <= 30
- 0 <= len(operations) <= 30000
- row_filters contains distinct integers
- Each ADD uses a unique trade_id; if duplicates appear, only the first ADD matters
- For every ADD, tags contains distinct integers
- For every RESOLVE, 0 <= time_idx < times_count
- If the same (trade_id, time_idx) appears multiple times in RESOLVE operations, only the first one should affect the answer
Examples
Input: ([1, 2], 3, [('ADD', 'A', [1, 2]), ('VIEW',), ('RESOLVE', 'A', 0, 5), ('RESOLVE', 'A', 2, 9), ('VIEW',), ('SET_GLOBAL', 2), ('VIEW',), ('ADD', 'B', [2]), ('RESOLVE', 'B', 1, 4), ('VIEW',)])
Expected Output: [[[0, 0, 0], [0, 0, 0]], [[5, 0, 9], [5, 0, 9]], [[5, 0, 9], [5, 0, 9]], [[5, 0, 9], [5, 4, 9]]]
Explanation: Before any RESOLVE calls, the table is all zeros. Later callbacks update the exact cells they affect.
Input: ([1], 2, [('ADD', 'X', [9]), ('RESOLVE', 'X', 0, 7), ('VIEW',), ('SET_GLOBAL', 9), ('VIEW',)])
Expected Output: [[[0, 0]], [[0, 0]]]
Explanation: Edge case: the trade never matches row filter 1, so even resolved values do not appear in the table.
Input: ([1, 2, 3], 2, [('ADD', 'A', [1, 3]), ('ADD', 'B', [2]), ('RESOLVE', 'A', 1, 4), ('SET_GLOBAL', 3), ('VIEW',), ('RESOLVE', 'B', 0, 5), ('VIEW',), ('SET_GLOBAL', None), ('VIEW',)])
Expected Output: [[[0, 4], [0, 0], [0, 4]], [[0, 4], [0, 0], [0, 4]], [[0, 4], [5, 0], [0, 4]]]
Explanation: Trade B does not have tag 3, so its resolved value is invisible while the global filter is 3.
Input: ([1, 2], 1, [('ADD', 'A', [1, 2]), ('RESOLVE', 'A', 0, 3), ('RESOLVE', 'A', 0, 99), ('VIEW',), ('SET_GLOBAL', 2), ('VIEW',)])
Expected Output: [[[3], [3]], [[3], [3]]]
Explanation: Edge case: the second RESOLVE for the same (trade_id, time_idx) is ignored because the value is already cached.
Hints
- Store each trade's tag set when the trade is added. Then when one callback arrives, you can immediately update every row/global aggregate that this single value affects.
- Use a seen set for (trade_id, time_idx) so duplicate callbacks do not corrupt the totals.