Implement cursor-based query pagination
Company: Coinbase
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates understanding of cursor-based pagination, deterministic ordering with tie-breakers, cursor encoding, and in-memory data-structure management for efficient filtered queries.
Part 1: In-Memory Cursor-Based Pagination by Score and ID
Constraints
- 0 <= len(rows) <= 1,000,000
- All row ids are unique strings.
- Scores are integers.
- 1 <= pageSize <= 1,000
- cursor is either None or a two-element list [score, id].
- No concurrent writes occur during a single paginated scan.
Examples
Input: ([['r2', 20, 'B'], ['r1', 10, 'A'], ['r3', 20, 'C'], ['r0', 5, 'Z']], 10, 2, None)
Expected Output: ([['r1', 10, 'A'], ['r2', 20, 'B']], [20, 'r2'])
Explanation: Rows with score >= 10 are r1, r2, and r3 after sorting. The first page returns r1 and r2, and r3 remains.
Input: ([['r2', 20, 'B'], ['r1', 10, 'A'], ['r3', 20, 'C'], ['r0', 5, 'Z']], 10, 2, [20, 'r2'])
Expected Output: ([['r3', 20, 'C']], None)
Explanation: The cursor excludes all keys <= (20, r2), so only r3 remains.
Input: ([], 0, 10, None)
Expected Output: ([], None)
Explanation: Empty input returns an empty page and no next cursor.
Input: ([['a', 1, 'x']], 5, 1, None)
Expected Output: ([], None)
Explanation: No rows satisfy score >= 5.
Input: ([['b', 5, 'B'], ['c', 5, 'C'], ['a', 5, 'A']], 5, 2, [5, 'a'])
Expected Output: ([['b', 5, 'B'], ['c', 5, 'C']], None)
Explanation: The id tie-breaker lets pagination continue inside rows with the same score.
Hints
- Sort using the same fields that define the cursor: first score, then id.
- The cursor is exclusive: a row is eligible only if (score, id) is greater than the cursor key.
Part 2: Stable and Unambiguous Cursor Encoding
Constraints
- 0 <= len(operations) <= 100,000
- Scores are integers.
- Ids are strings and may contain punctuation such as commas, pipes, colons, or spaces.
- Decode inputs are valid tokens produced by the specified format, or None.
Examples
Input: ([['encode', [10, 'rowA']], ['decode', '[10,"rowA"]']],)
Expected Output: ['[10,"rowA"]', [10, 'rowA']]
Explanation: The encoded cursor is compact JSON, and decoding restores the original score and id.
Input: ([['encode', [-5, 'a|b,c:42']], ['decode', '[-5,"a|b,c:42"]']],)
Expected Output: ['[-5,"a|b,c:42"]', [-5, 'a|b,c:42']]
Explanation: Punctuation inside the id does not confuse the encoding.
Input: ([['encode', None], ['decode', None]],)
Expected Output: [None, None]
Explanation: The null cursor stays null in both directions.
Input: ([['encode', [0, '']], ['decode', '[0,""]']],)
Expected Output: ['[0,""]', [0, '']]
Explanation: An empty string id is still represented unambiguously.
Hints
- Avoid concatenating score and id using a raw separator because ids may contain that separator.
- A structured format such as JSON preserves types and boundaries between fields.
Part 3: Paginate Safely Through Duplicate Scores
Constraints
- 0 <= len(rows) <= 1,000,000
- All ids are unique strings.
- 1 <= pageSize <= 1,000
- Rows must be ordered by score ascending, then id ascending.
- The cursor is exclusive.
Examples
Input: ([['b', 5, 'B'], ['a', 5, 'A'], ['c', 5, 'C'], ['d', 6, 'D']], 5, 2, [5, 'a'])
Expected Output: (['b', 'c'], [5, 'c'])
Explanation: The scan continues within score 5 using id order, returning b and c before d.
Input: ([['b', 5, 'B'], ['a', 5, 'A'], ['c', 5, 'C'], ['d', 6, 'D']], 5, 2, [5, 'c'])
Expected Output: (['d'], None)
Explanation: After cursor (5, c), only d remains.
Input: ([['b', 5, 'B'], ['a', 5, 'A'], ['c', 5, 'C'], ['d', 6, 'D']], 5, 3, None)
Expected Output: (['a', 'b', 'c'], [5, 'c'])
Explanation: The first page ends in the middle of the score-5 group, so the cursor includes both score and id.
Input: ([], 5, 2, None)
Expected Output: ([], None)
Explanation: Empty input has no rows to return.
Input: ([['a', 5, 'A'], ['b', 5, 'B']], 5, 2, [5, 'b'])
Expected Output: ([], None)
Explanation: The cursor is already at the final matching key.
Hints
- For tied scores, compare ids to decide whether a row comes after the cursor.
- The correct condition is key > cursorKey, not score > cursorScore.
Part 4: Indexed Pagination for Large Static Datasets
Constraints
- 0 <= len(rows) <= 1,000,000
- 0 <= len(queries) <= 100,000
- All ids are unique strings.
- 1 <= pageSize <= 1,000 for every query.
- Rows are static during the batch of queries.
- cursor is either None or [score, id]. It may be a boundary key even if that exact row is not present.
Examples
Input: ([['r2', 20, 'B'], ['r1', 10, 'A'], ['r3', 20, 'C'], ['r0', 5, 'Z']], [[10, 2, None], [10, 2, [20, 'r2']]])
Expected Output: [[[['r1', 10, 'A'], ['r2', 20, 'B']], [20, 'r2']], [[['r3', 20, 'C']], None]]
Explanation: The sorted index is reused across both queries.
Input: ([['r2', 20, 'B'], ['r1', 10, 'A'], ['r3', 20, 'C'], ['r0', 5, 'Z']], [[20, 5, None], [21, 1, None]])
Expected Output: [[[['r2', 20, 'B'], ['r3', 20, 'C']], None], [[], None]]
Explanation: The first query returns all rows with score 20; the second has no matches.
Input: ([], [[0, 1, None]])
Expected Output: [[[], None]]
Explanation: An empty dataset still supports queries.
Input: ([['a', 5, 'A'], ['e', 5, 'E'], ['c', 5, 'C'], ['z', 6, 'Z']], [[5, 2, [5, 'b']]])
Expected Output: [[[['c', 5, 'C'], ['e', 5, 'E']], [5, 'e']]]
Explanation: The cursor key does not need to exist. Binary search starts at the first key greater than (5, b).
Hints
- Precompute rows sorted by the composite key (score, id).
- For each query, combine two lower bounds: first score >= minScore, and first key > cursor.