Assign Reviewers from Changed Files
Company: Stripe
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates proficiency with version control diffs, repository library APIs, file I/O and CSV parsing, and data aggregation to map changed files to their owners.
Constraints
- 0 <= len(base_files), len(target_files) <= 100000
- 0 <= number of non-empty CSV rows <= 100000
- Each file_path is unique within base_files and within target_files
- file_version is an opaque string and should only be compared for equality
Examples
Input: ([['src/payments/ChargeService.java', 'a1'], ['src/payments/RefundService.java', 'b1'], ['src/common/JsonUtil.java', 'c1']], [['src/payments/ChargeService.java', 'a2'], ['src/payments/RefundService.java', 'b1'], ['src/common/JsonUtil.java', 'c2'], ['src/payments/NewService.java', 'd1']], 'src/payments/ChargeService.java,alice\nsrc/payments/RefundService.java,bob\nsrc/common/JsonUtil.java,alice\nsrc/payments/NewService.java,carol')
Expected Output: 'alice'
Explanation: The changed files are ChargeService.java (modified), JsonUtil.java (modified), and NewService.java (added); RefundService.java is unchanged. alice owns 2 of the changed files and carol owns 1, so the answer is alice.
Input: ([['src/a.java', '1'], ['src/b.java', '1'], ['src/c.java', '1']], [['src/b.java', '2'], ['src/c.java', '1'], ['src/d.java', '1']], 'src/a.java,bob\nsrc/b.java,bob\nsrc/d.java,alice')
Expected Output: 'bob'
Explanation: src/a.java was deleted, src/b.java was modified, and src/d.java was added. bob owns 2 changed files and alice owns 1.
Input: ([['src/a.java', '1']], [['src/a.java', '2'], ['src/b.java', '1']], 'src/c.java,alice')
Expected Output: ''
Explanation: src/a.java and src/b.java changed, but neither appears in the ownership CSV, so no reviewer can be assigned.
Input: ([], [], '')
Expected Output: ''
Explanation: There are no files in either branch, so there are no changed files and no owner to return.
Input: ([], [['src/utils/Parser.java', 'v1']], '\nsrc/utils/Parser.java, dana \n')
Expected Output: 'dana'
Explanation: The file was added in the target branch. Blank CSV lines are ignored and surrounding spaces are trimmed, so the owner is dana.
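The CSV handling described in the last example can be sketched as follows. This is a minimal Python sketch rather than a reference solution. The helper name parse_owners is hypothetical. It assumes each row is a plain two-column file_path,owner record with no quoting, that rows missing a comma or either field are skipped, and that a later row for the same path overrides an earlier one. The problem does not specify any of these three behaviors.

```python
def parse_owners(owners_csv):
    """Parse 'file_path,owner' rows into a dict (hypothetical helper).

    Blank lines are ignored and surrounding whitespace is trimmed.
    Assumptions not stated in the problem: no CSV quoting, malformed rows
    are skipped, and a later row for the same path wins.
    """
    owners = {}
    for line in owners_csv.splitlines():
        line = line.strip()
        if not line:
            continue  # blank CSV line
        # Split on the first comma only; sep is '' when no comma is present.
        path, sep, owner = line.partition(",")
        path, owner = path.strip(), owner.strip()
        if sep and path and owner:
            owners[path] = owner
    return owners
```

Called on the CSV text from the last example, parse_owners('\nsrc/utils/Parser.java, dana \n') yields a single entry mapping src/utils/Parser.java to dana.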
Hints
- Convert each branch's file list into a dictionary mapping file_path to file_version so that each per-path version comparison takes O(1) time on average.
- Parse the CSV into a file_path -> owner map, then scan the union of paths from both branches and count only the changed ones.
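The two hints combine into a short routine. The sketch below is one possible Python version under stated assumptions. The name pick_reviewer is hypothetical, and the ownership data is assumed to be already parsed into a file_path -> owner dict. Ties are broken by the lexicographically smallest owner name, a rule chosen here only for illustration because the problem text shown here does not specify one.

```python
from collections import Counter

_MISSING = object()  # sentinel for "path not present in this branch"


def pick_reviewer(base_files, target_files, owners):
    """Return the owner of the most changed files, or '' if there is none.

    base_files / target_files: lists of [file_path, file_version] pairs.
    owners: dict of file_path -> owner (assumed already parsed from the CSV).
    """
    base = dict(base_files)      # path -> version for O(1) lookups
    target = dict(target_files)

    counts = Counter()
    # A path is changed if it was added, deleted, or its version differs.
    for path in base.keys() | target.keys():
        if base.get(path, _MISSING) != target.get(path, _MISSING):
            owner = owners.get(path)
            if owner is not None:  # changed files without an owner are ignored
                counts[owner] += 1

    if not counts:
        return ""
    # Assumption: ties go to the lexicographically smallest owner name.
    return min(counts, key=lambda o: (-counts[o], o))
```

With the second example's data, pick_reviewer returns 'bob': src/a.java (deleted) and src/b.java (modified) both belong to bob, while only src/d.java (added) belongs to alice. The routine runs in time linear in the total number of paths, which fits the 100000-entry constraints.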