Aggregate loans and match repayments
Company: Affirm
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates the ability to aggregate transactional loan records and match repayment transactions to outstanding loans, testing competencies in data aggregation, ordered matching, stateful handling of partial and split payments, and analysis of time and space complexity.
Part 1: Aggregate Raw Loan Records
Constraints
- 0 <= len(records) <= 200000
- 1 <= amount <= 10^9
- created_at is a comparable integer timestamp
- All records with the same loan_id belong to the same customer_id
Examples
Input: [{'loan_id': 'L2', 'customer_id': 'C2', 'amount': 40, 'created_at': 2}, {'loan_id': 'L1', 'customer_id': 'C1', 'amount': 100, 'created_at': 5}, {'loan_id': 'L1', 'customer_id': 'C1', 'amount': 50, 'created_at': 3}, {'loan_id': 'L2', 'customer_id': 'C2', 'amount': 60, 'created_at': 4}]
Expected Output: [{'loan_id': 'L1', 'customer_id': 'C1', 'created_at': 3, 'total_amount': 150}, {'loan_id': 'L2', 'customer_id': 'C2', 'created_at': 2, 'total_amount': 100}]
Explanation: L1 combines to 150 with earliest timestamp 3, and L2 combines to 100 with earliest timestamp 2.
Input: [{'loan_id': 'L9', 'customer_id': 'C7', 'amount': 1, 'created_at': 999}]
Expected Output: [{'loan_id': 'L9', 'customer_id': 'C7', 'created_at': 999, 'total_amount': 1}]
Explanation: A single record stays unchanged except that amount becomes total_amount.
Input: []
Expected Output: []
Explanation: No records means no aggregated loans.
Input: [{'loan_id': 'A', 'customer_id': 'X', 'amount': 10, 'created_at': 8}, {'loan_id': 'A', 'customer_id': 'X', 'amount': 15, 'created_at': 8}, {'loan_id': 'A', 'customer_id': 'X', 'amount': 5, 'created_at': 2}, {'loan_id': 'B', 'customer_id': 'X', 'amount': 7, 'created_at': 3}]
Expected Output: [{'loan_id': 'A', 'customer_id': 'X', 'created_at': 2, 'total_amount': 30}, {'loan_id': 'B', 'customer_id': 'X', 'created_at': 3, 'total_amount': 7}]
Explanation: A appears three times and aggregates to total 30 with earliest created_at 2.
Hints
- Use a hash map keyed by loan_id to accumulate information as you scan the records once.
- For each repeated loan_id, update the running sum and keep the minimum created_at.
Part 2: Match Repayment Transactions to Loans
Constraints
- 0 <= len(loans), len(transactions) <= 200000
- Each loan_id is unique in loans
- Each txn_id is unique in transactions
- 1 <= total_amount, amount <= 10^9
- Transactions are applied in ascending timestamp order; if tied, use lexicographic txn_id order
- Within each customer, loans are paid in ascending created_at order; if tied, use lexicographic loan_id order
Examples
Input: ([{'loan_id': 'L1', 'customer_id': 'C1', 'created_at': 1, 'total_amount': 100}, {'loan_id': 'L2', 'customer_id': 'C1', 'created_at': 3, 'total_amount': 80}], [{'txn_id': 'T1', 'customer_id': 'C1', 'amount': 50, 'timestamp': 10}, {'txn_id': 'T2', 'customer_id': 'C1', 'amount': 100, 'timestamp': 11}])
Expected Output: ([('T1', 'L1', 50), ('T2', 'L1', 50), ('T2', 'L2', 50)], [{'loan_id': 'L1', 'remaining_balance': 0}, {'loan_id': 'L2', 'remaining_balance': 30}])
Explanation: The first transaction partially pays L1. The second finishes L1 and then pays part of L2.
Input: ([{'loan_id': 'L2', 'customer_id': 'C1', 'created_at': 5, 'total_amount': 70}, {'loan_id': 'L1', 'customer_id': 'C1', 'created_at': 2, 'total_amount': 30}, {'loan_id': 'L3', 'customer_id': 'C2', 'created_at': 1, 'total_amount': 40}], [{'txn_id': 'T2', 'customer_id': 'C1', 'amount': 50, 'timestamp': 4}, {'txn_id': 'T1', 'customer_id': 'C2', 'amount': 10, 'timestamp': 3}, {'txn_id': 'T3', 'customer_id': 'C1', 'amount': 20, 'timestamp': 6}, {'txn_id': 'T4', 'customer_id': 'C2', 'amount': 50, 'timestamp': 8}])
Expected Output: ([('T1', 'L3', 10), ('T2', 'L1', 30), ('T2', 'L2', 20), ('T3', 'L2', 20), ('T4', 'L3', 30)], [{'loan_id': 'L1', 'remaining_balance': 0}, {'loan_id': 'L2', 'remaining_balance': 30}, {'loan_id': 'L3', 'remaining_balance': 0}])
Explanation: Transactions are reordered by timestamp, and each customer's loans are paid oldest first.
Input: ([{'loan_id': 'L1', 'customer_id': 'C1', 'created_at': 1, 'total_amount': 25}], [])
Expected Output: ([], [{'loan_id': 'L1', 'remaining_balance': 25}])
Explanation: With no transactions, every loan keeps its full balance.
Input: ([], [{'txn_id': 'T1', 'customer_id': 'C1', 'amount': 100, 'timestamp': 1}])
Expected Output: ([], [])
Explanation: A transaction for a customer with no loans produces no matches.
Input: ([{'loan_id': 'L1', 'customer_id': 'C1', 'created_at': 1, 'total_amount': 50}, {'loan_id': 'L2', 'customer_id': 'C1', 'created_at': 2, 'total_amount': 50}], [{'txn_id': 'T2', 'customer_id': 'C1', 'amount': 30, 'timestamp': 5}, {'txn_id': 'T1', 'customer_id': 'C1', 'amount': 40, 'timestamp': 5}])
Expected Output: ([('T1', 'L1', 40), ('T2', 'L1', 10), ('T2', 'L2', 20)], [{'loan_id': 'L1', 'remaining_balance': 0}, {'loan_id': 'L2', 'remaining_balance': 30}])
Explanation: The timestamps tie, so T1 is processed before T2 because txn_id is the tie-breaker.
Hints
- Group loans by customer and sort each customer's loans by the repayment priority order.
- Keep a pointer to the first unpaid loan for each customer so you do not rescan fully paid loans.