Design payment-to-invoice matcher with priorities
Company: Stripe
Role: Software Engineer
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Design and implement a payment-to-invoice matcher. Inputs:
(a) invoices, a list like ["invoice-id-1, 10000, 2022-01-01", "invoice-id-2, 30000, 2022-01-01"], where amounts are integer cents;
(b) payment, a single string like "payment-id, 30000, paying for: invoice-id-1" or "payment-id, 30000" when the invoice id is absent; and
(c) an optional forgiveness value (integer cents).
Output: a canonical message "{payment-id} paid {paid_amount} amount for invoice {invoice-id} on date {invoice_date}" and, when forgiveness is used, append "; forgave {difference}" indicating how much was forgiven.
Matching rules and priorities:
1) If the payment explicitly contains an invoice id, match that invoice and ignore amount-based or forgiveness-based matching.
2) Otherwise, match by exact amount; if multiple invoices have that amount, pick the earliest by date; if still tied, break ties by smallest invoice-id lexicographically.
3) Otherwise, if a forgiveness value is provided, match an invoice whose amount differs from the payment amount by at most the forgiveness value; if multiple qualify, pick the earliest by date, then the lexicographically smallest invoice-id.
4) If nothing matches, report that no match was found.
Requirements: describe your data structures for invoices and payments, your parsing approach (prefer simple substring/split over regex), and why you use integer cents instead of floats. Provide pseudocode or code for match_payment(invoices, payment, forgiveness=None).
Finally, enumerate a comprehensive test suite covering: explicit id present/absent, multiple exact-amount candidates, forgiveness matches (including boundary equals and just-over-the-limit), tie-breaking by date and id, and regression tests ensuring earlier behaviors remain correct after adding forgiveness.
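One possible sketch of match_payment following the four rules above, in Python. Several details are assumptions not fixed by the problem statement: the Invoice dataclass and the helper names (parse_invoice, parse_payment, _earliest) are hypothetical, the no-match message is the literal string "no match found", the forgiveness check uses the absolute difference, an explicit invoice id that does not exist yields no match, and dates are ISO "YYYY-MM-DD" strings so that string order equals chronological order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Invoice:  # hypothetical container, not part of the problem statement
    invoice_id: str
    amount: int  # integer cents
    date: str    # assumed ISO "YYYY-MM-DD", so string order is date order


def parse_invoice(line: str) -> Invoice:
    # Plain split on commas; no regex needed for this fixed three-field format.
    invoice_id, amount, date = (part.strip() for part in line.split(","))
    return Invoice(invoice_id, int(amount), date)


def parse_payment(line: str) -> Tuple[str, int, Optional[str]]:
    parts = [part.strip() for part in line.split(",")]
    payment_id, amount = parts[0], int(parts[1])
    prefix = "paying for:"
    invoice_id = None
    if len(parts) > 2 and parts[2].startswith(prefix):
        invoice_id = parts[2][len(prefix):].strip()
    return payment_id, amount, invoice_id


def _earliest(candidates: List[Invoice]) -> Optional[Invoice]:
    # Tie-break: earliest date first, then lexicographically smallest id.
    return min(candidates, key=lambda inv: (inv.date, inv.invoice_id), default=None)


def match_payment(invoices: List[str], payment: str,
                  forgiveness: Optional[int] = None) -> str:
    parsed = [parse_invoice(line) for line in invoices]
    payment_id, amount, invoice_id = parse_payment(payment)
    forgiven = None

    if invoice_id is not None:
        # Rule 1: an explicit id wins; amount and forgiveness are ignored.
        match = next((inv for inv in parsed if inv.invoice_id == invoice_id), None)
    else:
        # Rule 2: exact amount.
        match = _earliest([inv for inv in parsed if inv.amount == amount])
        if match is None and forgiveness is not None:
            # Rule 3: within forgiveness (absolute difference is an assumption).
            match = _earliest(
                [inv for inv in parsed if abs(inv.amount - amount) <= forgiveness]
            )
            if match is not None:
                forgiven = abs(match.amount - amount)

    if match is None:
        # Rule 4: the exact wording of this message is an assumption.
        return "no match found"

    message = (f"{payment_id} paid {amount} amount for invoice "
               f"{match.invoice_id} on date {match.date}")
    if forgiven is not None:
        message += f"; forgave {forgiven}"
    return message
```

With the sample invoices from the prompt, a payment of "payment-id, 30000" would match invoice-id-2 by exact amount, while "payment-id, 9900" with forgiveness=100 would match invoice-id-1 and append "; forgave 100". Re-parsing the invoice list on every call keeps the sketch short; a longer-lived matcher would likely parse once and index invoices by id and by amount.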
Quick Answer: This question evaluates parsing and data modeling skills, deterministic matching algorithms with prioritization and tie-breaking, handling monetary values as integer cents, and the ability to design comprehensive test suites for edge cases.
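As a small illustration of why integer cents are preferred over floats, the following Python sketch contrasts the two. The helper name within_forgiveness is hypothetical and only serves to show that the boundary checks in the forgiveness rule stay exact with integers.

```python
# Float dollars: binary floating point cannot represent 0.10 exactly,
# so a sum that looks equal on paper fails an equality check.
float_total = 0.10 + 0.20
float_matches = float_total == 0.30   # False

# Integer cents: addition and comparison are exact.
cent_total = 10 + 20
cent_matches = cent_total == 30       # True


def within_forgiveness(invoice_cents: int, payment_cents: int,
                       forgiveness_cents: int) -> bool:
    # Hypothetical helper: exact integer arithmetic makes the
    # "boundary equals" and "just over the limit" cases unambiguous.
    return abs(invoice_cents - payment_cents) <= forgiveness_cents
```

With integers, a 9900-cent payment against a 10000-cent invoice is within a 100-cent forgiveness, and a 9899-cent payment is not; neither result depends on rounding.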