This question evaluates string parsing, record linkage, and algorithmic matching skills for data reconciliation tasks, including handling messy memo text, exact amount matches, and tie-breaking by earliest due date.
You are building a small reconciliation tool that matches payments to invoices.
Assume you are given:
invoices: a list of invoice objects/records with fields:
- invoice_id: string (unique)
- amount: int (or decimal; assume exact match is possible)
- due_date: string in ISO format YYYY-MM-DD
payments: a list of payment objects/records with fields:
- payment_id: string (unique)
- amount: int
- memo: string (free-form text; may contain extra spaces)
Some payments have a standard memo format that includes an invoice id, e.g.:
"Paying off: INV-12345 ..."
If the memo contains the substring "Paying off:", then the invoice id immediately following it (after trimming spaces) is the target invoice_id. (You may assume the invoice id token ends at the next whitespace.)
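The memo rule above can be sketched with a regular expression. This is a minimal illustration, not a prescribed solution: the helper name extract_invoice_id is hypothetical, and the sketch assumes the marker is matched case-sensitively and that zero or more whitespace characters may separate it from the token.

```python
import re
from typing import Optional

# "Paying off:", then optional whitespace, then the token up to the next whitespace.
_PAYING_OFF = re.compile(r"Paying off:\s*(\S+)")


def extract_invoice_id(memo: str) -> Optional[str]:
    """Return the invoice id token that follows "Paying off:", or None.

    None is returned both when the marker is absent and when the marker
    is present but no token follows it (an assumption of this sketch).
    """
    match = _PAYING_OFF.search(memo)
    return match.group(1) if match else None
```

For example, extract_invoice_id("Paying off:    INV-12345 thanks") yields "INV-12345", while a memo without the marker yields None.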
Implement matching logic to process each payment and produce a match result.
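Part 1: For payments whose memo contains "Paying off:":
- Match the payment to the invoice with the extracted invoice_id.
- If no such invoice exists, mark the payment as unmatched (e.g., with an error message such as "cannot find invoice_id=...").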
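One way to sketch the ID-based lookup is to index the invoices by invoice_id once and then look each target id up in that dictionary. The function name match_by_id, the dict-shaped records, and the (matched_invoice_id, error_message) return pair are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def match_by_id(
    target_id: str, invoices_by_id: Dict[str, dict]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (matched_invoice_id, error_message); exactly one of them is None."""
    if target_id in invoices_by_id:
        return target_id, None
    return None, f"cannot find invoice_id={target_id}"


# Sample data, invented for this example.
invoices = [
    {"invoice_id": "INV-1", "amount": 100, "due_date": "2024-01-31"},
    {"invoice_id": "INV-2", "amount": 250, "due_date": "2024-02-15"},
]

# Build the index once so each lookup is a dictionary access
# instead of a scan over the whole invoice list.
invoices_by_id = {inv["invoice_id"]: inv for inv in invoices}
```

With this data, match_by_id("INV-2", invoices_by_id) returns ("INV-2", None), and an unknown id returns None together with the error text.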
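Part 2: Extend the logic as follows:
- If the memo contains "Paying off:", use the ID-based logic from Part 1.
- Otherwise, match the payment to an invoice where invoice.amount == payment.amount.
- If no invoice has that amount, mark the payment as unmatched (e.g., with an error message such as "cannot find matching invoice for amount=...").
- If several invoices have that amount, choose the one with the earliest due_date.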
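The amount-based fallback with its tie-break might look like the sketch below. The name match_by_amount and the dict-shaped records are assumptions. Because due_date is an ISO YYYY-MM-DD string, plain string comparison already orders dates chronologically, so no date parsing is needed. The problem statement does not say what to do when two candidates share the same due_date; this sketch keeps whichever appears first in the list.

```python
from typing import List, Optional, Tuple


def match_by_amount(
    amount: int, invoices: List[dict]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (matched_invoice_id, error_message) for an exact amount match."""
    candidates = [inv for inv in invoices if inv["amount"] == amount]
    if not candidates:
        return None, f"cannot find matching invoice for amount={amount}"
    # min() returns the first minimal element, so equal due dates
    # resolve to the earlier position in the input list.
    best = min(candidates, key=lambda inv: inv["due_date"])
    return best["invoice_id"], None


# Sample data, invented for this example.
sample_invoices = [
    {"invoice_id": "INV-1", "amount": 100, "due_date": "2024-03-01"},
    {"invoice_id": "INV-2", "amount": 100, "due_date": "2024-01-15"},
    {"invoice_id": "INV-3", "amount": 250, "due_date": "2024-02-01"},
]
```

Here a payment of 100 matches INV-2, since its due date is earlier than that of INV-1.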
Return a list of match results (one per payment), where each result includes:
- payment_id
- matched_invoice_id (or null if unmatched)
- match_mode: one of { "id", "amount" } (or "unmatched")
- error_message for unmatched cases
Be robust to extra spaces in memo around "Paying off:" and/or the invoice id token.
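The whole flow described above can be sketched end to end as follows. The names MatchResult and reconcile are hypothetical, records are assumed to be plain dicts, and null is represented as None. Following the rules as stated, a payment whose memo names a missing invoice is reported as unmatched and does not fall back to amount matching.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

MARKER = "Paying off:"
_TOKEN = re.compile(r"Paying off:\s*(\S+)")


@dataclass
class MatchResult:
    payment_id: str
    matched_invoice_id: Optional[str]
    match_mode: str  # "id", "amount", or "unmatched"
    error_message: Optional[str] = None


def reconcile(invoices: List[dict], payments: List[dict]) -> List[MatchResult]:
    """Produce one MatchResult per payment, in input order."""
    by_id: Dict[str, dict] = {inv["invoice_id"]: inv for inv in invoices}
    results: List[MatchResult] = []
    for pay in payments:
        pid, amount, memo = pay["payment_id"], pay["amount"], pay["memo"]
        if MARKER in memo:
            # ID-based path: the token after the marker is the target id.
            found = _TOKEN.search(memo)
            target = found.group(1) if found else ""
            if target in by_id:
                results.append(MatchResult(pid, target, "id"))
            else:
                msg = f"cannot find invoice_id={target}"
                results.append(MatchResult(pid, None, "unmatched", msg))
            continue
        # Amount-based path: exact amount, earliest ISO due_date wins.
        candidates = [inv for inv in invoices if inv["amount"] == amount]
        if candidates:
            best = min(candidates, key=lambda inv: inv["due_date"])
            results.append(MatchResult(pid, best["invoice_id"], "amount"))
        else:
            msg = f"cannot find matching invoice for amount={amount}"
            results.append(MatchResult(pid, None, "unmatched", msg))
    return results


# Sample data, invented for this example.
invoices = [
    {"invoice_id": "INV-1", "amount": 100, "due_date": "2024-03-01"},
    {"invoice_id": "INV-2", "amount": 100, "due_date": "2024-01-15"},
    {"invoice_id": "INV-3", "amount": 250, "due_date": "2024-02-01"},
]
payments = [
    {"payment_id": "P1", "amount": 250, "memo": "  Paying off:   INV-3  thanks"},
    {"payment_id": "P2", "amount": 100, "memo": "Paying off: INV-404"},
    {"payment_id": "P3", "amount": 100, "memo": "monthly services"},
    {"payment_id": "P4", "amount": 999, "memo": "misc"},
]
results = reconcile(invoices, payments)
```

In this run P1 matches INV-3 by id despite the extra spaces, P2 is unmatched because INV-404 does not exist, P3 matches INV-2 by amount with the earlier due date, and P4 is unmatched because no invoice has amount 999.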