PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This Coding & Algorithms question for a Data Scientist evaluates string parsing and pattern matching, keyed and numeric record matching, date-based tie-breaking, and algorithmic complexity analysis, and is commonly asked to test handling of messy input formats and deterministic reconciliation rules.

  • Google
  • Coding & Algorithms
  • Data Scientist

Match payments to invoices by memo or amount

Company: Google

Role: Data Scientist

Category: Coding & Algorithms

Interview Round: Technical Screen

You are building a small payment-to-invoice matching utility. ### Data You are given: - `invoices`: a list of invoice records with: - `invoice_id` (string) - `amount_cents` (integer) - `due_date` (string, ISO-8601 date like `YYYY-MM-DD`) - `payments`: a list of payment records with: - `payment_id` (string) - `amount_cents` (integer) - `memo` (string; may contain extra whitespace) Assume `invoice_id` is unique in `invoices`. ### Part 1 (Memo → invoice_id matching) A payment memo may contain a standardized invoice reference in the form: - `"Paying off: <INVOICE_ID>"` where `<INVOICE_ID>` is the invoice id you should match to. Write a function that, for each payment, extracts the target `invoice_id` from the memo (if present), finds the corresponding invoice in `invoices`, and outputs a matched result. **Output requirement (per payment):** - If a match is found: output the `payment_id`, extracted `invoice_id`, and the matched invoice details. - If no match is found (or memo cannot be parsed): output a clear "not found" result for that payment. Handle edge cases such as extra spaces in the memo, missing/empty memo, or malformed standardized strings. ### Part 2 (Hybrid matching: memo-based OR amount-based) Extend the logic: 1) If `memo` contains the standardized prefix `"Paying off:"`, use Part 1 memo → `invoice_id` matching. 2) Otherwise, match by amount: - Find invoices where `invoice.amount_cents == payment.amount_cents`. - If there are multiple candidates, choose the invoice with the earliest `due_date`. - If there is no candidate, output: `"cannot find matching invoice for amount <amount_cents>"` (or equivalent). State the time and space complexity of your approach and justify it.

Quick Answer: This Coding & Algorithms question for a Data Scientist evaluates string parsing and pattern matching, keyed and numeric record matching, date-based tie-breaking, and algorithmic complexity analysis, and is commonly asked to test handling of messy input formats and deterministic reconciliation rules.

Match Payments to Invoices by Memo

You are building a payment-to-invoice matching utility. You are given: - `invoices`: a list of invoice records, each a dict with `invoice_id` (string), `amount_cents` (int), and `due_date` (string, ISO-8601 `YYYY-MM-DD`). `invoice_id` is unique. - `payments`: a list of payment records, each a dict with `payment_id` (string), `amount_cents` (int), and `memo` (string; may contain extra whitespace). A payment memo may contain a standardized invoice reference of the form `"Paying off: <INVOICE_ID>"`, where `<INVOICE_ID>` is the invoice id to match to. For each payment, extract the target `invoice_id` from the memo (if present), find the corresponding invoice, and produce a result. For each payment return a dict: - If matched: `{'payment_id', 'matched': True, 'invoice_id', 'invoice': <the invoice dict>}`. - If the invoice id was extracted but no invoice exists: `{'payment_id', 'matched': False, 'invoice_id', 'reason': 'invoice not found'}`. - If the memo is missing/empty, has no `"Paying off:"` prefix, or the prefix is present but the id after it is empty: `{'payment_id', 'matched': False, 'reason': 'no invoice reference in memo'}`. Handle edge cases: extra spaces around the id, empty memo, and a malformed `"Paying off:"` with nothing after it. Return the results in the same order as `payments`.

Constraints

  • 1 <= len(invoices), len(payments) (both may also be empty -> return [])
  • invoice_id is unique across invoices
  • memo may be empty, None, or contain extra whitespace around the id
  • The standardized reference, when present, is exactly the prefix 'Paying off:' followed by the invoice id

Examples

Input: ([{'invoice_id': 'INV-001', 'amount_cents': 5000, 'due_date': '2024-01-15'}, {'invoice_id': 'INV-002', 'amount_cents': 7500, 'due_date': '2024-02-01'}], [{'payment_id': 'PAY-1', 'amount_cents': 5000, 'memo': 'Paying off: INV-001'}, {'payment_id': 'PAY-2', 'amount_cents': 7500, 'memo': ' Paying off: INV-002 '}])

Expected Output: [{'payment_id': 'PAY-1', 'matched': True, 'invoice_id': 'INV-001', 'invoice': {'invoice_id': 'INV-001', 'amount_cents': 5000, 'due_date': '2024-01-15'}}, {'payment_id': 'PAY-2', 'matched': True, 'invoice_id': 'INV-002', 'invoice': {'invoice_id': 'INV-002', 'amount_cents': 7500, 'due_date': '2024-02-01'}}]

Explanation: Both memos carry a valid 'Paying off:' reference; the second has extra surrounding and internal whitespace which is stripped, yielding INV-002.

Input: ([{'invoice_id': 'INV-001', 'amount_cents': 5000, 'due_date': '2024-01-15'}], [{'payment_id': 'PAY-3', 'amount_cents': 5000, 'memo': 'Paying off: INV-999'}, {'payment_id': 'PAY-4', 'amount_cents': 5000, 'memo': ''}, {'payment_id': 'PAY-5', 'amount_cents': 5000, 'memo': 'random note'}])

Expected Output: [{'payment_id': 'PAY-3', 'matched': False, 'invoice_id': 'INV-999', 'reason': 'invoice not found'}, {'payment_id': 'PAY-4', 'matched': False, 'reason': 'no invoice reference in memo'}, {'payment_id': 'PAY-5', 'matched': False, 'reason': 'no invoice reference in memo'}]

Explanation: PAY-3 extracts INV-999 which is absent (invoice not found). PAY-4 has an empty memo and PAY-5 has no prefix, so both are 'no invoice reference'.

Input: ([], [])

Expected Output: []

Explanation: No payments to process, so the result list is empty.

Input: ([{'invoice_id': 'A1', 'amount_cents': 100, 'due_date': '2024-03-03'}], [{'payment_id': 'PX', 'amount_cents': 100, 'memo': 'Paying off:'}])

Expected Output: [{'payment_id': 'PX', 'matched': False, 'reason': 'no invoice reference in memo'}]

Explanation: The prefix is present but nothing follows it, so the extracted id is empty and treated as no reference.

Hints

  1. Build a dictionary mapping invoice_id -> invoice for O(1) lookups instead of scanning the invoice list per payment.
  2. Use str.find('Paying off:') to locate the prefix, then slice everything after it and .strip() to remove surrounding whitespace.
  3. Treat three failure modes the same 'no invoice reference' way: missing/empty memo, no prefix, and prefix present but nothing after it. A non-empty extracted id that simply isn't in the map is a distinct 'invoice not found' case.

Hybrid Payment Matching: Memo or Amount

Extend the payment-to-invoice matcher to support two matching modes. Inputs are the same as the first problem: `invoices` (each with `invoice_id`, `amount_cents`, `due_date`) and `payments` (each with `payment_id`, `amount_cents`, `memo`). `invoice_id` is unique. For each payment, in order: 1. If `memo` contains the standardized prefix `"Paying off:"`, use memo->invoice_id matching exactly as in the first problem (extract the id after the prefix, strip whitespace, look it up). 2. Otherwise, match by amount: find invoices whose `amount_cents` equals the payment's `amount_cents`. If there are multiple candidates, choose the one with the earliest `due_date`. If there is no candidate, report that no invoice matches the amount. Return one result dict per payment (same order as `payments`), tagging which method was used: - Memo match found: `{'payment_id', 'matched': True, 'method': 'memo', 'invoice_id', 'invoice'}`. - Memo prefix present but id empty: `{'payment_id', 'matched': False, 'method': 'memo', 'reason': 'no invoice reference in memo'}`. - Memo id extracted but invoice missing: `{'payment_id', 'matched': False, 'method': 'memo', 'invoice_id', 'reason': 'invoice not found'}`. - Amount match found: `{'payment_id', 'matched': True, 'method': 'amount', 'invoice_id', 'invoice'}`. - No amount match: `{'payment_id', 'matched': False, 'method': 'amount', 'reason': 'cannot find matching invoice for amount <amount_cents>'}`. State the time and space complexity of your approach.

Constraints

  • invoices and payments may each be empty -> return []
  • invoice_id is unique across invoices
  • Multiple invoices may share the same amount_cents; the earliest due_date wins
  • ISO-8601 due_date strings (YYYY-MM-DD) are lexicographically comparable, so string comparison gives correct chronological ordering
  • memo may be None, empty, or contain whitespace around the id

Examples

Input: ([{'invoice_id': 'INV-001', 'amount_cents': 5000, 'due_date': '2024-01-15'}, {'invoice_id': 'INV-002', 'amount_cents': 7500, 'due_date': '2024-02-01'}], [{'payment_id': 'PAY-1', 'amount_cents': 5000, 'memo': 'Paying off: INV-001'}, {'payment_id': 'PAY-2', 'amount_cents': 7500, 'memo': 'monthly transfer'}])

Expected Output: [{'payment_id': 'PAY-1', 'matched': True, 'method': 'memo', 'invoice_id': 'INV-001', 'invoice': {'invoice_id': 'INV-001', 'amount_cents': 5000, 'due_date': '2024-01-15'}}, {'payment_id': 'PAY-2', 'matched': True, 'method': 'amount', 'invoice_id': 'INV-002', 'invoice': {'invoice_id': 'INV-002', 'amount_cents': 7500, 'due_date': '2024-02-01'}}]

Explanation: PAY-1 has a memo reference and matches by memo. PAY-2 has no prefix, so it falls back to amount matching and finds the unique 7500-cent invoice INV-002.

Input: ([{'invoice_id': 'A', 'amount_cents': 1000, 'due_date': '2024-05-10'}, {'invoice_id': 'B', 'amount_cents': 1000, 'due_date': '2024-03-01'}, {'invoice_id': 'C', 'amount_cents': 1000, 'due_date': '2024-04-20'}], [{'payment_id': 'P1', 'amount_cents': 1000, 'memo': 'no reference here'}])

Expected Output: [{'payment_id': 'P1', 'matched': True, 'method': 'amount', 'invoice_id': 'B', 'invoice': {'invoice_id': 'B', 'amount_cents': 1000, 'due_date': '2024-03-01'}}]

Explanation: Three invoices share amount 1000; with no memo reference, amount matching picks the earliest due_date (2024-03-01), which is invoice B.

Input: ([{'invoice_id': 'X', 'amount_cents': 200, 'due_date': '2024-01-01'}], [{'payment_id': 'PN', 'amount_cents': 999, 'memo': ''}, {'payment_id': 'PM', 'amount_cents': 200, 'memo': 'Paying off: NOPE'}])

Expected Output: [{'payment_id': 'PN', 'matched': False, 'method': 'amount', 'reason': 'cannot find matching invoice for amount 999'}, {'payment_id': 'PM', 'matched': False, 'method': 'memo', 'invoice_id': 'NOPE', 'reason': 'invoice not found'}]

Explanation: PN has no memo prefix and no invoice with amount 999, so amount matching fails. PM has a memo reference to NOPE which does not exist, so memo matching reports invoice not found.

Input: ([], [])

Expected Output: []

Explanation: No payments to process, so the result list is empty.

Hints

  1. Precompute two structures from invoices: a dict invoice_id -> invoice, and a dict amount_cents -> list of invoices. This avoids re-scanning all invoices for every payment.
  2. Branch on whether 'Paying off:' is in the memo: if so reuse the first problem's memo logic; otherwise go to the amount bucket.
  3. For the amount tiebreak, since due_date is ISO-8601 (YYYY-MM-DD), plain string min() on due_date already yields the earliest date with no date parsing.
  4. Tag every result with 'method' ('memo' or 'amount') so callers can tell which path produced the match.
Last updated: Jun 21, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Infection Spread on a Grid (Cellular Automaton) - Google (hard)
  • Boolean Expression Tree with Leaf Flips - Google (medium)
  • Streaming Points: Remove Any Pair Within a Distance - Google (medium)
  • Most Active Users in a Live Communication Stream - Google (medium)
  • Solve Rooms and Top-K Streams - Google (medium)