Validate transactions with risk rules and reporting
Company: Stripe
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates skills in CSV parsing and input validation, numerical and timestamp parsing, rule-based fraud detection and behavioral feature matching, plus error prioritization and generation of fixed-width reports.
Part 1: Verify Transaction Data Integrity
Constraints
- 0 <= number of transaction rows <= 100000
- The first row, when present, is a header.
- Each transaction row is intended to have 6 CSV fields.
- Whitespace around fields must be ignored when checking emptiness.
- Rows should be processed in input order.
Examples
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't1,u1,2026-01-22T13:45:00Z,20.50,US,VISA', 't2,u2,2026-01-22T14:00:00Z,5,CA,PAYPAL'],)
Expected Output: [['t1', 'u1', 'OK'], ['t2', 'u2', 'OK']]
Explanation: Both transaction rows have all 6 fields present after trimming.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', ' t3 , u3 , 2026-01-22T13:45:00Z , , US , VISA ', ' , u4 , 2026-01-22T13:45:00Z , 10 , US , VISA'],)
Expected Output: [['t3', 'u3', 'MISSING_FIELD'], ['', 'u4', 'MISSING_FIELD']]
Explanation: The first row has an empty amount after trimming. The second row has an empty transaction_id.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method'],)
Expected Output: []
Explanation: There are no transaction rows after the header.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't5,u5,2026-01-22T13:45:00Z,30,US,'],)
Expected Output: [['t5', 'u5', 'MISSING_FIELD']]
Explanation: The payment_method field is present but empty.
Hints
- Skip the header before validating transaction rows.
- Apply strip() to every field before checking whether it is empty.
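The hints above can be sketched as follows. This is a minimal illustration, not a reference solution: the function name `verify_integrity` is hypothetical, rows are split on plain commas (no quoted fields), and padding a short row so that it reports MISSING_FIELD is an assumption the source does not cover.

```python
def verify_integrity(rows):
    """Return [transaction_id, user_id, status] for every row after the header."""
    results = []
    for line in rows[1:]:  # rows[0] is the header; an empty input yields no rows
        fields = [f.strip() for f in line.split(",")]
        # ASSUMPTION: a row with fewer than 6 fields is padded with blanks,
        # so it is reported as MISSING_FIELD; extra fields are ignored.
        fields += [""] * (6 - len(fields))
        status = "OK" if all(fields[:6]) else "MISSING_FIELD"
        results.append([fields[0], fields[1], status])
    return results
```

Because each field is stripped before the emptiness check, a field holding only spaces counts as missing, which matches the second example.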
Part 2: Validate High-Risk Transaction Rules
Constraints
- 0 <= number of transaction rows <= 100000
- min_amount <= max_amount
- The allowed amount range is inclusive.
- Payment method matching is exact after trimming the transaction field.
- Amount values should be treated as decimal numbers, not integers only.
Examples
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't1,u1,2026-01-22T10:00:00Z,50,US,VISA', 't2,u2,2026-01-22T10:00:00Z,150,US,VISA'], 10, 100, ['PAYPAL'])
Expected Output: [['t1', 'u1', 'OK'], ['t2', 'u2', 'SUSPICIOUS']]
Explanation: t1 is within range and uses an allowed method. t2's amount exceeds max_amount.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't3,u3,2026-01-22T10:00:00Z,abc,US,VISA', 't4,u4,2026-01-22T10:00:00Z,20,US,PAYPAL'], 0, 100, ['PAYPAL'])
Expected Output: [['t3', 'u3', 'SUSPICIOUS'], ['t4', 'u4', 'SUSPICIOUS']]
Explanation: t3 has an unparseable amount. t4 uses a blocked payment method.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't5,u5,2026-01-22T10:00:00Z,10,US,VISA', 't6,u6,2026-01-22T10:00:00Z,100,US,VISA'], 10, 100, [])
Expected Output: [['t5', 'u5', 'OK'], ['t6', 'u6', 'OK']]
Explanation: The min_amount and max_amount boundaries are inclusive.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method'], 0, 100, ['PAYPAL'])
Expected Output: []
Explanation: There are no transaction rows to validate.
Hints
- Convert blocked_payment_methods to a set for O(1) lookups.
- Treat an unparseable amount the same as an out-of-range amount.
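One possible shape for these rules is shown below. The names `parse_amount` and `validate_risk` are hypothetical, and using `decimal.Decimal` for the amount is one reasonable choice rather than something the source prescribes. Rejecting non-finite values such as "NaN" is also an assumption.

```python
from decimal import Decimal, InvalidOperation

def parse_amount(text):
    """Return a finite Decimal, or None when the text is not a usable number."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    # ASSUMPTION: "NaN" and "Infinity" parse as Decimals but are treated as invalid.
    return value if value.is_finite() else None

def validate_risk(rows, min_amount, max_amount, blocked_payment_methods):
    """Return [transaction_id, user_id, 'OK' | 'SUSPICIOUS'] per transaction row."""
    blocked = set(blocked_payment_methods)  # O(1) membership checks
    results = []
    for line in rows[1:]:  # skip the header
        fields = [f.strip() for f in line.split(",")]
        fields += [""] * (6 - len(fields))  # assumption: pad short rows with blanks
        amount = parse_amount(fields[3])
        # An unparseable amount is handled exactly like an out-of-range one.
        in_range = amount is not None and min_amount <= amount <= max_amount
        suspicious = (not in_range) or fields[5] in blocked
        results.append([fields[0], fields[1], "SUSPICIOUS" if suspicious else "OK"])
    return results
```

Chained comparison keeps both bounds inclusive, so amounts equal to min_amount or max_amount stay OK.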
Part 3: Match Transactions Against User Behavior Baselines
Constraints
- 0 <= number of transaction rows <= 100000
- Every non-empty user_id in the transactions exists in the baseline maps.
- usual_amount_ranges[user_id] contains exactly two numeric values [low, high] with low <= high.
- Amount range checks are inclusive.
- If a transaction amount or timestamp cannot be interpreted, that feature does not match.
Examples
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't1,u1,2026-01-22T09:45:00Z,25,US,VISA'], {'u1': ['US', 'CA']}, {'u1': ['MORNING']}, {'u1': [10, 50]})
Expected Output: [['t1', 'u1', '1.00', 'OK']]
Explanation: Country, MORNING bucket, and amount all match.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't2,u2,2026-01-22T13:00:00Z,30,MX,VISA'], {'u2': ['US']}, {'u2': ['AFTERNOON']}, {'u2': [10, 50]})
Expected Output: [['t2', 'u2', '0.67', 'OK']]
Explanation: Time bucket and amount match, but country does not. Ratio is 2/3, which is not below 0.5.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't3,u3,2026-01-22T23:00:00Z,30,MX,VISA'], {'u3': ['US']}, {'u3': ['MORNING']}, {'u3': [10, 50]})
Expected Output: [['t3', 'u3', '0.33', 'BEHAVIOR_MISMATCH']]
Explanation: Only amount matches, so the ratio is 1/3, which is below 0.5.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't4,u4,2026-01-22T05:59:00Z,100,JP,VISA'], {'u4': ['JP']}, {'u4': ['NIGHT']}, {'u4': [100, 100]})
Expected Output: [['t4', 'u4', '1.00', 'OK']]
Explanation: 05:59 is NIGHT, and amount range boundaries are inclusive.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method'], {}, {}, {})
Expected Output: []
Explanation: There are no transactions after the header.
Hints
- Convert each user's usual countries and usual time buckets to sets before processing rows.
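- The behavior score is the number of matching features (country, time bucket, amount) divided by 3, formatted to two decimal places; a score below 0.5 is reported as BEHAVIOR_MISMATCH.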
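A sketch of the behavior match, under loudly stated assumptions: the source never defines the time-bucket boundaries, so the six-hour buckets below (including the name "EVENING") are a guess that merely agrees with the examples (05:59 is NIGHT, 09:45 is MORNING, 13:00 is AFTERNOON). The function names are hypothetical, and timestamps are assumed to look like `2026-01-22T09:45:00Z`.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# ASSUMED boundaries: 00-05 NIGHT, 06-11 MORNING, 12-17 AFTERNOON, 18-23 EVENING.
BUCKETS = ("NIGHT", "MORNING", "AFTERNOON", "EVENING")

def time_bucket(timestamp):
    """Return the bucket name, or None if the timestamp cannot be parsed."""
    try:
        hour = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").hour
    except ValueError:
        return None
    return BUCKETS[hour // 6]

def parse_amount(text):
    """Return a finite Decimal, or None for blank or non-numeric text."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None

def match_behavior(rows, usual_countries, usual_time_buckets, usual_amount_ranges):
    """Return [transaction_id, user_id, score, status] per transaction row."""
    countries = {u: set(v) for u, v in usual_countries.items()}
    buckets = {u: set(v) for u, v in usual_time_buckets.items()}
    results = []
    for line in rows[1:]:  # skip the header
        fields = [f.strip() for f in line.split(",")]
        fields += [""] * (6 - len(fields))  # assumption: pad short rows
        tx_id, user_id, timestamp, amount_text, country = fields[:5]
        amount = parse_amount(amount_text)
        # Users missing from the baselines (e.g. a blank user_id) match nothing.
        low, high = usual_amount_ranges.get(user_id, (None, None))
        matches = sum([
            country in countries.get(user_id, ()),
            time_bucket(timestamp) in buckets.get(user_id, ()),
            amount is not None and low is not None and low <= amount <= high,
        ])
        # matches / 3 < 0.5 is the same as matches * 2 < 3, without float compare.
        status = "OK" if matches * 2 >= 3 else "BEHAVIOR_MISMATCH"
        results.append([tx_id, user_id, f"{matches / 3:.2f}", status])
    return results
```

Comparing `matches * 2` against 3 avoids any floating-point edge case in the threshold test; the float division is used only for the two-decimal display string.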
Part 4: Report Prioritized Fraud Error Codes
Constraints
- 0 <= number of transaction rows <= 100000
- min_amount <= max_amount and all configured ranges are inclusive.
- A missing amount can trigger both MISSING_FIELD and AMOUNT_OUT_OF_RANGE.
- Return at most two error codes per transaction.
- Rows must be processed in input order.
Examples
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't1,u1,2026-01-22T06:00:00Z,10,US,VISA'], 10, 100, ['PAYPAL'], {'u1': ['US']}, {'u1': ['MORNING']}, {'u1': [10, 10]})
Expected Output: [['t1', 'u1', 'OK']]
Explanation: All integrity, risk, and behavior checks pass. Boundary amounts are inclusive.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't2,u2,2026-01-22T13:00:00Z,999,US,PAYPAL'], 0, 100, ['PAYPAL'], {'u2': ['CA']}, {'u2': ['NIGHT']}, {'u2': [0, 10]})
Expected Output: [['t2', 'u2', 'AMOUNT_OUT_OF_RANGE,BLOCKED_PAYMENT_METHOD']]
Explanation: Amount, payment method, and behavior all fail, but only the top two priority errors are returned.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't3,u3,2026-01-22T13:00:00Z, ,US,PAYPAL'], 0, 100, ['PAYPAL'], {'u3': ['US']}, {'u3': ['AFTERNOON']}, {'u3': [0, 100]})
Expected Output: [['t3', 'u3', 'MISSING_FIELD,AMOUNT_OUT_OF_RANGE']]
Explanation: The blank amount is missing and unparseable. The blocked payment method is lower priority and is omitted due to the two-error cap.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method', 't4,u4,2026-01-22T23:00:00Z,50,MX,VISA'], 0, 100, [], {'u4': ['US']}, {'u4': ['MORNING']}, {'u4': [0, 100]})
Expected Output: [['t4', 'u4', 'BEHAVIOR_MISMATCH']]
Explanation: Only one of the three behavior features matches, so the behavior ratio is below 0.5.
Input: (['transaction_id,user_id,timestamp,amount,country,payment_method'], 0, 100, ['PAYPAL'], {}, {}, {})
Expected Output: []
Explanation: There are no transaction rows to report.
Hints
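- Collect errors in a set, then scan the priority list and keep the first two errors that are present. The examples imply the order MISSING_FIELD, AMOUNT_OUT_OF_RANGE, BLOCKED_PAYMENT_METHOD, BEHAVIOR_MISMATCH.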
- Compute behavior mismatch independently, even if other risk errors are already present; the priority cap decides what is displayed.
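Putting the pieces together, the prioritized report might look like the sketch below. Several things here are assumptions rather than stated requirements: the priority order is inferred from the example outputs, the six-hour time buckets (and the name "EVENING") are a guess consistent with the examples, and all function names are hypothetical.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# ASSUMED order, inferred from the example outputs rather than stated in the text.
PRIORITY = ["MISSING_FIELD", "AMOUNT_OUT_OF_RANGE",
            "BLOCKED_PAYMENT_METHOD", "BEHAVIOR_MISMATCH"]

def _amount(text):
    """Return a finite Decimal, or None for blank or non-numeric text."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None

def _bucket(timestamp):
    """ASSUMED six-hour buckets; returns None for an unparseable timestamp."""
    try:
        hour = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").hour
    except ValueError:
        return None
    return ("NIGHT", "MORNING", "AFTERNOON", "EVENING")[hour // 6]

def report_errors(rows, min_amount, max_amount, blocked_payment_methods,
                  usual_countries, usual_time_buckets, usual_amount_ranges):
    """Return [transaction_id, user_id, codes] with at most two codes per row."""
    blocked = set(blocked_payment_methods)
    results = []
    for line in rows[1:]:  # skip the header, keep input order
        fields = [f.strip() for f in line.split(",")]
        fields += [""] * (6 - len(fields))  # assumption: pad short rows
        tx_id, user_id, timestamp, amount_text, country, method = fields[:6]
        amount = _amount(amount_text)
        errors = set()
        if not all(fields[:6]):
            errors.add("MISSING_FIELD")
        # A blank amount is also unparseable, so it lands here as well.
        if amount is None or not (min_amount <= amount <= max_amount):
            errors.add("AMOUNT_OUT_OF_RANGE")
        if method in blocked:
            errors.add("BLOCKED_PAYMENT_METHOD")
        # Behavior is evaluated independently of the errors found so far.
        low, high = usual_amount_ranges.get(user_id, (None, None))
        matches = sum([
            country in usual_countries.get(user_id, ()),
            _bucket(timestamp) in usual_time_buckets.get(user_id, ()),
            amount is not None and low is not None and low <= amount <= high,
        ])
        if matches * 2 < 3:  # ratio below 0.5
            errors.add("BEHAVIOR_MISMATCH")
        top_two = [code for code in PRIORITY if code in errors][:2]
        results.append([tx_id, user_id, ",".join(top_two) if top_two else "OK"])
    return results
```

Gathering every error first and applying the cap last keeps each check independent, so changing the priority order only means editing the `PRIORITY` list.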