Evaluates CSV parsing, numeric conversions and rounding, map lookups with fallbacks, transactional fee computation and per-merchant aggregation in the Coding & Algorithms category for Data Engineer roles.
You are given a CSV string representing a list of payment transactions. Write a function that parses the CSV and returns a CSV string with the total processing fee per merchant.
The input has a header and the following columns:
transaction_id
(string)
merchant_id
(string) — the “person” you should aggregate fees for
status
(string) — one of:
payment_completed
,
payment_pending
,
payment_failed
,
refund_completed
,
dispute_won
,
dispute_lost
payment_provider
(string) — e.g.,
card
,
bank_transfer
,
paypal
buyer_country
(string, ISO-2) — e.g.,
US
,
DE
,
FR
currency
(string, ISO-3) — e.g.,
USD
,
EUR
amount
(decimal) — transaction amount in the given
currency
Assume:
You are also given (as in-memory dictionaries/maps):
base_rate_by_provider[payment_provider] -> decimal
payment_completed
.
completed_rate_by_provider_and_country[(payment_provider, buyer_country)] -> decimal
status = 'payment_completed'
.
fx_to_usd[currency] -> decimal
currency
by this factor to convert to USD.
Fee calculation per transaction:
status = 'refund_completed'
: fee is
0
(no refund fee).
status = 'payment_completed'
:
amount
to USD:
amount_usd = amount * fx_to_usd[currency]
.
fee_usd = amount_usd * completed_rate + 0.30
where
completed_rate
comes from
completed_rate_by_provider_and_country
.
payment_pending
,
payment_failed
,
dispute_won
,
dispute_lost
):
amount
to USD:
amount_usd = amount * fx_to_usd[currency]
.
fee_usd = amount_usd * base_rate + 0.30
where
base_rate
comes from
base_rate_by_provider
.
Lookups and fallbacks:
(payment_provider, buyer_country)
is missing for a completed payment, fall back to
base_rate_by_provider[payment_provider]
.
payment_provider
is unknown, treat its rate as
0
(still apply the
$0.30
fixed fee for non-refunds).
currency
is unknown, assume
fx_to_usd[currency] = 1.0
.
Return a CSV string with header:
merchant_id,total_fee_usd
Where total_fee_usd is the sum of fee_usd across that merchant’s transactions, rounded to 2 decimals. Order rows by merchant_id ascending.
Implement a function (in a language of your choice):
compute_fees_per_merchant(csv_str, base_rate_by_provider, completed_rate_by_provider_and_country, fx_to_usd) -> csv_str
Discuss time complexity and how you’d test edge cases (e.g., empty input, malformed rows, missing lookup keys).