Marketplace Data Store: Order Matching and End-of-Day Reconciliation
Company: Jane Street
Role: Machine Learning Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
Company: Jane Street
Role: Machine Learning Engineer
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
You are building the core of a marketplace where sellers list individual units of items for sale and buyers submit buy orders.
Each listing represents one unit for sale and has the fields:
item_name
(string) — the name of the item,
seller
(string) — the seller's name,
price
(integer) — the asking price for this unit,
buyer
(optional string) —
null
while the unit is unsold; set to the buyer's name once the unit is sold.
All listings live in a DataStore that supports two operations:
update(listing)
— insert a new listing or modify an existing one (for example, setting
buyer
when a unit is sold),
getAll()
— return all current listings.
Implement the DataStore and the two tasks below.
You are given the current contents of the DataStore and a collection of buy orders, at most one per item name: a map item_name -> (buyer_name, limit_price), meaning the named buyer wants to purchase exactly one unit of item_name and is willing to pay at most limit_price.
Process each buy order as follows:
buyer == null
) whose
item_name
matches the order.
price <= limit_price
. If several qualifying listings tie at that lowest price, fill the one that was inserted into the
DataStore
earliest.
price
(not the buyer's limit price). Mark the listing as sold by setting its
buyer
to
buyer_name
via
update
.
DataStore
is unchanged for that order.
Return a map item_name -> final_price containing one entry for every order that filled. Orders that did not fill do not appear in the result.
At the end of the trading day you receive the full day's transaction log. Each log record has:
buyer_name
(string),
item_name
(string),
price
(integer) — for
OK
and
PAYMENT_FAILED
records, the price the buyer transacted at; for
OUT_OF_STOCK
records, the maximum price the buyer was willing to pay,
timestamp
(integer) — all timestamps are distinct,
status
— one of:
OK
: the buyer bought a unit of
item_name
at
price
and the payment settled,
PAYMENT_FAILED
: the buyer bought a unit of
item_name
at
price
, but the payment later bounced (e.g., a credit-card problem), so the purchase is voided,
OUT_OF_STOCK
: the buyer wanted one unit of
item_name
for at most
price
, but no unit was available at that moment, so nothing was bought.
Reconcile the day by scanning the records in increasing timestamp order while maintaining a pool of re-opened units (units returned to the market during this scan). Apply these rules:
PAYMENT_FAILED
at price
p
: the unit the buyer had bought re-opens for sale — add the unit
(item_name, p)
to the pool. The buyer was never successfully charged, so they receive no billing adjustment.
OK
at price
p
: if the pool contains at least one re-opened unit of the same
item_name
with price
strictly less
than
p
, the buyer should have gotten that cheaper unit instead. Reassign the buyer to the cheapest such pool unit (earliest-added wins ties), whose price is
q
: record a
refund
of
p - q
for this buyer, remove that unit from the pool, and add the buyer's original unit
(item_name, p)
to the pool (it re-opens for sale). If the pool has no unit of this item cheaper than
p
, nothing happens.
OUT_OF_STOCK
with limit price
b
: if the pool contains at least one re-opened unit of the same
item_name
with price
q <= b
, the buyer's unfilled demand can now be satisfied. Assign the buyer the cheapest such pool unit (earliest-added wins ties): record a
charge
of
q
for this buyer and remove the unit from the pool. If no pool unit qualifies, nothing happens.
Note that a re-opened unit can only affect records that come after it in timestamp order: an OUT_OF_STOCK order whose timestamp precedes every relevant PAYMENT_FAILED record is never charged.
Return the list of billing adjustments as (buyer_name, adjustment_type, amount) triples, where adjustment_type is "refund" or "charge", in the order the adjustments are produced by the scan. A buyer may appear more than once if several of their records trigger adjustments.
Transaction log (already shown in timestamp order):
| buyer | item | price | timestamp | status |
|---|---|---|---|---|
| Alice | itemA | 100 | 1000 | PAYMENT_FAILED |
| Bob | itemA | 110 | 1010 | OK |
| Cathy | itemA | 105 | 1020 | OK |
(itemA, 100)
re-opens.
itemA
at 100 (< 110) is in the pool: Bob is reassigned to it and refunded
110 - 100 = 10
; his original unit
(itemA, 110)
re-opens.
itemA
costs 110, which is not cheaper than 105, so nothing happens.
Output: [("Bob", "refund", 10)]
| buyer | item | price | timestamp | status |
|---|---|---|---|---|
| Alice | itemA | 100 | 1000 | PAYMENT_FAILED |
| David | itemA | 110 | 1020 | OUT_OF_STOCK |
(itemA, 100)
re-opens.
itemA
for at most 110 but found no stock; the pool now holds
(itemA, 100)
with
100 <= 110
, so David gets that unit and is charged 100.
Output: [("David", "charge", 100)]
If instead David's OUT_OF_STOCK record had a timestamp before Alice's failure, no unit would be in the pool at his position in the scan, and David would not be charged.
1 <= number of listings, buy orders, log records <= 10^5
1 <= price, limit_price <= 10^9
item_name
,
seller
,
buyer_name
) are non-empty case-sensitive strings.
O((n + m) log (n + m))
where
n
is the number of listings/orders and
m
is the number of log records.