Detecting and Quantifying Wash Trading on a Centralized Exchange
Context
You are designing an analytics approach for a centralized exchange to detect and quantify wash trading on BTC‑USD (deep liquidity) and thin‑liquidity altcoin pairs. Assume access to:
- Full order/trade logs (orders, cancels, fills, order book snapshots), per account.
- Device/IP/session metadata and permissioned KYC signals.
- On‑chain deposit/withdrawal addresses linked to accounts.
- Maker/taker fees, rebates, and any incentive program data.
- Historical enforcement outcomes (if available).
State any minimal additional assumptions you need.
Task
Define a comprehensive approach that addresses:
a) Features from order/trade logs to flag potential wash trading, including:
- Self‑matches, rapid round‑trips, size/price mirroring across linked accounts, and order‑book position churn.
b) Graph heuristics to infer common control among accounts (e.g., shared devices/IPs, on‑chain funding links), implemented with privacy‑preserving hashing and strict access controls.
c) Thresholds that separate legitimate market making from manipulation, including discussion of precision/recall trade‑offs and differences between BTC‑USD and thin‑liquidity pairs.
d) Backtesting methodology using synthetic injected wash trades plus any available enforcement ground truth.
e) A daily risk score with confidence intervals and calibration checks.
f) Safeguards to avoid penalizing bona fide liquidity providers and how to surface cases to Compliance for review.
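To make parts (a) and (b) concrete, the following is a minimal sketch rather than a prescribed solution. It pseudonymizes device identifiers with a keyed hash (HMAC-SHA256), groups accounts that share a hashed device using union-find, and computes the share of traded volume where buyer and seller fall in the same inferred cluster (self-matches included). All names here (`SECRET_KEY`, `pseudonymize`, `cluster_accounts`, `wash_volume_share`) and the tuple layouts for the inputs are assumptions made for illustration; a real deployment would hold the key in a managed secret store and combine several link types, not devices alone.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; assume a KMS-managed, rotated secret in practice.
SECRET_KEY = b"example-key-do-not-use"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Keyed hash so raw device/IP values never enter the linkage graph."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


class UnionFind:
    """Disjoint-set structure used to merge accounts into common-control clusters."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps lookups fast on long chains.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_accounts(account_devices):
    """account_devices: iterable of (account_id, raw_device_id) pairs (assumed layout)."""
    uf = UnionFind()
    first_account_for_token = {}
    for account, device in account_devices:
        uf.find(account)  # register the account even if it links to nobody
        token = pseudonymize(device)
        if token in first_account_for_token:
            uf.union(first_account_for_token[token], account)
        else:
            first_account_for_token[token] = account
    return uf


def wash_volume_share(trades, uf):
    """trades: iterable of (buyer, seller, qty). Returns same-cluster volume / total volume."""
    total = 0.0
    flagged = 0.0
    for buyer, seller, qty in trades:
        total += qty
        if uf.find(buyer) == uf.find(seller):
            flagged += qty
    return flagged / total if total else 0.0


if __name__ == "__main__":
    links = [("A", "dev1"), ("B", "dev1"), ("C", "dev2")]
    trades = [("A", "B", 2.0), ("A", "C", 1.0), ("C", "C", 1.0)]
    uf = cluster_accounts(links)
    print(wash_volume_share(trades, uf))
```

A same-cluster volume share is only a candidate feature, not a verdict: a shared device can be a household or an office, so any threshold on it would need the pair-specific calibration and Compliance review described in parts (c) and (f).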