LLMs in Fraud Detection: Near-Term vs. Long-Term Roles
Context
You are designing fraud detection for a large-scale digital payments platform with:
- Real-time transaction scoring requiring sub-100 ms p95 latency and high availability.
- Multiple data sources: structured transaction/device/network signals; unstructured text (support chats, claims, merchant descriptions, KYC docs, emails); and semi-structured logs.
- A human review workflow for escalations and investigations.
Task
- Compare large language models (LLMs) to traditional supervised models for fraud detection across:
  - Data modalities unlocked (text, logs, documents)
  - Feature engineering vs. representation learning
  - Accuracy (new pattern discovery vs. steady-state classification)
  - Latency and cost
  - Interpretability and governance
  - Robustness to adaptive adversaries
  - Privacy/compliance
  - Lifecycle operations (training, deployment, monitoring, updates)
- Describe the near-term and long-term roles LLMs should play in this stack.
- Propose a hybrid architecture that integrates LLMs and traditional models to balance accuracy, latency, cost, and risk.
- Propose an evaluation plan (offline and online) to justify adoption, including metrics, ablations, cost/latency modeling, and risk controls.
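As a starting point for the hybrid-architecture question, here is a minimal, illustrative routing sketch: a fast traditional model makes the inline decision within the latency budget, and LLM-based enrichment of unstructured context (chats, KYC docs) is deferred to the asynchronous human-review queue. All names, thresholds, and features (`velocity_risk`, `device_risk`) are hypothetical, not part of the prompt.

```python
from dataclasses import dataclass

# Hypothetical score cutoffs for the inline decision.
APPROVE_THRESHOLD = 0.2
DECLINE_THRESHOLD = 0.8

@dataclass
class Decision:
    action: str          # "approve", "decline", or "escalate"
    score: float
    needs_llm_review: bool  # queue unstructured context for offline LLM enrichment

def fast_score(features: dict) -> float:
    """Stand-in for a low-latency supervised model (e.g. a GBDT)."""
    # Toy linear scorer over two hypothetical risk features.
    return min(1.0, 0.6 * features.get("velocity_risk", 0.0)
                    + 0.4 * features.get("device_risk", 0.0))

def route(features: dict) -> Decision:
    """Inline decision stays on the fast path; LLM work happens off-path."""
    score = fast_score(features)
    if score < APPROVE_THRESHOLD:
        return Decision("approve", score, False)
    if score >= DECLINE_THRESHOLD:
        # Decline inline; still enrich the case file for investigators.
        return Decision("decline", score, True)
    return Decision("escalate", score, True)

print(route({"velocity_risk": 0.1, "device_risk": 0.1}).action)  # approve
```

The key design choice this sketch encodes is that the LLM never sits on the sub-100 ms path: it only adds context where a human will read the output anyway.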