This question evaluates a data scientist's ability to define primary, diagnostic, and guardrail metrics for a product launch and to design rule-based suspicious-transaction detection using the available schema. It emphasizes metric definition, defensible proxy selection, SQL-based data modeling, and financial-crime domain knowledge.
You work on a fintech product with these existing tables (UTC timestamps). You may only use these tables/columns; if a metric cannot be measured directly, you must propose a defensible proxy using available data.
users
  user_id           BIGINT PRIMARY KEY
  create_date       TIMESTAMP
  country           VARCHAR

transactions
  transaction_id    BIGINT PRIMARY KEY
  user_id           BIGINT REFERENCES users(user_id)
  transaction_time  TIMESTAMP
  product           VARCHAR -- includes values like 'crypto', and may include 'ultra' if the plan is represented as a product
  amount_gbp        NUMERIC(18,2)
  status            VARCHAR -- 'completed' / 'declined'
  ip_country        VARCHAR

activity
  user_id           BIGINT REFERENCES users(user_id)
  event_time        TIMESTAMP
  product           VARCHAR -- may include 'ultra'
  event_type        VARCHAR -- 'view' / 'click'
Scenario A: A new Ultra subscription plan launches. For a 1-month evaluation window after launch, define primary, diagnostic, and guardrail metrics for the launch.
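One possible proxy for launch success is a simple exposure-to-purchase funnel: users with any 'ultra' event in activity during the window, and the subset of those with a completed 'ultra' transaction. The sketch below is illustrative only. It assumes the plan is represented as product = 'ultra' in both tables, uses a hypothetical launch date, approximates the 1-month window as 30 days, and runs on SQLite with toy rows; the function name ultra_funnel is made up for this example.

```python
import sqlite3

# Hypothetical launch date; the 1-month window is approximated as 30 days.
WIN_START = "2024-01-01 00:00:00"
WIN_END = "2024-01-31 00:00:00"

FUNNEL_SQL = """
WITH exposed AS (      -- users who viewed or clicked Ultra in the window
    SELECT DISTINCT user_id FROM activity
    WHERE product = 'ultra'
      AND event_time >= :win_start AND event_time < :win_end
),
converted AS (         -- users with a completed Ultra transaction in the window
    SELECT DISTINCT user_id FROM transactions
    WHERE product = 'ultra' AND status = 'completed'
      AND transaction_time >= :win_start AND transaction_time < :win_end
)
SELECT COUNT(e.user_id), COUNT(c.user_id)
FROM exposed e LEFT JOIN converted c ON c.user_id = e.user_id
"""

def ultra_funnel(conn, win_start, win_end):
    """Return (exposed_users, converted_users) for the evaluation window."""
    params = {"win_start": win_start, "win_end": win_end}
    row = conn.execute(FUNNEL_SQL, params).fetchone()
    return row[0], row[1]

# Toy data (made up) so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (user_id INTEGER, event_time TEXT, product TEXT, event_type TEXT);
CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY, user_id INTEGER,
    transaction_time TEXT, product TEXT, amount_gbp NUMERIC, status TEXT, ip_country TEXT);
INSERT INTO activity VALUES
    (1, '2024-01-05 10:00:00', 'ultra', 'view'),
    (2, '2024-01-10 10:00:00', 'ultra', 'click'),
    (3, '2024-01-12 10:00:00', 'ultra', 'view'),
    (4, '2024-01-03 10:00:00', 'crypto', 'view');
INSERT INTO transactions VALUES
    (1, 1, '2024-01-06 09:00:00', 'ultra', 9.99, 'completed', 'GB'),
    (2, 2, '2024-01-11 09:00:00', 'ultra', 9.99, 'declined', 'GB'),
    (3, 4, '2024-01-20 09:00:00', 'ultra', 9.99, 'completed', 'GB');
""")

exposed, converted = ultra_funnel(conn, WIN_START, WIN_END)
conversion_rate = converted / exposed if exposed else 0.0
```

The funnel deliberately ignores purchases by users with no recorded Ultra exposure (user 4 in the toy data) and does not check that the purchase follows the exposure; both are simplifications a fuller answer would state and defend.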
Scenario B: You partner with the Financial Crime team to flag suspicious behavior.
Required output for Scenario B (choose one and state which): user_ids with a risk_score and the reasons/rules triggered, or transaction_ids with flags/reasons.
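As an illustration of the user-level option, the sketch below scores users with three rules that use only the listed columns: an ip_country that differs from the user's country, a burst of declined transactions on one day, and a large completed crypto transaction soon after account creation. The rule names, weights, and thresholds are all hypothetical placeholders, and the date functions (date, julianday) are SQLite-specific; another engine would need its own equivalents.

```python
import sqlite3

# Hypothetical rules: name -> (weight, SQL returning user_ids that trigger it).
RULES = {
    "ip_country_mismatch": (2, """
        SELECT DISTINCT t.user_id
        FROM transactions t JOIN users u ON u.user_id = t.user_id
        WHERE t.ip_country <> u.country
    """),
    "declined_burst": (3, """
        SELECT user_id
        FROM transactions
        WHERE status = 'declined'
        GROUP BY user_id, date(transaction_time)
        HAVING COUNT(*) >= 3
    """),
    "new_account_large_crypto": (4, """
        SELECT DISTINCT t.user_id
        FROM transactions t JOIN users u ON u.user_id = t.user_id
        WHERE t.product = 'crypto' AND t.status = 'completed'
          AND t.amount_gbp >= 5000
          AND julianday(t.transaction_time) - julianday(u.create_date) <= 7
    """),
}

def score_users(conn):
    """Return {user_id: {"risk_score": int, "reasons": [rule names]}}."""
    flagged = {}
    for name, (weight, sql) in RULES.items():
        # set() removes repeats, e.g. a user with bursts on several days.
        for (user_id,) in set(conn.execute(sql).fetchall()):
            entry = flagged.setdefault(user_id, {"risk_score": 0, "reasons": []})
            entry["risk_score"] += weight
            entry["reasons"].append(name)
    return flagged

# Toy data (made up) so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, create_date TEXT, country TEXT);
CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY, user_id INTEGER,
    transaction_time TEXT, product TEXT, amount_gbp NUMERIC, status TEXT, ip_country TEXT);
INSERT INTO users VALUES
    (1, '2024-01-01 00:00:00', 'GB'),
    (2, '2023-06-01 00:00:00', 'GB'),
    (3, '2023-01-01 00:00:00', 'FR');
INSERT INTO transactions VALUES
    (1, 1, '2024-01-03 10:00:00', 'crypto', 6000.00, 'completed', 'RU'),
    (2, 2, '2024-01-05 09:00:00', 'crypto', 10.00, 'declined', 'GB'),
    (3, 2, '2024-01-05 09:01:00', 'crypto', 10.00, 'declined', 'GB'),
    (4, 2, '2024-01-05 09:02:00', 'crypto', 10.00, 'declined', 'GB'),
    (5, 3, '2024-01-07 12:00:00', 'crypto', 50.00, 'completed', 'FR');
""")

risk = score_users(conn)
```

Keeping each rule as its own query makes the triggered reasons easy to report alongside the score, at the cost of scanning transactions once per rule. A mismatch rule like this one will also flag ordinary travellers, which is one reason the weights should stay low until the rule has been reviewed.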
Call out false-positive/false-negative tradeoffs and how you would evaluate and iterate.
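One way to make the tradeoff concrete is to compare precision and recall at different risk_score cutoffs. The sketch below assumes a set of confirmed-suspicious user_ids is available from manual review; that label set is NOT part of the given schema and is purely hypothetical here, as are the scores and the function name.

```python
def precision_recall(scores, confirmed, threshold):
    """Precision and recall when flagging users with score >= threshold.

    scores: {user_id: risk_score}; confirmed: set of user_ids judged
    suspicious by reviewers (a hypothetical label source).
    """
    flagged = {user_id for user_id, score in scores.items() if score >= threshold}
    true_positives = len(flagged & confirmed)
    # Return 0.0 when a denominator is empty; this is a convention chosen here.
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(confirmed) if confirmed else 0.0
    return precision, recall

# Toy inputs (made up): four scored users, two confirmed by review.
scores = {1: 6, 2: 3, 3: 2, 4: 0}
confirmed = {1, 3}

strict = precision_recall(scores, confirmed, threshold=4)  # fewer flags
loose = precision_recall(scores, confirmed, threshold=2)   # more flags
```

On the toy inputs the stricter cutoff flags only user 1 (no false positives, but user 3 is missed), while the looser cutoff catches both confirmed users and also flags user 2. Iterating would mean re-running this comparison as rules and weights change and as more reviewed cases accumulate.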