Define Ultra success and detect suspicious transactions
Company: Bytedance
Role: Data Analyst
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
Assume the following relevant columns are available, and all timestamps are stored in UTC:
- `users(user_id BIGINT PRIMARY KEY, create_date TIMESTAMP, country VARCHAR)`: account creation time and registration country.
- `transactions(transaction_id BIGINT PRIMARY KEY, user_id BIGINT, product VARCHAR, amount_gbp DECIMAL(12,2), status VARCHAR, transaction_time TIMESTAMP)`: one row per transaction.
- `activity(user_id BIGINT, product VARCHAR, event_type VARCHAR, event_time TIMESTAMP, ip_country VARCHAR)`: one row per product interaction, including the most recent observed IP country.
Answer both scenario questions below.
1. **Ultra subscription launch**
Revolut launches a new paid subscription plan called **Ultra**. There is no dedicated subscription table, so you must infer adoption from the existing tables, for example via `product = 'ultra'` in transactions and activity.
Define what a **successful launch** means in the first month and describe the SQL analyses you would build. Your answer should include:
- one primary success metric and several supporting metrics,
- denominator choices such as all users vs active users vs new users, and the trade-offs of each,
- segmentation by country and user tenure,
- how to estimate Ultra DAU/WAU/MAU, adoption, penetration, and early retention,
- how to avoid misleading conclusions caused by seasonality, country mix shifts, or self-selection.
2. **Rule-based suspicious transaction detection**
You are partnering with the Fincrime team to build a **rule-based SQL system** that flags suspicious users or transactions for manual review.
Using only the available tables, translate the vague concept of “suspicious” into executable rules. Consider behaviors such as:
- unusually large amounts,
- bursts of transactions in a short time window,
- repeated small probing attempts,
- sudden activity across many products,
- mismatch between registration country and recent IP country.
Define:
- the alert table you would output,
- example rules and thresholds,
- how to combine rules into a risk score or review queue,
- how to evaluate precision, recall, and false positives.
Quick Answer: This question evaluates a candidate's skills in product analytics, cohort and retention measurement, SQL-based metric design and segmentation, and rule-based fraud detection using transactional and activity data.