Design offline backtest and online experiment

Q: Design offline backtest and online experiment

This is an Analytics & Experimentation interview question from Gemini for Data Scientist roles. View the full question and solution on PracHub.

Question

You are given an ACH transaction-level dataset to identify and control fraud and will present a plan. Today is 2025-09-01.

Deliverables

Offline analysis plan and backtest for data from 2025-06-01 to 2025-08-31.
Online experiment plan for a policy/heuristic launch.

Requirements

Labeling: Define fraud labels using ach_returns with a 5-business-day label window. Prevent look-ahead bias; handle late returns and partial reversals.
Features/heuristics: Propose 3–5 interpretable rules (e.g., 24h ACH velocity, shared-device across users, amount thresholds). Specify exact definitions and thresholds.
Metrics: Primary = loss per 1,000 credits; Secondary = fraud prevalence, precision/recall, customer contact rate; Guardrails = ACH acceptance rate, support ticket rate, chargeback/return rate on non-ACH methods.
Offline backtest: Describe sampling, cross-validation or time-split, leakage checks, and how you’ll simulate holds/blocks without affecting behavior. Show how you’ll estimate counterfactuals and uncertainty.
Heterogeneity: Identify 3 cuts (e.g., tenure, country, device clusters) and how you’ll control false discovery across them.
Experiment design: Unit of randomization (e.g., user-level sticky), power analysis inputs, ramp schedule, interference/spillover handling, SRM checks, and pre-specified stop/rollback criteria.
Monitoring: Daily dashboards, anomaly detection, and how you’ll separate seasonal effects (e.g., end-of-month payroll) from treatment effects.
Presentation: Outline 3 slides you would present (Problem & Baseline, Proposed Controls & Risk, Expected Value & Ramp Plan) and the exact readouts you expect to show.
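
To make the labeling and primary-metric requirements concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a reference solution. The function names, the `(return_date, returned_amount)` tuple layout for `ach_returns` rows, and the `as_of` cutoff are all hypothetical. The sketch also assumes that any return inside the window counts as fraud (a real plan would likely filter on return reason codes), that business days skip weekends but not bank holidays, and that "loss per 1,000 credits" means loss per 1,000 matured credit transactions.

```python
from datetime import date, timedelta

LABEL_WINDOW_BDAYS = 5  # from the question: 5-business-day label window


def add_business_days(start: date, n: int) -> date:
    """Return the date n business days after start.

    Assumption: only weekends are skipped; bank holidays are ignored.
    """
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return current


def label_transaction(txn_date, amount, returns, as_of):
    """Label one ACH credit using only data visible on as_of.

    returns: list of (return_date, returned_amount) tuples (hypothetical layout).
    Returns None if the label window has not closed by as_of, so immature
    transactions never receive a label built from future information.
    """
    window_end = add_business_days(txn_date, LABEL_WINDOW_BDAYS)
    if window_end > as_of:
        return None  # window still open: excluded to prevent look-ahead bias

    # Only returns already observed on as_of may be used.
    visible = [(d, a) for d, a in returns if d <= as_of]
    in_window = sum(a for d, a in visible if d <= window_end)
    late = sum(a for d, a in visible if d > window_end)

    # Partial reversals: loss is the returned amount, capped at the original.
    loss = min(in_window, amount)
    return {
        "is_fraud": loss > 0,         # assumption: any in-window return is fraud
        "loss": loss,
        "late_return_amount": late,   # tracked separately, not part of the label
    }


def loss_per_1000_credits(labels):
    """Primary metric: total in-window loss per 1,000 matured credits."""
    matured = [lab for lab in labels if lab is not None]
    if not matured:
        return float("nan")
    return 1000 * sum(lab["loss"] for lab in matured) / len(matured)
```

Keeping late returns in a separate field, rather than folding them into the label, lets the backtest report how much loss the 5-business-day window misses without changing the label definition after the fact.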
