What difficulty level is this interview question?

This is a medium-difficulty Analytics & Experimentation question, commonly asked during onsite rounds at Gemini.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Gemini during technical interviews.

Design offline backtest and online experiment

Quick Overview

This question evaluates competencies in fraud detection, transaction labeling, feature engineering, offline backtesting, causal inference for counterfactual estimation, and online experiment design within the Analytics & Experimentation domain for data scientists.

You are given an ACH transaction-level dataset and asked to identify and control fraud, then present a plan. Today is 2025-09-01.

Deliverables

Offline analysis plan and backtest for data from 2025-06-01 to 2025-08-31.
Online experiment plan for a policy/heuristic launch.

Requirements

Labeling: Define fraud labels using ach_returns with a 5-business-day label window. Prevent look-ahead bias; handle late returns and partial reversals.
Features/heuristics: Propose 3–5 interpretable rules (e.g., 24h ACH velocity, shared-device across users, amount thresholds). Specify exact definitions and thresholds.
Metrics: Primary = loss per 1,000 credits; Secondary = fraud prevalence, precision/recall, customer contact rate; Guardrails = ACH acceptance rate, support ticket rate, chargeback/return rate on non-ACH methods.
Offline backtest: Describe sampling, cross-validation or time-split, leakage checks, and how you’ll simulate holds/blocks without affecting behavior. Show how you’ll estimate counterfactuals and uncertainty.
Heterogeneity: Identify 3 cuts (e.g., tenure, country, device clusters) and how you’ll control false discovery across them.
Experiment design: Unit of randomization (e.g., user-level sticky), power analysis inputs, ramp schedule, interference/spillover handling, SRM checks, and pre-specified stop/rollback criteria.
Monitoring: Daily dashboards, anomaly detection, and how you’ll separate seasonal effects (e.g., end-of-month payroll) from treatment effects.
Presentation: Outline 3 slides you would present (Problem & Baseline, Proposed Controls & Risk, Expected Value & Ramp Plan) and the exact readouts you expect to show.
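
For the Labeling requirement, one possible way to apply the 5-business-day window without look-ahead is sketched below. The function names, the holiday set, and the choice to treat any return inside the window as fraud are assumptions for illustration; the question does not define them, and a real plan would filter ach_returns by return reason and use the bank's settlement calendar. Credits whose window has not closed by the as-of date get no label rather than a "clean" label, which matters here because 2025-09-01 is a US holiday (Labor Day) and the last week of August is not yet mature.

```python
from datetime import date, timedelta

# Assumed holiday calendar (US federal holidays inside or next to the backtest window).
HOLIDAYS = {date(2025, 6, 19), date(2025, 7, 4), date(2025, 9, 1)}


def add_business_days(start: date, n: int, holidays=HOLIDAYS) -> date:
    """Return the date n business days after start, skipping weekends and holidays."""
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


def label_credit(credit_date: date, return_dates, as_of: date, window: int = 5):
    """Label one ACH credit using only information visible at as_of.

    Returns 1 if a return landed inside the window and is known by as_of,
    None if the window is still open (label not mature), otherwise 0.
    Returns that arrive after the window are ignored here and would need
    to be tracked separately as late returns.
    """
    deadline = add_business_days(credit_date, window)
    visible_until = min(deadline, as_of)
    if any(credit_date <= r <= visible_until for r in return_dates):
        return 1
    if as_of < deadline:
        return None
    return 0
```

Partial reversals are not handled in this sketch; one option would be to label on the returned amount as a share of the credit rather than on a binary flag.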
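
For the Heterogeneity requirement, one common way to control false discovery across segment cuts is the Benjamini-Hochberg step-up procedure. The question does not prescribe a method, so this is only a minimal sketch of that option; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four segment-level treatment effects.
segment_p_values = [0.001, 0.04, 0.03, 0.5]
flags = benjamini_hochberg(segment_p_values, alpha=0.05)
```

With three cuts and several levels per cut, the p-values from all segment comparisons would go into a single call so the correction covers the whole family of tests.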
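
For the SRM checks named under Experiment design, a sample ratio mismatch test for a two-arm experiment can be sketched as a chi-square goodness-of-fit test with one degree of freedom. The function name, the 50/50 default split, and the example counts are assumptions; the p-value uses the identity that a chi-square variable with one degree of freedom is a squared standard normal.

```python
import math


def srm_p_value(n_control: int, n_treatment: int, expected_treatment_share: float = 0.5) -> float:
    """Chi-square (1 degree of freedom) p-value for observed vs. expected arm sizes."""
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # For 1 degree of freedom, P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical user counts per arm under an intended 50/50 split.
p_balanced = srm_p_value(5000, 5010)
p_skewed = srm_p_value(50000, 51000)
```

A small p-value suggests that assignment or logging is broken and that the experiment readout should not be trusted until the cause is found. Teams often use a strict alert threshold such as 0.001 for this check, though the exact cutoff is a design choice to pre-specify.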
