Design offline backtest and online experiment

Last updated: Mar 29, 2026

Quick Overview

This question evaluates competencies in fraud detection, transaction labeling, feature engineering, offline backtesting, causal inference for counterfactual estimation, and online experiment design within the Analytics & Experimentation domain for data scientists.

Company: Gemini

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: Medium

Interview Round: Onsite

Oct 13, 2025, 9:49 PM

You are given an ACH transaction-level dataset to identify and control fraud, and you will present a plan. Today is 2025-09-01.

Deliverables

  • Offline analysis plan and backtest for data from 2025-06-01 to 2025-08-31.
  • Online experiment plan for a policy/heuristic launch.

Requirements

  1. Labeling: Define fraud labels using ach_returns with a 5-business-day label window. Prevent look-ahead bias; handle late returns and partial reversals.
  2. Features/heuristics: Propose 3–5 interpretable rules (e.g., 24h ACH velocity, shared-device across users, amount thresholds). Specify exact definitions and thresholds.
  3. Metrics: Primary = loss per 1,000 credits; Secondary = fraud prevalence, precision/recall, customer contact rate; Guardrails = ACH acceptance rate, support ticket rate, chargeback/return rate on non-ACH methods.
  4. Offline backtest: Describe sampling, cross-validation or time-split, leakage checks, and how you’ll simulate holds/blocks without affecting behavior. Show how you’ll estimate counterfactuals and uncertainty.
  5. Heterogeneity: Identify 3 cuts (e.g., tenure, country, device clusters) and how you’ll control false discovery across them.
  6. Experiment design: Unit of randomization (e.g., user-level sticky), power analysis inputs, ramp schedule, interference/spillover handling, SRM checks, and pre-specified stop/rollback criteria.
  7. Monitoring: Daily dashboards, anomaly detection, and how you’ll separate seasonal effects (e.g., end-of-month payroll) from treatment effects.
  8. Presentation: Outline 3 slides you would present (Problem & Baseline, Proposed Controls & Risk, Expected Value & Ramp Plan) and the exact readouts you expect to show.
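
For requirement 1, one possible labeling rule is sketched below. It is an illustration under stated assumptions, not a reference answer: the field names, the `as_of` cutoff argument, and the choice to treat every return as a fraud signal are hypothetical, and bank holidays are ignored when counting business days. A transaction is labeled only once its 5-business-day window has fully elapsed as of the analysis date, which is one way to avoid look-ahead bias. Partial reversals are summed and capped at the transaction amount, and returns that arrive after the window are tracked separately instead of changing the label.

```python
from datetime import date, timedelta

LABEL_WINDOW_BDAYS = 5  # label window from the prompt


def add_business_days(start: date, n: int) -> date:
    """Date n business days (Mon-Fri) after start. Assumption: holidays ignored."""
    current, remaining = start, n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


def label_transaction(txn_date, amount, returns, as_of):
    """Label one transaction using only information available at `as_of`.

    returns: list of (return_date, returned_amount) rows from ach_returns.
    Hypothetical simplification: any return counts as a fraud signal.
    """
    window_end = add_business_days(txn_date, LABEL_WINDOW_BDAYS)
    if as_of < window_end:
        # Window not yet closed: leave unlabeled instead of assuming "not fraud".
        return {"mature": False, "is_fraud": None, "loss": None, "late_loss": None}
    in_window = sum(a for d, a in returns if txn_date <= d <= window_end)
    late = sum(a for d, a in returns if window_end < d <= as_of)
    loss = min(in_window, amount)  # partial reversals add up, capped at amount
    return {"mature": True, "is_fraud": loss > 0, "loss": loss, "late_loss": late}
```

With today fixed at 2025-09-01, a transaction dated 2025-08-25 is mature under this rule, while one dated 2025-08-26 is not and would be excluded from the labeled backtest set.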
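
For requirements 3 and 4, the sketch below shows one way to replay a rule in shadow mode and put an interval around the estimated change in the primary metric. Several things here are assumptions: "loss per 1,000 credits" is read as total loss divided by the number of credit transactions, times 1,000; the `user_id` and `loss` fields are hypothetical; and a flagged transaction is assumed to have its loss fully avoided, which ignores fraudster adaptation and the cost of blocking good users. The bootstrap resamples users, not individual transactions, so that repeated activity from one user stays together.

```python
import random


def loss_per_1000(losses):
    """Total loss per 1,000 credit transactions (assumed metric definition)."""
    return 1000.0 * sum(losses) / len(losses) if losses else 0.0


def backtest_rule(txns, rule):
    """Shadow-mode replay: flagged transactions are scored as if blocked."""
    flagged = [bool(rule(t)) for t in txns]
    baseline = loss_per_1000([t["loss"] for t in txns])
    with_rule = loss_per_1000([0.0 if f else t["loss"] for t, f in zip(txns, flagged)])
    true_pos = sum(1 for t, f in zip(txns, flagged) if f and t["loss"] > 0)
    n_flagged = sum(flagged)
    n_fraud = sum(1 for t in txns if t["loss"] > 0)
    return {
        "baseline": baseline,
        "with_rule": with_rule,
        "precision": true_pos / n_flagged if n_flagged else 0.0,
        "recall": true_pos / n_fraud if n_fraud else 0.0,
    }


def bootstrap_reduction_ci(txns, rule, n_boot=1000, seed=0, level=0.95):
    """Percentile interval for (baseline - with_rule), resampling whole users."""
    by_user = {}
    for t in txns:
        by_user.setdefault(t["user_id"], []).append(t)
    users = sorted(by_user)
    rng = random.Random(seed)  # fixed seed so the readout is reproducible
    reductions = []
    for _ in range(n_boot):
        sample = [t for u in rng.choices(users, k=len(users)) for t in by_user[u]]
        result = backtest_rule(sample, rule)
        reductions.append(result["baseline"] - result["with_rule"])
    reductions.sort()
    lo = reductions[int((1 - level) / 2 * n_boot)]
    hi = reductions[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lo, hi
```

A rule is passed as a plain function, for example `lambda t: t["amount"] >= 400`, where the 400 threshold is purely illustrative and not a recommended value.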
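
For requirement 5, one common way to control false discovery across several cuts is the Benjamini-Hochberg step-up procedure. The prompt does not prescribe a method, so this is one option among several, and the function name and the default `alpha` are hypothetical choices. It takes one p-value per segment-level comparison and returns which ones can be reported as discoveries at the chosen false discovery rate.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k with p_(k) <= alpha * k / m,
    then reject every hypothesis whose p-value rank is <= k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

The p-values should come from all pre-specified segment comparisons together (for example every tenure, country, and device-cluster level), since correcting within each cut separately would understate the number of tests run.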
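
For requirement 6, the sketch below pairs a sticky user-level assignment with a sample ratio mismatch (SRM) check. The salt string, the 50/50 split, and the 0.001 alert threshold are assumptions made for illustration. Assignment hashes the user ID so that a user always lands in the same arm; the SRM check is a chi-square goodness-of-fit test with one degree of freedom, whose p-value can be computed from the complementary error function without extra libraries.

```python
import hashlib
import math


def assign_arm(user_id, salt="ach-rule-v1", treatment_share=0.5):
    """Deterministic (sticky) arm assignment from a salted hash of the user ID."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform-ish value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


def srm_check(n_control, n_treatment, treatment_share=0.5, threshold=0.001):
    """Chi-square test (1 df) of observed arm counts against the planned split.

    Returns (chi2, p_value, srm_flag). The threshold is an assumed convention.
    """
    total = n_control + n_treatment
    expected_t = total * treatment_share
    expected_c = total - expected_t
    chi2 = ((n_treatment - expected_t) ** 2 / expected_t
            + (n_control - expected_c) ** 2 / expected_c)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < threshold
```

Changing the salt reshuffles every user, so in this sketch it would stay fixed for the whole ramp. A flagged SRM is usually a reason to pause and debug assignment or logging before reading any metric.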
