PracHub

Plan and validate ranking experiment

Last updated: Mar 29, 2026

Quick Overview

This question tests experimental design and analytics skills for Data Scientist roles in the Analytics & Experimentation domain. It covers offline counterfactual replay, interleaving and A/B testing, sample-size and power computation, sequential testing and alpha spending, guardrail monitoring and ramp policies, proxy metrics and covariate adjustment, heterogeneous treatment effect analysis, and governance concerns such as p-hacking and Simpson’s paradox. Interviewers use it to probe how rigorously a candidate validates ranking changes while balancing statistical error, operational risk, and bias mitigation; it emphasizes practical application of statistical concepts and experiment governance over purely theoretical understanding.



Company: SoFi

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: Technical Screen



Oct 13, 2025, 9:49 PM

Evaluate a New Home-Page Ranking Algorithm: 3-Stage Plan

Context

You are introducing a new ranking algorithm for the home page. You must validate it safely and rigorously using a staged approach:

  1. Offline counterfactual replay using IPS/DR.
  2. Small-scale online interleaving (team-draft).
  3. Full A/B experiment.
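
For stage 1, the following is a minimal sketch of what IPS and doubly robust (DR) replay estimates might look like on logged impressions. It assumes the logging policy's propensities were recorded and that a separate reward model exists; the record fields (`logging_prob`, `target_prob`, `q_logged`, `q_target`), the `clip` parameter, and the toy data are hypothetical names and values chosen for illustration, not part of the question.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoggedImpression:
    # All field names are hypothetical, chosen for this illustration.
    reward: float        # observed outcome for the logged action (e.g. 0/1 conversion)
    logging_prob: float  # probability the logging policy chose the logged action
    target_prob: float   # probability the new ranker would choose the same action
    q_logged: float      # reward-model prediction for the logged action
    q_target: float      # reward-model prediction averaged over the new policy's actions


def importance_weight(rec: LoggedImpression, clip: Optional[float]) -> float:
    # Ratio of new-policy to logging-policy probability, optionally clipped
    # to trade a little bias for lower variance.
    w = rec.target_prob / rec.logging_prob
    return min(w, clip) if clip is not None else w


def ips_estimate(logs: List[LoggedImpression], clip: Optional[float] = None) -> float:
    # Inverse propensity scoring: reweight observed rewards by the importance weight.
    return sum(importance_weight(r, clip) * r.reward for r in logs) / len(logs)


def dr_estimate(logs: List[LoggedImpression], clip: Optional[float] = None) -> float:
    # Doubly robust: model-based baseline plus an importance-weighted residual correction.
    total = 0.0
    for r in logs:
        total += r.q_target + importance_weight(r, clip) * (r.reward - r.q_logged)
    return total / len(logs)


# Toy data, purely illustrative.
toy_logs = [
    LoggedImpression(reward=1.0, logging_prob=0.5, target_prob=0.8, q_logged=0.6, q_target=0.7),
    LoggedImpression(reward=0.0, logging_prob=0.5, target_prob=0.2, q_logged=0.4, q_target=0.5),
]
ips_value = ips_estimate(toy_logs)
dr_value = dr_estimate(toy_logs)
print(ips_value, dr_value)
```

The DR form stays consistent if either the propensities or the reward model is accurate, which is one reason it is often preferred over plain IPS when logged propensities are small and weights are noisy.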

Be concrete about experiment unit, bucketing, sample size, guardrails, sequential testing, novelty/carryover/seasonality mitigation, ramp policy, proxy metrics and covariate adjustment, heterogeneous treatment effects (HTE) with multiple-testing control, and governance against p-hacking/Simpson’s paradox.
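
For stage 2, team-draft interleaving can be sketched roughly as follows. This is an illustrative version under simple assumptions (two ranked lists of item IDs, a fixed output length, clicks as the credit signal); the function names and the toy item IDs are hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple


def team_draft_interleave(
    ranking_a: List[str],
    ranking_b: List[str],
    length: int,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build one interleaved list and record which team contributed each item."""
    rng = rng or random.Random()
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    interleaved: List[str] = []
    team_of: Dict[str, str] = {}

    while len(interleaved) < length:
        # The team with fewer picks goes next; ties are broken by a coin flip.
        if picks["A"] < picks["B"]:
            order = ["A", "B"]
        elif picks["B"] < picks["A"]:
            order = ["B", "A"]
        else:
            order = ["A", "B"] if rng.random() < 0.5 else ["B", "A"]

        placed = False
        for team in order:
            # Each team contributes its highest-ranked item not already shown.
            candidate = next((x for x in rankings[team] if x not in team_of), None)
            if candidate is not None:
                interleaved.append(candidate)
                team_of[candidate] = team
                picks[team] += 1
                placed = True
                break
        if not placed:
            break  # both rankings are exhausted
    return interleaved, team_of


def credit_clicks(team_of: Dict[str, str], clicked: List[str]) -> Dict[str, int]:
    """Attribute each click to the team that contributed the clicked item."""
    credit = {"A": 0, "B": 0}
    for item in clicked:
        if item in team_of:
            credit[team_of[item]] += 1
    return credit


# Toy example with hypothetical item IDs.
shown, teams = team_draft_interleave(["x", "y", "z"], ["y", "x", "w"], 4, random.Random(0))
credit = credit_clicks(teams, clicked=[shown[0]])
print(shown, teams, credit)
```

Aggregating per-session credit across many sessions and testing whether one ranker wins more often than the other gives a sensitive preference signal before committing traffic to a full A/B test.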

Tasks

  1. Define the exposure unit (impression-level vs. session-level) and bucketing to avoid contamination across sessions/devices.
  2. Primary metric: 30-day funded-account conversion per 1,000 impressions. Baseline = 1.20%, target relative uplift = +5%, power = 0.8, alpha = 0.05. Compute the per-arm sample size assuming independent impressions, then discuss inflation for repeated exposures and cluster-robust variance.
  3. List guardrails (p95 latency, app crash rate, CS tickets, decline rate) and how you will set sequential boundaries (e.g., alpha spending or SPRT) to allow early stopping without inflating Type I error.
  4. Explain how to mitigate novelty effects, carryover, and seasonality; specify the ramp policy and duration for capturing 30-day outcomes while using proxy metrics for early reads with CUPED or covariate adjustment.
  5. Describe heterogeneous treatment effect analysis (new vs. existing users, credit tiers) and how you will control false discovery with BH or Holm.
  6. Provide a plan to detect p-hacking/Simpson’s paradox and define ship criteria when primary and guardrails disagree.
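
The sample-size arithmetic in task 2 can be sketched as below. This assumes a two-sided two-proportion z-test with unpooled variance and reads the baseline as a per-impression conversion probability of 0.012 (1.20%, i.e. 12 per 1,000 impressions). The design-effect inputs (`avg_cluster_size=5`, `icc=0.05`) are hypothetical placeholders, not values given in the question.

```python
from math import ceil
from statistics import NormalDist


def per_arm_sample_size(p_base: float, rel_uplift: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Per-arm n for a two-sided two-proportion z-test, assuming independent impressions."""
    p_treat = p_base * (1 + rel_uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    var_sum = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_beta) ** 2 * var_sum / (p_treat - p_base) ** 2)


def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Kish design effect for clustered data: 1 + (m - 1) * rho."""
    return 1 + (avg_cluster_size - 1) * icc


# Values from the question: baseline 1.20%, +5% relative uplift, alpha 0.05, power 0.8.
n_independent = per_arm_sample_size(p_base=0.012, rel_uplift=0.05)

# Hypothetical clustering inputs: 5 impressions per user, intra-user correlation 0.05.
deff = design_effect(avg_cluster_size=5, icc=0.05)
n_inflated = ceil(n_independent * deff)

print(n_independent, deff, n_inflated)
```

Under these assumptions the independent-impression requirement comes out on the order of 530,000 impressions per arm. Repeated exposures to the same user inflate that figure by the design effect, which is why bucketing by user and estimating variance with cluster-robust (user-level) standard errors matters.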
