PracHub

Infer distribution and choose robust statistics

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's ability to infer an underlying distribution from summary statistics and to reason robustly about skewed revenue data: handling zero-inflation and tail behavior, choosing a transformation, constructing confidence intervals, selecting a hypothesis test, and running a sensitivity analysis.

Company: Google

Role: Data Scientist

Category: Statistics & Math

Difficulty: Medium

Interview Round: Technical Screen

A dataset of n=10,000 session revenues (USD) has: 65% zeros; mean=8.5; median=0; p90=30; p95=120; p99=620.

(a) Propose a plausible generative model (e.g., zero-inflated log-normal with a Pareto tail) and justify it from the summaries.

(b) Outline two quick validation checks (e.g., QQ-plot after log1p, Hill estimator for tail index) and what patterns you expect.

(c) Choose a transformation for modeling incremental lift and defend it.

(d) Construct an approximate 95% CI for the median using order-statistic/binomial bounds; provide the indices and numeric bounds you’d report.

(e) For an A/B test with equal n and an observed +3% lift in the mean but no change in the median, pick an appropriate hypothesis test and explain why a Welch t-test might mislead; quantify when a permutation or Wilcoxon test is preferable.

(f) Under your chosen model, estimate P(revenue > 50) and discuss sensitivity to tail assumptions.
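For part (d), the order-statistic interval can be sketched with the normal approximation to Binomial(n, 0.5). The rounding convention below (floor for the lower rank, ceiling of the upper rank plus one) is one common choice and an assumption here, as is the helper name `median_ci_ranks`; an exact binomial calculation may shift the ranks by one.

```python
import math
from statistics import NormalDist


def median_ci_ranks(n, conf=0.95):
    """Return 1-based ranks (lower, upper) of the order statistics that
    bracket the median, using the normal approximation to Binomial(n, 0.5)."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)      # about 1.96 for 95%
    half_width = z * math.sqrt(n) / 2             # z * sd of Binomial(n, 0.5)
    lower = max(1, math.floor(n / 2 - half_width))
    upper = min(n, math.ceil(n / 2 + half_width + 1))
    return lower, upper


n = 10_000
n_zeros = round(0.65 * n)          # 65% of sessions have zero revenue
lower, upper = median_ci_ranks(n)

# Sorted ascending, every rank at or below n_zeros holds the value 0.
lower_value = 0.0 if lower <= n_zeros else None   # None: not recoverable from summaries
upper_value = 0.0 if upper <= n_zeros else None

print(f"ranks: ({lower}, {upper}); bounds: [{lower_value}, {upper_value}] USD")
```

Because both ranks fall well inside the first 6,500 sorted observations, which are all zeros, the interval collapses to [0, 0] USD. That degenerate result is itself worth reporting: the median carries no information about revenue changes among paying sessions.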
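For part (f), a rough bracket on P(revenue > 50) follows directly from the quoted percentiles: 50 lies between p90=30 and p95=120, so the probability is between 5% and 10%. The sketch below refines that with two interpolation schemes. Both are assumptions rather than part of the question: a power-law (Pareto-type) survival curve between adjacent quantiles, and a straight line for comparison. The function name `pareto_interp_survival` is hypothetical.

```python
import math


def pareto_interp_survival(x, x_lo, s_lo, x_hi, s_hi):
    """Interpolate the survival function S(x) = P(X > x) between two known
    quantiles, assuming S is a power law (linear on log-log axes) between them.
    Returns (S(x), implied local tail index alpha)."""
    alpha = math.log(s_lo / s_hi) / math.log(x_hi / x_lo)
    return s_lo * (x / x_lo) ** (-alpha), alpha


# Known points from the summaries: S(30) = 0.10, S(120) = 0.05, S(620) = 0.01.
p_gt_50, alpha_90_95 = pareto_interp_survival(50, 30, 0.10, 120, 0.05)

# Straight-line interpolation of S between the same two quantiles.
p_linear = 0.10 - (50 - 30) / (120 - 30) * (0.10 - 0.05)

# Local tail index implied by the p95 -> p99 segment, for comparison.
_, alpha_95_99 = pareto_interp_survival(620, 120, 0.05, 620, 0.01)

print(f"power-law estimate: {p_gt_50:.4f} (alpha = {alpha_90_95:.2f})")
print(f"linear estimate:    {p_linear:.4f}")
print(f"alpha on p95-p99:   {alpha_95_99:.2f}")
```

Under these assumptions the two schemes give roughly 7.7% and 8.9%, which is a reasonable first statement of sensitivity. The implied local tail indices are about 0.5 and just under 1; a pure Pareto tail with index at or below 1 has no finite mean, so these slopes are better read as local descriptions of the quantile spacing than as a global tail model.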


