PracHub

Choose tests and solve distribution parameters

Last updated: Mar 29, 2026

Quick Overview

This question evaluates proficiency in statistical inference for skewed count data, covering test selection for central tendency, negative binomial parameter estimation and zero-probability calculation, construction of confidence intervals for mean differences, and interpretation of p-values versus effect sizes within the Statistics & Math domain for a Data Scientist role. It is commonly asked to assess both conceptual understanding (assumptions, diagnostics, statistical versus practical significance, and multiple-testing considerations) and practical application (parameter solving, delta-method/GLM rationale, and robust effect-size estimation) when analyzing real-world count data.

  • hard
  • Meta
  • Statistics & Math
  • Data Scientist


Interview Round: Onsite



Oct 13, 2025, 9:49 PM

Engagement Comparison: New vs Existing Users (2025-08-05 → 2025-09-01)

Context: You have per-user daily session counts (integer, skewed, many zeros) for two independent cohorts: new users and existing users. Your goal is to compare central tendency across cohorts and quantify the difference.

  1. Test selection for central tendency
  • Which method would you use to compare central tendency between cohorts and why: two-sample t-test, Welch's t-test, Mann–Whitney U, or a GLM-based approach?
  • State the key assumptions and the diagnostics you would run.
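A minimal sketch of two diagnostics that help with this choice, assuming the counts for one cohort are available as a plain Python list. The function name `count_diagnostics` and the toy sample are hypothetical, not real data. A dispersion index (variance divided by mean) well above 1 suggests overdispersion relative to Poisson, which favors a Negative Binomial GLM or a Welch's t-test on large samples. A zero share far above what a Poisson with the same mean implies hints that a zero-inflated model may be worth checking.

```python
import math
from statistics import mean, variance


def count_diagnostics(counts):
    """Summaries that hint at whether a Poisson, NB, or zero-inflated model fits."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    zero_share = sum(1 for c in counts if c == 0) / len(counts)
    return {
        "mean": m,
        "variance": v,
        "dispersion_index": v / m,           # about 1 under Poisson, > 1 if overdispersed
        "zero_share": zero_share,            # observed share of zeros
        "poisson_zero_share": math.exp(-m),  # share of zeros a Poisson with this mean implies
    }


# Hypothetical toy sample, used only to show the call
sample = [0, 0, 0, 1, 0, 2, 0, 5, 1, 0, 9, 0, 3, 0, 1]
diag = count_diagnostics(sample)
for name, value in diag.items():
    print(f"{name}: {value:.3f}")
```

These summaries do not replace a formal check such as comparing Poisson and NB fits or inspecting residuals. They are a quick first look before choosing among the listed tests.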
  2. Negative Binomial parameterization
  • For existing users, assume daily sessions per user X ~ NB(r, p) with E[X] = r(1−p)/p and Var[X] = r(1−p)/p².
  • Given μ = 2.40 and σ² = 6.96, solve for r and p, then compute P(X = 0).
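One way to work this part is by method of moments. Dividing the mean by the variance gives p = μ/σ², and substituting back gives r = μp/(1−p), which is the same as μ²/(σ² − μ). Under this parameterization X counts failures before the r-th success, so P(X = 0) = p^r. The sketch below only evaluates those formulas; the function names are hypothetical.

```python
import math


def nb_params_from_moments(mu, var):
    """Solve E[X] = r(1-p)/p and Var[X] = r(1-p)/p^2 for (r, p)."""
    if var <= mu:
        raise ValueError("Negative Binomial requires variance > mean")
    p = mu / var            # from Var[X] = E[X] / p
    r = mu * p / (1.0 - p)  # equivalently mu**2 / (var - mu)
    return r, p


def nb_prob_zero(r, p):
    """P(X = 0) = p^r when X counts failures before the r-th success."""
    return p ** r


r, p = nb_params_from_moments(2.40, 6.96)
p_zero = nb_prob_zero(r, p)
print(f"r = {r:.4f}, p = {p:.4f}, P(X=0) = {p_zero:.4f}")
# r = 24/19 (about 1.263), p = 10/29 (about 0.345), P(X=0) about 0.26
```

So roughly a quarter of existing-user days would have zero sessions under this model. A Poisson with the same mean would give exp(−2.40), about 0.09, which shows how overdispersion inflates the zero share.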
  3. CI for difference in means
  • For new users, μ = 1.85 and σ² = 4.20. With independent samples of size n_new = 5,000 and n_exist = 5,000, provide a 95% CI for the mean difference in sessions per user (new − existing) using a delta-method or GLM-based rationale. State approximations used.
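A sketch of the large-sample (Wald) interval, under assumptions that should be stated in an answer: the given means and variances are treated as plug-in estimates, each cohort contributes n independent observations (one value per user, or user-days treated as independent), and the central limit theorem makes the difference of sample means approximately normal. The function name `mean_diff_ci` is hypothetical.

```python
import math


def mean_diff_ci(mu_a, var_a, n_a, mu_b, var_b, n_b, z=1.959964):
    """Wald CI for mu_a - mu_b from summary statistics of two independent samples."""
    diff = mu_a - mu_b
    se = math.sqrt(var_a / n_a + var_b / n_b)  # variances add for independent samples
    return diff, se, (diff - z * se, diff + z * se)


# new minus existing, using the figures given in the question
diff, se, (lo, hi) = mean_diff_ci(1.85, 4.20, 5000, 2.40, 6.96, 5000)
print(f"diff = {diff:.3f}, SE = {se:.4f}, 95% CI = ({lo:.3f}, {hi:.3f})")
# diff = -0.55, SE about 0.0472, CI about (-0.643, -0.457)
```

If the 5,000 observations per cohort are user-days rather than users, repeated days from the same user are correlated and this standard error is too small. Cluster-robust standard errors or a per-user aggregate would be the safer choice. A GLM with a log link estimates the difference on the log scale instead, and the delta method converts that coefficient's standard error back to the mean scale.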
  4. Interpreting t-test results and robust effect sizes
  • A Welch's t-test yields p = 0.04 with Cohen's d = 0.08. Interpret practical vs statistical significance.
  • If you also segmented by 5 countries, discuss multiple-testing control.
  • Specify one robust effect-size metric for count data (e.g., ratio of means) and how to estimate its 95% CI.
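A sketch for the last two bullets, with loudly stated assumptions. The ratio of means is illustrated with the summary figures from part 3, because part 4 supplies only p = 0.04 and d = 0.08. Its CI is built on the log scale with a delta-method standard error. The multiple-testing step assumes a family of 5 country-level tests and uses a plain Bonferroni threshold. Both function names are hypothetical.

```python
import math


def ratio_of_means_ci(mu_a, var_a, n_a, mu_b, var_b, n_b, z=1.959964):
    """Delta-method CI for mu_a / mu_b, computed on the log scale."""
    log_ratio = math.log(mu_a / mu_b)
    # Var(log mean) is approximately var / (n * mean^2) for each independent sample
    se_log = math.sqrt(var_a / (n_a * mu_a ** 2) + var_b / (n_b * mu_b ** 2))
    lo = math.exp(log_ratio - z * se_log)
    hi = math.exp(log_ratio + z * se_log)
    return math.exp(log_ratio), (lo, hi)


def bonferroni_alpha(alpha, m):
    """Per-test significance threshold for a family of m tests."""
    return alpha / m


ratio, (lo, hi) = ratio_of_means_ci(1.85, 4.20, 5000, 2.40, 6.96, 5000)
print(f"ratio = {ratio:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
# ratio about 0.771, CI about (0.738, 0.805)

threshold = bonferroni_alpha(0.05, 5)  # assumes 5 country-level tests
survives = 0.04 < threshold
print(f"per-test threshold = {threshold:.3f}, p = 0.04 survives: {survives}")
```

With d = 0.08 the difference is statistically detectable at large n but small in practical terms, and p = 0.04 would not survive a Bonferroni threshold of 0.01. Holm's step-down procedure or Benjamini-Hochberg false-discovery-rate control are less conservative alternatives. A bootstrap over users is another way to get the ratio's CI when the delta-method normal approximation is in doubt.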

