PracHub

Compare first-score vs all-scores estimators

Last updated: Mar 29, 2026

Quick Overview

This question tests statistical estimation and inference: defining estimators formally, reasoning about weighting and sampling effects, identifying bias when response counts correlate with latent outcomes, and handling intracluster correlation through cluster-robust standard errors and design-effect calculations.


Company: Meta

Role: Data Scientist

Category: Statistics & Math

Difficulty: Medium

Interview Round: Technical Screen

Oct 13, 2025, 9:49 PM

You have two candidate estimators for survey quality based on the score column over 2025-08-26 to 2025-09-01:

  • E_first: For each user×survey pair, take the first score in-window; then average across pairs.
  • E_all: Average across all in-window scores (users with more responses contribute more weight).
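
To make the two definitions concrete, here is a minimal Python sketch. The row layout `(user_id, survey_id, timestamp, score)` and the sample data are assumptions for illustration only; the question itself names just the score column. The sketch also computes E_all as a response-count-weighted average of per-user means, which should agree with the pooled mean.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical in-window rows: (user_id, survey_id, ISO timestamp, score).
rows = [
    ("u1", "s1", "2025-08-26T09:00", 5),
    ("u1", "s1", "2025-08-27T09:00", 5),
    ("u1", "s1", "2025-08-28T09:00", 4),
    ("u2", "s1", "2025-08-26T10:00", 2),
]

def e_first(rows):
    """Average of the earliest in-window score per (user, survey) pair."""
    first = {}
    for user, survey, ts, score in rows:
        key = (user, survey)
        # ISO-8601 strings in one format sort chronologically.
        if key not in first or ts < first[key][0]:
            first[key] = (ts, score)
    return mean(score for _, score in first.values())

def e_all(rows):
    """Pooled average over every in-window score."""
    return mean(score for *_, score in rows)

def e_all_weighted(rows):
    """Same quantity, written as per-user means weighted by response count."""
    by_user = defaultdict(list)
    for user, _survey, _ts, score in rows:
        by_user[user].append(score)
    total = sum(len(s) for s in by_user.values())
    return sum(len(s) * mean(s) for s in by_user.values()) / total

print(e_first(rows), e_all(rows), e_all_weighted(rows))
```

In this toy data the heavier responder (u1) also scores higher, so the pooled mean sits above the per-pair first-score mean.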

Answer precisely:

  1. Define the target estimand and write E_first and E_all formally. Show that E_all is a weighted average of per-user means with weights proportional to each user’s in-window response count. Under what conditions are E_first and E_all equal in expectation?
  2. Discuss bias risks for E_all when the number of responses per user correlates with their latent satisfaction. Provide a concrete example where E_all over- or under-estimates the estimand.
  3. Derive large-sample variance estimators for both approaches. For E_all, propose user-cluster-robust standard errors; for E_first, standard IID SEs over user×survey pairs. Explain when cluster-robust SEs are required for E_first.
  4. Suppose repeated scores within a user×survey pair follow an exchangeable correlation with intracluster correlation ρ and average cluster size m. Show how the design effect DE = 1 + (m−1)ρ affects effective sample size and confidence intervals for E_all.
  5. Recommend which estimator to report by default and when to prefer the alternative. Justify using bias–variance tradeoffs and interpretability.
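
For parts 3 and 4, the following Python sketch shows one way the quantities could be computed. It is an illustration under stated assumptions, not a reference answer: the standard error treats E_all as a ratio of user-level totals and uses the usual linearized variance with a G/(G−1) small-sample factor, and the function names and the `scores_by_user` mapping are hypothetical.

```python
import math

def cluster_robust_se(scores_by_user):
    """Pooled mean and its user-clustered SE (linearized ratio estimator).

    scores_by_user: dict mapping a user id to that user's list of scores
    (hypothetical input shape). Requires at least two users.
    """
    G = len(scores_by_user)                               # number of clusters
    N = sum(len(s) for s in scores_by_user.values())      # total responses
    est = sum(sum(s) for s in scores_by_user.values()) / N
    # Per-user residual total: sum over the user's scores of (score - est).
    resid = [sum(s) - len(s) * est for s in scores_by_user.values()]
    var = (G / (G - 1)) * sum(r * r for r in resid) / N**2
    return est, math.sqrt(var)

def design_effect(m, rho):
    """DE = 1 + (m - 1) * rho for average cluster size m and ICC rho."""
    return 1.0 + (m - 1.0) * rho

def effective_n(n, m, rho):
    """Effective sample size: total responses divided by the design effect."""
    return n / design_effect(m, rho)

def ci_half_width(naive_se, m, rho, z=1.96):
    """Naive IID half-width inflated by sqrt(DE)."""
    return z * naive_se * math.sqrt(design_effect(m, rho))
```

When every user contributes exactly one score, `cluster_robust_se` reduces to the sample standard deviation divided by √N, the familiar IID standard error. With m = 5 and ρ = 0.5 the design effect is 3, so 1,000 responses carry roughly the information of 333 independent ones and confidence intervals widen by a factor of √3.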
