Design cluster-randomized test under network effects

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's experiment design and causal inference skills under interference, covering exposure and estimand definition, graph-based cluster construction, cluster-level randomization and contamination mitigation, metric selection, and design-effect/sample-size calculations.

  • Difficulty: hard
  • Company: Meta
  • Category: Analytics & Experimentation
  • Role: Data Scientist

Interview Round: Technical Screen

Design an A/B test for a new Group Call feature where network effects and interference are expected. Requirements: (a) Define exposure and the estimand (e.g., ITT at the cluster level vs per-user ATE under interference). (b) Construct clusters from the social graph (describe edge weighting for 'strong ties', the clustering rule—e.g., connected components after thresholding—and how to prevent cluster overlap). (c) Randomize at the cluster level and explain how to handle cross-cluster edges (frontier users), including holdout buffers or partial saturation. (d) Choose a primary metric and two guardrails; critically assess 'time spent per user per day' as a success metric (pros, cons, manipulation risks), and propose at least one alternative primary metric. (e) Compute the design effect and sample-size inflation for cluster randomization: if average cluster size m=20 and intracluster correlation ICC=0.05, what is the design effect and how does it change if m doubles but ICC halves? (f) If you ignore network effects and randomize by user, under what conditions will the naïve estimate be biased downward vs upward? Provide intuition for positive vs negative spillovers and give one real-world case for each direction.

Oct 13, 2025, 9:49 PM

A/B Test Design for a New Group Call Feature with Network Effects

You are designing an experiment for a Group Call feature where social network effects and interference are expected. Assume a social graph between users, and that the feature is delivered at the user level but can be constrained by design. Address the following:

(a) Define exposure and the estimand(s):

  • Clearly define user exposure under interference (e.g., own treatment plus share of treated neighbors).
  • Specify the primary estimand: intent-to-treat (ITT) at the cluster level vs a per-user average treatment effect (ATE) under interference. Be explicit about the population and exposure mapping.
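One way to make the exposure mapping concrete is to summarize each user's exposure as a pair: their own assignment plus the fraction of their neighbors who are treated. The sketch below assumes an unweighted neighbor map and a set of treated users; the function name `exposure` and the toy graph are hypothetical and only illustrate the definition.

```python
from typing import Dict, Set, Tuple


def exposure(user: str, treated: Set[str],
             neighbors: Dict[str, Set[str]]) -> Tuple[int, float]:
    """Return (own_treatment, fraction_of_treated_neighbors) for one user."""
    nbrs = neighbors.get(user, set())
    own = 1 if user in treated else 0
    # Users with no neighbors get a treated-neighbor share of 0.0 by convention.
    frac = len(nbrs & treated) / len(nbrs) if nbrs else 0.0
    return own, frac


# Toy graph (hypothetical): "a" is connected to "b" and "c"; "d" is isolated.
neighbors = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}
treated = {"a", "b"}
exposures = {u: exposure(u, treated, neighbors) for u in neighbors}
```

Under this mapping, user "c" is untreated but fully exposed through their only neighbor, which is exactly the case a per-user ATE has to account for and a cluster-level ITT sidesteps.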

(b) Construct clusters from the social graph:

  • Describe how to build edge weights for "strong ties" (which signals to include and how to combine them).
  • Propose a clustering rule (e.g., threshold strong ties then take connected components, or community detection) that yields disjoint clusters of reasonable size.
  • Explain how you will prevent cluster overlap and cap cluster size.
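The thresholding rule can be sketched with a union-find structure. This is a minimal illustration, assuming tie-strength weights are already computed (for example, from some blend of call and message frequency, which is an assumption here) and that the cap is enforced by refusing merges that would exceed it. The names `build_clusters`, `threshold`, and `max_size` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def build_clusters(users: Iterable[str],
                   weighted_edges: List[Tuple[str, str, float]],
                   threshold: float, max_size: int) -> Dict[str, str]:
    """Map each user to a cluster id, merging along strong ties only."""
    parent = {u: u for u in users}
    size = {u: 1 for u in parent}

    def find(u: str) -> str:
        # Walk to the root, shortening the path as we go.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # Strongest ties first, so the size cap cuts the weakest links.
    for u, v, w in sorted(weighted_edges, key=lambda e: -e[2]):
        if w < threshold:
            break  # remaining edges are all below the strong-tie threshold
        ru, rv = find(u), find(v)
        if ru != rv and size[ru] + size[rv] <= max_size:
            parent[rv] = ru
            size[ru] += size[rv]
    return {u: find(u) for u in parent}


users = ["a", "b", "c", "d", "e"]
edges = [("a", "b", 0.9), ("b", "c", 0.8), ("c", "d", 0.7), ("d", "e", 0.2)]
cluster_of = build_clusters(users, edges, threshold=0.5, max_size=3)
```

Because every user has exactly one root, the clusters are disjoint by construction. In the toy example the cap of 3 keeps "d" out of the a-b-c cluster even though the c-d tie is above the threshold, so that edge becomes a cross-cluster edge.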

(c) Randomize at the cluster level and handle cross-cluster edges:

  • Describe the randomization scheme (stratification, balance criteria).
  • Define "frontier" users (those with cross-cluster edges) and how you’ll mitigate contamination (e.g., holdout buffers, gating cross-cluster invitations, or partial saturation). Clarify analysis vs exclusion rules for frontier users.
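A possible shape for the randomization and frontier-flagging steps is sketched below. It assumes stratification on cluster size only, with a hypothetical cutoff of 10 users, and treats any user with at least one cross-cluster edge as a frontier user. The function names and the seed are illustrative choices, not part of the question.

```python
import random
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def assign_clusters(cluster_sizes: Dict[str, int], seed: int = 0) -> Dict[str, str]:
    """Randomize whole clusters to arms, balanced within size strata."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for cid, n in cluster_sizes.items():
        # Hypothetical two-level stratification on cluster size.
        strata["small" if n < 10 else "large"].append(cid)
    arm = {}
    for ids in strata.values():
        ids = sorted(ids)  # fixed order so the seed fully determines the result
        rng.shuffle(ids)
        half = len(ids) // 2  # with an odd count, control gets the extra cluster
        for i, cid in enumerate(ids):
            arm[cid] = "treatment" if i < half else "control"
    return arm


def frontier_users(cluster_of: Dict[str, str],
                   edges: Iterable[Tuple[str, str]]) -> Set[str]:
    """Users with at least one edge that crosses a cluster boundary."""
    return {u for a, b in edges if cluster_of[a] != cluster_of[b] for u in (a, b)}


arm = assign_clusters({"c1": 4, "c2": 6, "c3": 25, "c4": 30})
frontier = frontier_users({"a": "c1", "b": "c1", "c": "c2"},
                          [("a", "b"), ("b", "c")])
```

The frontier set can then feed whichever mitigation is chosen: excluding those users from the primary analysis, placing them in a holdout buffer, or reporting results with and without them as a sensitivity check.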

(d) Choose metrics:

  • Pick one primary metric and two guardrail metrics.
  • Critically assess "time spent per user per day" as a success metric (pros, cons, manipulation risks).
  • Propose at least one viable alternative primary metric with a rationale.

(e) Compute design effect (DE) and sample-size inflation for cluster randomization:

  • Given average cluster size m = 20 and intracluster correlation ICC = 0.05, compute DE.
  • Recompute DE if m doubles but ICC halves.
  • Explain implications for required sample size.
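The arithmetic can be checked with the standard design-effect formula for equal-sized clusters, DE = 1 + (m - 1) × ICC. The sketch below assumes that formula applies (roughly equal cluster sizes), and the baseline of 10,000 users per arm under individual randomization is a hypothetical number used only to show the inflation.

```python
def design_effect(m: float, icc: float) -> float:
    """Design effect for cluster randomization with equal cluster size m."""
    return 1 + (m - 1) * icc


def clustered_sample_size(n_individual: float, m: float, icc: float) -> float:
    """Users needed under cluster randomization to match the power of
    n_individual users under individual randomization."""
    return n_individual * design_effect(m, icc)


de_base = design_effect(20, 0.05)      # 1 + 19 * 0.05  = 1.95 (approximately, in floats)
de_alt = design_effect(40, 0.025)      # 1 + 39 * 0.025 = 1.975 (approximately, in floats)
n_base = clustered_sample_size(10_000, 20, 0.05)  # about 19,500 users per arm
```

Doubling m while halving ICC leaves the product m × ICC unchanged at 1.0, so DE = 1 + m × ICC - ICC rises slightly from 1.95 to 1.975. Either way the experiment needs roughly twice as many users as a user-randomized test with the same power.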

(f) Bias from ignoring network effects:

  • If you randomize by user and ignore interference, under what conditions is the naïve difference-in-means biased downward vs upward?
  • Provide intuition for positive vs negative spillovers and one real-world example for each direction.
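The direction of the bias can be illustrated with a deliberately simple linear model, which is an assumption made here for intuition only: each user's expected outcome is `tau` times their own treatment plus `gamma` times the fraction of their neighbors who are treated. Under user-level randomization with treatment probability `p`, treated and control users both expect a share `p` of treated neighbors, so the spillover term cancels out of the naive difference in means.

```python
from typing import Tuple


def naive_vs_total(tau: float, gamma: float, p: float = 0.5) -> Tuple[float, float]:
    """Compare the naive user-level estimate with the full-rollout effect.

    Assumed model: E[Y] = tau * own_treatment + gamma * treated_neighbor_share.
    """
    treated_mean = tau + gamma * p   # treated users also receive spillover
    control_mean = gamma * p         # control users receive the same spillover
    naive = treated_mean - control_mean
    total = tau + gamma              # everyone treated vs no one treated
    return naive, total


naive_pos, total_pos = naive_vs_total(tau=1.0, gamma=0.5)    # positive spillover
naive_neg, total_neg = naive_vs_total(tau=1.0, gamma=-0.5)   # negative spillover
```

With positive spillovers (`gamma > 0`) the naive estimate falls below the full-rollout effect, as when control users are pulled into calls by treated friends. With negative spillovers (`gamma < 0`) it overshoots, as when treated users win a shared, limited resource at the expense of control users.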
