PracHub

Design and power a frequency-cap experiment

Last updated: Mar 29, 2026

Quick Overview

This question evaluates experimental design and causal inference skills, including power analysis, clustering and interference mitigation, metric engineering, sequential monitoring, and diagnostic interpretation for large-scale ad experiments.


Company: Netflix

Role: Data Scientist

Category: Analytics & Experimentation

Difficulty: hard

Interview Round: Onsite

Oct 13, 2025, 9:49 PM

Experiment Design: Raising a 7‑Day Frequency Cap from 3→4 Impressions

Context

A large video ad campaign plans to raise the per‑user rolling 7‑day frequency cap from 3 to 4 impressions. The goal is to estimate the causal impact on conversions while accounting for clustering and potential interference (auctions, cross‑device households, overlapping advertisers, pacing).

  • Population: US users eligible for the campaign; ~4,000,000 eligible users per day during the test.
  • Randomization candidates: user_id, household_id, or geo cell.
    • Average household size among eligible users: m = 1.3.
    • Household-level ICC for the primary metric: 0.02.
  • Primary metric: 7‑day conversion rate per unique exposed user (any purchase within 7 days of first exposure), baseline p0 = 2.00%.
  • Guardrails: daily unique reach, average session watch time, complaint rate per 1,000 impressions.
  • Traffic allocation: 50% Treatment (cap = 4), 50% Control (cap = 3), planned duration 28 days, with 4 equally spaced looks (3 interim plus the final analysis).
  • CUPED: pre‑period 7‑day metric available with R^2 = 0.35, to reduce variance.
  • Interference risks: shared auctions across campaigns, overlapping advertisers, cross‑device households, pacing controls.
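
If household_id were chosen from the randomization candidates above, one common way to get a stable 50/50 split that keeps all devices in a household in the same arm is salted hashing. The following is a minimal sketch, not the campaign's actual assignment logic: the salt string, the arm labels, and the `assign_arm` helper are hypothetical names introduced only for illustration.

```python
import hashlib

# Hypothetical experiment salt; any fixed string unique to this test would do.
EXPERIMENT_SALT = "freq_cap_3_vs_4"


def assign_arm(household_id: str, salt: str = EXPERIMENT_SALT) -> str:
    """Deterministically map a household to an arm with a 50/50 split.

    Every user and device that resolves to the same household_id gets the
    same arm, so the cap is consistent within the household.
    """
    digest = hashlib.sha256(f"{salt}:{household_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100  # bucket in 0..99
    return "treatment_cap4" if bucket < 50 else "control_cap3"


if __name__ == "__main__":
    for hh in ["hh_001", "hh_002", "hh_003"]:
        print(hh, assign_arm(hh))
```

Hashing on a salted key keeps assignment reproducible from logs without storing a lookup table, and changing the salt re-randomizes for a later test. It does not by itself address auction-level or pacing interference, which still needs the holdout or ghost-bid design that Task 1 asks about.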

Tasks

  1. Choose the randomization unit and justify it with a causal diagram. Specify where interference could occur and how your choice mitigates it. Propose cross‑campaign holdouts or ghost‑bids if needed.
  2. Define precise metric formulas (numerators/denominators, exposure semantics, attribution window, de‑duplication across devices) and the data to log to compute them unambiguously.
  3. Compute the minimum per‑arm sample size (unique users) to detect an absolute lift from 2.00% to 2.10% (Δ = +0.10 pp) with α = 0.05 (two‑sided) and 1−β = 0.80 using a two‑proportion z‑test. Adjust for clustering via VIF = 1 + (m−1)·ICC, then adjust variance for CUPED by multiplying by (1−R^2). Show the final effective sample size and discuss whether 28 days of traffic suffices.
  4. Specify sequential monitoring using O’Brien–Fleming boundaries for 4 looks: give approximate nominal α at each look and describe the decision rules.
  5. List at least three diagnostic checks (e.g., covariate balance on pre‑period exposures, saturation by user quantile, auction pressure) and the exact plots you would produce. Explain how you would interpret each to decide whether to ship the higher cap.
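
The arithmetic in Tasks 3 and 4 can be sketched as follows. This is one reasonable reading rather than a reference solution. It assumes the unpooled-variance form of the two-proportion z-test sample-size formula, applies VIF and (1−R^2) as simple multipliers on the required n, and uses the commonly tabulated O'Brien–Fleming final critical value of about 2.024 for 4 equally spaced looks at two-sided α = 0.05. The function and variable names are illustrative.

```python
import math
from statistics import NormalDist


def per_arm_n(p0: float, p1: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Per-arm n for a two-sided two-proportion z-test (unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    var_sum = p0 * (1 - p0) + p1 * (1 - p1)
    return (z_alpha + z_beta) ** 2 * var_sum / (p1 - p0) ** 2


# Task 3: inputs taken from the question.
n_base = per_arm_n(0.020, 0.021)        # unadjusted users per arm
vif = 1 + (1.3 - 1) * 0.02              # clustering: 1 + (m - 1) * ICC
cuped_factor = 1 - 0.35                 # CUPED: variance scales by (1 - R^2)
n_final = math.ceil(n_base * vif * cuped_factor)

# Task 4: O'Brien-Fleming z-boundaries, z_k = c * sqrt(K / k).
OBF_FINAL_Z = 2.024  # assumed tabulated constant for K = 4, two-sided alpha = 0.05
K = 4
obf_z = [OBF_FINAL_Z * math.sqrt(K / k) for k in range(1, K + 1)]
# Two-sided nominal alpha at each look: 2 * (1 - Phi(z)) = erfc(z / sqrt(2)).
obf_nominal_alpha = [math.erfc(z / math.sqrt(2)) for z in obf_z]

if __name__ == "__main__":
    print(f"unadjusted n per arm: {math.ceil(n_base):,}")
    print(f"VIF: {vif:.3f}, CUPED factor: {cuped_factor:.2f}")
    print(f"adjusted n per arm: {n_final:,}")
    for k, (z, a) in enumerate(zip(obf_z, obf_nominal_alpha), start=1):
        print(f"look {k}: |z| >= {z:.3f}, nominal alpha ~ {a:.5f}")
```

Under these assumptions the unadjusted requirement is roughly 315,000 unique exposed users per arm, and roughly 206,000 after the VIF and CUPED adjustments. That is small next to about 2,000,000 eligible users per arm per day, but the question does not give the exposure rate, the same users recur across days, and each user's 7‑day attribution window has to close before the conversion counts, so the 28‑day verdict should rest on matured unique exposed users rather than raw daily eligibility. The nominal two‑sided α values come out near 0.00005, 0.004, 0.019, and 0.043 at looks 1 to 4, which is why early stopping under O'Brien–Fleming requires a very large effect.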

