Design metrics and experiment for Shopping launch
Company: Pinterest
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
You are launching a new Shopping module embedded in the Pins feed. Design an experiment and metric plan that:

(1) Chooses one primary success metric and 3–5 guardrail metrics. Define each precisely (numerator/denominator, unit of analysis, aggregation window) and include at least one intermediate/funnel metric (e.g., num_clicks_of_new_feature, stay_time_in_shopping). Discuss the pros and cons of using DAU and user time spent as primary metrics versus alternatives (e.g., Shopping CTR, Add-to-Cart Rate, GMV/DAU), and specify acceptable directions and magnitudes for the guardrails.

(2) Handles spillover/interference (repins/shares may expose control users) and learning/novelty effects. Propose and justify one concrete design: user-level randomization with exposure logging and adjacency tests; cluster/geo randomization; switchback (time-based); or a two-stage saturation design. Detail the randomization unit, eligibility/exposure rules, cooldown, novelty burn-in, and how you would detect and quantify spillover (e.g., graph distance, household/geo adjacency) and learning (e.g., time-on-feature slope).

(3) Specifies power and MDE: assumptions on baseline rates, variance, intra-cluster correlation if clustered, and horizon length, plus how you will handle seasonality and peaky traffic (weekly cycles). Include an A/A test and a CUPED (or covariate-adjustment) plan.

(4) Defines a decision framework for when other metrics drop. Suppose after a 21-day test you observe a +2.3% lift in Shopping CTR and +1.1% in GMV/user, but −0.6% in overall time spent and −0.2% in DAU. Describe your net weighted lift (or multi-objective) rubric, guardrail thresholds, sensitivity to long-term effects, and your recommendation to the PM. Include the additional diagnostics you would run before a rollout (e.g., user/creator segment heterogeneity, cannibalization of ad revenue, repeat usage vs. one-off novelty).
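A strong answer to part (3) backs the power/MDE discussion with a calculation. A minimal sketch of a two-sided two-proportion sample-size estimate, with a design-effect inflation for cluster randomization; the 2% baseline Shopping CTR, +5% relative MDE, cluster size, and ICC below are illustrative assumptions, not numbers given in the question:

```python
import math

def sample_size_per_arm(p0, rel_mde, z_alpha=1.959964, z_beta=0.841621):
    """Per-arm n for a two-sided two-proportion z-test.
    Default z-values correspond to alpha=0.05 (two-sided) and 80% power."""
    p1 = p0 * (1 + rel_mde)
    variance = p0 * (1 - p0) + p1 * (1 - p1)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p0) ** 2)

def clustered_sample_size(n, avg_cluster_size, icc):
    """Inflate n by the design effect 1 + (m - 1) * ICC when randomizing
    clusters (geo, household) instead of users."""
    return math.ceil(n * (1 + (avg_cluster_size - 1) * icc))

# Assumed baseline Shopping CTR of 2% and a +5% relative MDE (illustrative).
n_user = sample_size_per_arm(0.02, 0.05)
n_geo = clustered_sample_size(n_user, avg_cluster_size=200, icc=0.01)
print(n_user, n_geo)
```

Even a small ICC inflates the clustered sample size substantially, which is why candidates should justify the randomization unit before quoting a horizon length.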
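The CUPED plan in part (3) can be sketched as a covariate adjustment using a pre-experiment metric: theta is estimated as cov(x, y) / var(x), and the adjusted metric keeps the same mean with lower variance. The simulated data below is purely illustrative:

```python
import random

def cuped_adjust(y, x):
    """CUPED: adjust post-period metric y with pre-period covariate x.
    theta = cov(x, y) / var(x);  y_adj_i = y_i - theta * (x_i - mean(x)).
    The mean of y is preserved; variance shrinks with the x-y correlation."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - mx) ** 2 for xi in x) / n
    theta = cov / var_x
    return [yi - theta * (xi - mx) for xi, yi in zip(x, y)]

# Illustrative simulation: pre-period engagement predicts the post-period
# metric, so the adjustment removes most of the between-user variance.
random.seed(7)
pre = [random.gauss(10, 2) for _ in range(2000)]
post = [p + random.gauss(0, 1) for p in pre]
adjusted = cuped_adjust(post, pre)
```

In practice theta is estimated on pooled treatment-and-control data, and the same adjustment also sharpens the A/A test.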
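One way to make the part (4) rubric concrete is a weighted sum of observed relative lifts plus hard guardrail floors. The weights and floor values below are hypothetical placeholders a team would pre-register with the PM; only the observed lifts come from the question:

```python
# Hypothetical pre-registered weights and guardrail floors (assumptions,
# not given in the question); observed lifts are from the 21-day test.
weights = {"shopping_ctr": 0.3, "gmv_per_user": 0.4, "time_spent": 0.2, "dau": 0.1}
observed = {"shopping_ctr": 0.023, "gmv_per_user": 0.011,
            "time_spent": -0.006, "dau": -0.002}
guardrail_floors = {"time_spent": -0.010, "dau": -0.005}  # max tolerated drop

# Net weighted lift: positive means the trade-off is net favorable
# under the agreed weights.
net_lift = sum(weights[m] * observed[m] for m in weights)

# A guardrail breach vetoes rollout regardless of the weighted score.
breaches = [m for m, floor in guardrail_floors.items() if observed[m] < floor]

print(f"net weighted lift: {net_lift:+.4f}, breaches: {breaches}")
```

Under these placeholder weights the score is positive and no floor is breached, but a strong answer would still flag the long-term risk of the time-spent and DAU declines before recommending rollout.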
Quick Answer: This question evaluates a data scientist’s competency in experimental design and product analytics, including metric selection and precise definitions, guardrail and funnel diagnostics, spillover and novelty handling, power/MDE estimation, and multi-objective decision framing.