
This question evaluates a data scientist's proficiency in causal inference and attribution, high-frequency time-series and geo-experimental design, event-level instrumentation and de-duplication, and statistical uncertainty quantification.

What difficulty level is this interview question?

This is a hard difficulty Analytics & Experimentation question, commonly asked during Technical Screen rounds at Coinbase.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Coinbase during technical interviews.

Estimate Super Bowl QR ad sign-ups | Coinbase Interview Question

Incremental Sign-ups From a Super Bowl QR Ad (48h)

CoinFactory ran a 60-second Super Bowl TV spot on 2025-02-09 with a QR code to a signup page. Successful sign-ups receive a $15 coupon. Estimate incremental sign-ups attributable to the ad in the first 48 hours and quantify uncertainty.

Provide a concrete plan that includes:

Identification

Propose at least two distinct causal identification strategies (e.g., high-frequency time-series counterfactual with synthetic control, geo-lift/DiD with control DMAs, calibrated MMM over a short horizon).
State the identifying assumptions explicitly for each method.
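As one illustration of the geo-lift/DiD option, the sketch below computes a plain 2x2 difference-in-differences on hourly sign-ups. It assumes that a set of low-exposure control DMAs exists and that treated and control DMAs would have followed parallel trends absent the ad. The function name `did_estimate` and all numbers are hypothetical and chosen only to make the arithmetic visible.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Return the 2x2 difference-in-differences lift in mean sign-ups."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Made-up hourly sign-ups per DMA group, for illustration only.
treated_pre = [100, 110, 90]
treated_post = [180, 190, 170]
control_pre = [95, 105, 100]
control_post = [115, 125, 120]

lift_per_hour = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(lift_per_hour)
```

In a real analysis the pre-period would be the comparable prior Sundays, and the parallel-trends assumption would be checked on that history before trusting the lift.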

Data You Would Use

Minute-level traffic/sign-ups for the prior 8 comparable Sundays.
QR UTM-tagged sessions and device/IP de-duplication rules.
Coupon issuance/redemption logs (coupon_id, user_id, issued_at, redeemed_at).
TV air-times/GRPs by DMA.
Press mentions timestamps.
Bot-filtering heuristics.
App store ranking changes.
Site latency/error logs.

De-duplication and Leakage

Handle multi-device scans, dark social reshares of the QR URL, bots, and post-game press coverage spillover.
Explain how you’ll separate organic baseline from paid lift and how you’ll attribute delayed sign-ups within the 48-hour window.
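A minimal sketch of one possible de-duplication rule follows. It keeps the earliest non-bot scan per person-level key, using `user_id` when it is known and falling back to IP plus device otherwise. The `Scan` fields, the `is_bot` flag (assumed to come from an upstream bot filter), and the function name `dedup_scans` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass(frozen=True)
class Scan:
    ts: int                 # seconds since the ad aired
    ip: str
    device_id: str
    user_id: Optional[str]  # known only once the visitor logs in or signs up
    is_bot: bool            # output of an upstream bot filter (assumed)

def dedup_scans(scans: Iterable[Scan]) -> List[Scan]:
    """Keep the earliest non-bot scan per person-level key."""
    seen = set()
    kept = []
    for s in sorted(scans, key=lambda s: s.ts):
        if s.is_bot:
            continue
        # Prefer user_id so multi-device scans by one user collapse to one.
        key = ("user", s.user_id) if s.user_id else ("dev", s.ip, s.device_id)
        if key in seen:
            continue
        seen.add(key)
        kept.append(s)
    return kept

# Made-up events: one user on two devices, one bot, one repeat anonymous scan.
scans = [
    Scan(5, "1.1.1.1", "a", "u1", False),
    Scan(9, "2.2.2.2", "b", "u1", False),
    Scan(7, "3.3.3.3", "c", None, True),
    Scan(8, "4.4.4.4", "d", None, False),
    Scan(12, "4.4.4.4", "d", None, False),
]
unique = dedup_scans(scans)
print([s.ts for s in unique])
```

This rule still double-counts a person who scans anonymously on one device and logged in on another, so it is a starting point rather than a complete identity-resolution scheme.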

Back-of-the-Envelope (Compute)

Assume:

12,000,000 QR scans in logs.
40% remain after de-duplication.
Landing→signup conversion = 22%.
Baseline = 50,000 sign-ups/day (absent the ad).
Press coverage added an 8% lift to baseline for the first 24 hours.
A competitor ran a similar QR ad in 8 DMAs that represent 12% of our reach and cannibalized 25% of our QR traffic there.

Estimate incremental sign-ups and provide a 90% CI using a reasonable variance model; show each adjustment step (baseline subtraction, cannibalization, spillover).
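The arithmetic can be sketched as follows under one reading of the prompt. Three modeling choices here are assumptions, not given facts: cannibalization is applied as a haircut of 12% x 25% = 3% to de-duplicated scans (a different reading treats the logged scans as already net of it), the full 48-hour baseline is subtracted as a conservative overlap, and press spillover is treated as earned rather than paid lift. The relative standard errors in the variance model are illustrative placeholders.

```python
from math import sqrt
from statistics import NormalDist

# Inputs stated in the prompt.
scans = 12_000_000
dedup_rate = 0.40
conversion = 0.22
baseline_per_day = 50_000
press_lift = 0.08            # applies to the first 24h of baseline only
competitor_reach = 0.12
cannibalized_share = 0.25

# Step 1: de-duplicate, then (assumption) haircut scans for cannibalization.
unique_scans = scans * dedup_rate
cannibal_factor = 1 - competitor_reach * cannibalized_share  # about 0.97
qr_signups = unique_scans * cannibal_factor * conversion

# Step 2 (assumption): subtract the full 48h baseline as a conservative overlap.
baseline_48h = 2 * baseline_per_day

# Step 3 (assumption): press spillover is earned media, not paid lift.
press_signups = press_lift * baseline_per_day

incremental = qr_signups - baseline_48h - press_signups

# Variance model: independent relative errors, combined by the delta method.
rel_se_dedup, rel_se_conv, rel_se_base = 0.10, 0.10, 0.05  # illustrative
se_qr = qr_signups * sqrt(rel_se_dedup**2 + rel_se_conv**2)
se_base = baseline_48h * rel_se_base
se_total = sqrt(se_qr**2 + se_base**2)

z = NormalDist().inv_cdf(0.95)  # two-sided 90% interval
ci_low, ci_high = incremental - z * se_total, incremental + z * se_total
print(round(incremental), round(ci_low), round(ci_high))
```

Under these assumptions the point estimate is about 920,000 incremental sign-ups, with a 90% interval of roughly 0.68M to 1.16M. The interval width is driven almost entirely by the assumed uncertainty in the de-duplication and conversion rates, so those are the inputs worth pinning down first.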

Validation

Cross-check with coupon redemptions (assume 20% redeem within 7 days) and with geo heterogeneity.
Describe how to reconcile differences across methods and decide on the final estimate.
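The coupon cross-check can be sketched as a simple back-out: divide observed 7-day redemptions by the assumed 20% redemption rate and compare the implied sign-up count with the model estimate. Because every successful sign-up receives a coupon, this implies total sign-ups unless redemptions are first restricted to QR-attributed users. The function names and the example figures are hypothetical.

```python
def implied_signups(redemptions_7d: float, redeem_rate: float = 0.20) -> float:
    """Back out sign-ups from 7-day coupon redemptions at an assumed rate."""
    return redemptions_7d / redeem_rate

def relative_gap(estimate: float, implied: float) -> float:
    """Signed gap of the coupon-implied count relative to the estimate."""
    return (implied - estimate) / estimate

# Hypothetical example: 200,000 QR-attributed redemptions seen within 7 days,
# compared against a made-up model estimate of 1,000,000 sign-ups.
implied = implied_signups(200_000)
gap = relative_gap(1_000_000, implied)
print(round(implied), round(gap, 3))
```

A large gap in either direction would point to a wrong redemption-rate assumption, de-duplication errors, or coupon abuse, and is a cue to revisit the inputs before averaging across methods.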
