Diagnose sales correlations without claiming causality

Q: Diagnose sales correlations without claiming causality

This question evaluates a data scientist's competency in designing correlation-focused observational analyses, including exposure-window definition, confounding control, within-group differencing, bias identification, sensitivity checks, and communication of non-causal findings.

Q: What difficulty level is this interview question?

This is a hard Analytics & Experimentation question, commonly asked in Technical Screen rounds at Meta.

Q: What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Meta during technical interviews.

Question

Correlation-Focused Analysis: Outreach Channels vs. Deal Win Rate

You support a sales team and are asked to find which outreach channels correlate with higher deal win rate, without building predictive models. You have two datasets:

Deals: deal_id, account_id, rep_id, created_at, closed_at, is_won, amount_usd, product_line, region
Touches: account_id, rep_id, touch_date, channel (email/call/demo/webinar), is_primary_contact

Assume you have a frozen data snapshot date T0 (the last day touches and deals are observed). Design a decision-ready, correlation-focused analysis that avoids causal claims:

(a) Define a defensible exposure window (e.g., touches within the first 14 days after created_at) and justify how you’ll handle right-censoring for open deals and late touches.

(b) Specify stratifications and/or matching (e.g., region, segment, deal size buckets, rep tenure) to control confounding without modeling.

(c) Show exactly how you’d compute within-rep, within-segment correlations to avoid between-rep composition bias. Outline a de-meaning or fixed-effects-style differencing before correlating.

(d) List bias risks (reverse causality when hot deals drive more touches, missing-not-at-random touches on lost deals, seasonality) and propose sensitivity checks (pre-registration of windows, placebo windows before deal creation, leave-one-rep-out analysis, randomization inference) to assess robustness.

(e) Describe two plots that can reveal Simpson’s paradox across regions or segments and how you’d detect and communicate it.

(f) Write the exact decision guardrails you’ll present to sales leadership to prevent causal overreach and how you’d phrase them.
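
As a rough illustration of parts (a) and (c), here is a minimal Python sketch using only the standard library. It is one possible reading of the prompt, not a reference solution. The function names exposure_rows and within_group_corr are hypothetical, records are assumed to be dicts whose date fields are datetime.date values, touches are joined to deals on account_id only, and region stands in for "segment" because the Deals schema has no segment column. The censoring rule (drop open deals and deals whose window extends past T0) is an assumption that itself needs justifying, since it favors deals that close quickly.

```python
from collections import defaultdict
from datetime import date, timedelta
from math import sqrt

CHANNELS = ("email", "call", "demo", "webinar")


def exposure_rows(deals, touches, t0: date, window_days=14):
    """Return one row per eligible deal with a 0/1 exposure flag per channel."""
    touches_by_account = defaultdict(list)
    for t in touches:
        touches_by_account[t["account_id"]].append(t)

    rows = []
    for d in deals:
        window_end = d["created_at"] + timedelta(days=window_days)
        # Assumed censoring rule: skip deals whose window runs past T0 and
        # deals still open at T0 (their outcome is unobserved).
        if window_end > t0 or d["closed_at"] is None:
            continue
        # Late touches (after the window or after close) are ignored.
        cutoff = min(window_end, d["closed_at"])
        seen = {
            t["channel"]
            for t in touches_by_account.get(d["account_id"], [])
            if d["created_at"] <= t["touch_date"] <= cutoff
        }
        row = {
            "deal_id": d["deal_id"],
            "rep_id": d["rep_id"],
            "region": d["region"],
            "is_won": int(d["is_won"]),
        }
        for channel in CHANNELS:
            row[channel] = int(channel in seen)
        rows.append(row)
    return rows


def within_group_corr(rows, channel, keys=("rep_id", "region")):
    """Pearson correlation of exposure and is_won after subtracting group means.

    With keys=() every row falls in one group, which gives the pooled
    (between-rep-contaminated) correlation for comparison.
    """
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[k] for k in keys)].append(r)

    xs, ys = [], []
    for members in groups.values():
        mean_x = sum(m[channel] for m in members) / len(members)
        mean_y = sum(m["is_won"] for m in members) / len(members)
        for m in members:
            xs.append(m[channel] - mean_x)  # de-meaned exposure
            ys.append(m["is_won"] - mean_y)  # de-meaned outcome

    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no within-group variation, correlation undefined
    return sum(x * y for x, y in zip(xs, ys)) / sqrt(sxx * syy)
```

Comparing within_group_corr(rows, "demo") against within_group_corr(rows, "demo", keys=()) shows how much of the pooled association comes from differences between reps rather than from variation within a rep and region. Neither number supports a causal reading.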
