You computed correlations for sales outreach analysis. Answer the following with formulas and numerical results where possible. (a) For n = 3200 deals, Pearson r between call_count in the first 14 days post-creation and is_won is 0.23. Using Fisher's z-transform, compute the 95% CI for r and the two-sided p-value. Show intermediate z, SE, and back-transform steps. (b) You tested m = 24 correlations (different channels/time windows). The sorted p-values are: [0.0004, 0.0010, 0.0040, 0.0090, 0.0120, 0.0190, 0.0260, 0.0310, 0.0410, 0.0530, 0.0610, 0.0740, 0.0810, 0.0940, 0.1100, 0.1300, 0.1700, 0.2100, 0.2700, 0.3400, 0.4100, 0.5500, 0.6800, 0.7900]. Apply Benjamini–Hochberg at q = 0.10 and state which hypotheses you reject, showing the threshold comparison i*(q/m). (c) What is the minimal detectable correlation (two-sided, alpha = 0.05, power = 0.80) for n = 500 using Fisher's z power approximation? Provide the formula and numeric answer. (d) You observe overall corr(discount_rate, is_won) = -0.10, but within each region {East, West, Central} the correlations are {+0.05, +0.04, +0.03}. Explain, with equations, how region-mix imbalance can yield this Simpson’s paradox and how to diagnose it numerically (e.g., weighted covariance decomposition and partial correlation controlling for region).

This question evaluates statistical inference for correlations, multiple testing control (false discovery rate), power and sample-size calculations, and reasoning about confounding and Simpson’s paradox using covariance and partial-correlation concepts.

What difficulty level is this interview question?

This is a medium difficulty Statistics & Math question, commonly asked during Technical Screen rounds at Meta.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Meta during technical interviews.

Compute and correct correlation significance inflation

Sales Outreach Correlation Analysis: Inference, Multiple Testing, Power, and Simpson’s Paradox

Context

You are analyzing sales data to understand relationships between outreach actions and deal outcomes. Below, compute inferential statistics for a correlation, control the false discovery rate across multiple tests, estimate detectable effect size for a study design, and explain a Simpson’s paradox scenario using equations.

Tasks

(a) For n = 3200 deals, the Pearson correlation between call_count in the first 14 days and is_won is r = 0.23. Using Fisher's z-transform, compute the 95% confidence interval for r and the two-sided p-value. Show intermediate Fisher z, standard error (SE), z-interval, and back-transform steps.
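One way to work task (a) is sketched below, using only the Python standard library. The Fisher transform is z = atanh(r) with SE = 1/sqrt(n − 3); the interval is built on the z scale and mapped back with tanh. The function name `fisher_ci` and the returned dictionary keys are illustrative choices, not part of the question. The p-value uses `math.erfc` rather than 1 − CDF, because the test statistic here is large enough that 1 − CDF would round to zero.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, conf=0.95):
    """Fisher z confidence interval and two-sided p-value for a Pearson r."""
    z = math.atanh(r)                      # Fisher z = 0.5 * ln((1 + r) / (1 - r))
    se = 1.0 / math.sqrt(n - 3)            # standard error on the z scale
    z_crit = NormalDist().inv_cdf(0.5 + conf / 2)
    z_lo, z_hi = z - z_crit * se, z + z_crit * se
    stat = z / se                          # test statistic for H0: rho = 0
    # Two-sided normal p-value, written with erfc to avoid underflow to 0.
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return {
        "z": z,
        "se": se,
        "z_interval": (z_lo, z_hi),
        "r_interval": (math.tanh(z_lo), math.tanh(z_hi)),  # back-transform
        "stat": stat,
        "p_value": p_value,
    }


result = fisher_ci(0.23, 3200)
for key, value in result.items():
    print(key, value)
```

For r = 0.23 and n = 3200 this gives z ≈ 0.2342, SE ≈ 0.0177, a z-interval of roughly [0.1995, 0.2689], and a back-transformed 95% CI of roughly [0.197, 0.263]. The test statistic is about 13.2, so the two-sided p-value is effectively zero (far below 0.001).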

(b) You tested m = 24 correlations (different channels/time windows). Sorted p-values are: [0.0004, 0.0010, 0.0040, 0.0090, 0.0120, 0.0190, 0.0260, 0.0310, 0.0410, 0.0530, 0.0610, 0.0740, 0.0810, 0.0940, 0.1100, 0.1300, 0.1700, 0.2100, 0.2700, 0.3400, 0.4100, 0.5500, 0.6800, 0.7900]. Apply the Benjamini–Hochberg procedure at q = 0.10 and state which hypotheses you reject, showing the thresholds i × (q/m).
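A minimal sketch of the Benjamini–Hochberg step-up rule for task (b) follows. It finds the largest rank i with p(i) ≤ i × (q/m) and rejects every hypothesis up to that rank. The function name `benjamini_hochberg` is an illustrative choice; the p-values are the ones listed in the task.

```python
def benjamini_hochberg(p_values, q):
    """Return (k, rejected_indices) for the BH step-up procedure at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k = rank                      # keep the largest rank that passes
    return k, sorted(order[:k])


p_values = [0.0004, 0.0010, 0.0040, 0.0090, 0.0120, 0.0190, 0.0260, 0.0310,
            0.0410, 0.0530, 0.0610, 0.0740, 0.0810, 0.0940, 0.1100, 0.1300,
            0.1700, 0.2100, 0.2700, 0.3400, 0.4100, 0.5500, 0.6800, 0.7900]
q = 0.10

k, rejected = benjamini_hochberg(p_values, q)

# Show each comparison p(i) <= i * (q / m).
m = len(p_values)
for i, p in enumerate(sorted(p_values), start=1):
    threshold = i * q / m
    print(f"i={i:2d}  p={p:.4f}  threshold={threshold:.4f}  pass={p <= threshold}")
print("largest passing rank k =", k)
```

With m = 24 and q = 0.10 the thresholds step up by about 0.00417 per rank. The eighth p-value (0.0310) is below its threshold of 0.0333, while the ninth (0.0410) exceeds 0.0375 and no later rank passes, so k = 8 and the eight smallest p-values are rejected.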

(c) What is the minimal detectable correlation (two-sided, α = 0.05, power = 0.80) for n = 500 using the Fisher z power approximation? Provide the formula and numeric result.
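For task (c), the usual Fisher z power approximation sets atanh(r) = (z_{1−α/2} + z_{power}) / sqrt(n − 3) and solves for r with tanh. The sketch below assumes that approximation (it ignores the negligible probability of rejecting in the wrong tail); `min_detectable_r` is an illustrative name.

```python
import math
from statistics import NormalDist


def min_detectable_r(n, alpha=0.05, power=0.80):
    """Smallest |r| detectable with the given two-sided alpha and power."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_power = nd.inv_cdf(power)           # about 0.84 for power = 0.80
    z_needed = (z_alpha + z_power) / math.sqrt(n - 3)
    return math.tanh(z_needed)            # back-transform from z to r


mde = min_detectable_r(500)
print(round(mde, 4))
```

For n = 500 the required Fisher z is about 2.80 / sqrt(497) ≈ 0.1257, which back-transforms to a minimal detectable correlation of about 0.125.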

(d) You observe overall corr(discount_rate, is_won) = −0.10, but within each region {East, West, Central} the correlations are {+0.05, +0.04, +0.03}. Explain, with equations, how region-mix imbalance can yield this Simpson’s paradox and how to diagnose it numerically (e.g., weighted covariance decomposition and partial correlation controlling for region).
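For task (d), the pooled covariance splits into a within-region and a between-region part: Cov(X, Y) = Σ_g w_g Cov_g(X, Y) + Σ_g w_g (mean_g(X) − mean(X)) (mean_g(Y) − mean(Y)), with w_g = n_g / n. If regions that discount more also win less on average, the between term is negative and can outweigh small positive within-region covariances, which flips the sign of the pooled correlation. The sketch below checks this numerically on a tiny hypothetical dataset (the `deals` values are invented for illustration and do not reproduce the task's correlations). The partial correlation controlling for region is computed as the correlation of region-demeaned values, which matches residualizing both variables on region indicators.

```python
import math

# Hypothetical toy data: region -> list of (discount_rate, is_won) pairs.
deals = {
    "East": [(0.05, 0), (0.05, 1), (0.10, 1), (0.10, 1)],
    "Central": [(0.15, 0), (0.15, 0), (0.20, 1), (0.20, 1)],
    "West": [(0.25, 0), (0.25, 0), (0.30, 0), (0.30, 1)],
}


def mean(values):
    return sum(values) / len(values)


def pop_cov(xs, ys):
    """Population covariance (divides by n, so the decomposition is exact)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def corr(xs, ys):
    return pop_cov(xs, ys) / math.sqrt(pop_cov(xs, xs) * pop_cov(ys, ys))


all_x = [x for rows in deals.values() for x, _ in rows]
all_y = [y for rows in deals.values() for _, y in rows]
n = len(all_x)
grand_x, grand_y = mean(all_x), mean(all_y)

within = between = 0.0
resid_x, resid_y = [], []
for region, rows in deals.items():
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    w = len(rows) / n                       # region weight n_g / n
    mx, my = mean(xs), mean(ys)
    within += w * pop_cov(xs, ys)           # weighted within-region covariance
    between += w * (mx - grand_x) * (my - grand_y)  # covariance of region means
    resid_x += [x - mx for x in xs]         # region-demeaned discount_rate
    resid_y += [y - my for y in ys]         # region-demeaned is_won

total = pop_cov(all_x, all_y)
overall_r = corr(all_x, all_y)
partial_r = corr(resid_x, resid_y)          # partial correlation given region

print("within:", within, "between:", between, "total:", total)
print("overall r:", overall_r, "partial r given region:", partial_r)
```

On this toy data the within term is positive, the between term is negative and larger in magnitude, the pooled correlation is negative, and the partial correlation given region is positive. Seeing the same pattern in the real data (negative between term, positive partial correlation) would support region mix as the source of the sign flip.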
