Simulate Coin Flips to Determine Fairness via Empirical Distribution
Company: Google
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to perform simulation-based hypothesis testing by implementing Monte Carlo sampling and computing an empirical two-sided p-value from observed coin-flip counts, emphasizing statistical computation and numerical accuracy.
Constraints
- 0 <= len(simulated_counts) <= 10^6
- 0 <= simulated_counts[i] <= n
- 0 <= observed <= n
- 1 <= n
- Return 0.0 when simulated_counts is empty.
- Two-sided extremeness is measured by absolute deviation from the mean n/2; use >= so the observed value's own deviation counts as extreme.
Examples
Input: ([4, 5, 6, 5, 4, 6, 3, 7, 5, 5], 8, 10)
Expected Output: 0.0
Explanation: mean=5, observed deviation=|8-5|=3. No simulated count deviates by 3 or more (max deviation here is 2), so the p-value is 0/10 = 0.0.
Input: ([5, 5, 5, 5], 5, 10)
Expected Output: 1.0
Explanation: observed=5 equals the mean, so observed deviation=0. Every count's deviation (0) is >= 0, giving 4/4 = 1.0.
Input: ([0, 1, 9, 10, 5, 5, 6, 4, 7, 3], 10, 10)
Expected Output: 0.2
Explanation: mean=5, observed deviation=|10-5|=5. Only the counts 0 (dev 5) and 10 (dev 5) reach deviation 5, so 2/10 = 0.2.
Input: ([], 5, 10)
Expected Output: 0.0
Explanation: Empty empirical distribution: by definition the function returns 0.0.
Input: ([50], 50, 100)
Expected Output: 1.0
Explanation: Single trial, observed=mean=50 so observed deviation=0; the one count's deviation (0) is >= 0, giving 1/1 = 1.0.
Input: ([2, 8, 5, 5, 1, 9], 0, 10)
Expected Output: 0.0
Explanation: observed=0 is maximally extreme (deviation 5). No simulated count reaches deviation 5 (max is |1-5|=|9-5|=4), so 0/6 = 0.0 — the observed result was more extreme than anything simulated.
Input: ([3, 7, 4, 6, 5, 5, 2, 8], 7, 10)
Expected Output: 0.5
Explanation: mean=5, observed deviation=|7-5|=2. Counts with deviation>=2: 3,7 (dev 2), 2,8 (dev 3) = 4 of 8, giving 0.5.
Hints
- The expected number of heads for a fair coin is n/2. 'Extreme' means far from this mean in EITHER direction, so measure abs(count - n/2).
- The two-sided p-value is the proportion of simulated trials whose deviation from n/2 is greater than OR EQUAL to the observed deviation.
- Use >= (not >) at the boundary so that simulated trials exactly as extreme as the observed count are included.
- A single linear scan over simulated_counts is enough — no sorting or closed-form binomial formula is needed.