How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a Medium difficulty Coding & Algorithms question, commonly asked during Onsite rounds at Google.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Google during technical interviews.

Simulate Coin Flips to Determine Fairness via Empirical Distribution

Quick Overview

This question evaluates a candidate's ability to perform simulation-based hypothesis testing by implementing Monte Carlo sampling and computing an empirical two-sided p-value from observed coin-flip counts, emphasizing statistical computation and numerical accuracy.

Company: Google

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

##### Scenario You must test whether a coin is fair by simulation. ##### Question Write code that repeatedly simulates n coin flips, records the number of heads, and uses the empirical distribution to compute the two-sided p-value for an observed count of heads without relying on closed-form formulas. ##### Hints Focus on efficient Monte Carlo loops or vectorized sampling and correct p-value definition.

Quick Answer: This question evaluates a candidate's ability to perform simulation-based hypothesis testing by implementing Monte Carlo sampling and computing an empirical two-sided p-value from observed coin-flip counts, emphasizing statistical computation and numerical accuracy.

You are testing whether a coin is fair using simulation rather than any closed-form formula. You have already run a Monte Carlo experiment: you simulated many independent trials, where each trial flips a fair coin `n` times and records the number of heads. The list `simulated_counts` holds the head count from each of those simulated trials — this is your empirical (null) distribution. Given this empirical distribution, an `observed` head count from a real experiment of the same `n` flips, and `n` itself, compute the **two-sided p-value**: the fraction of simulated trials that are *at least as extreme* as the observed count. Extremeness is measured by the absolute deviation from the expected mean `n/2` (since a fair coin is expected to land heads `n/2` times). A simulated count `c` is at least as extreme as the observed count when `abs(c - n/2) >= abs(observed - n/2)`. Return the p-value as a float in `[0.0, 1.0]`. If `simulated_counts` is empty, return `0.0`. Note: the actual flip simulation that produces `simulated_counts` is stochastic; this function computes the deterministic p-value from a given empirical distribution, which is the part graded for correctness.

Constraints

0 <= len(simulated_counts) <= 10^6
0 <= simulated_counts[i] <= n
0 <= observed <= n
1 <= n
Return 0.0 when simulated_counts is empty.
Two-sided extremeness is measured by absolute deviation from the mean n/2; use >= so the observed value's own deviation counts as extreme.

Examples

Input: ([4, 5, 6, 5, 4, 6, 3, 7, 5, 5], 8, 10)

Expected Output: 0.0

Explanation: mean=5, observed deviation=|8-5|=3. No simulated count deviates by 3 or more (max deviation here is 2), so the p-value is 0/10 = 0.0.

Input: ([5, 5, 5, 5], 5, 10)

Expected Output: 1.0

Explanation: observed=5 equals the mean, so observed deviation=0. Every count's deviation (0) is >= 0, giving 4/4 = 1.0.

Input: ([0, 1, 9, 10, 5, 5, 6, 4, 7, 3], 10, 10)

Expected Output: 0.2

Explanation: mean=5, observed deviation=|10-5|=5. Only the counts 0 (dev 5) and 10 (dev 5) reach deviation 5, so 2/10 = 0.2.

Input: ([], 5, 10)

Expected Output: 0.0

Explanation: Empty empirical distribution: by definition the function returns 0.0.

Input: ([50], 50, 100)

Expected Output: 1.0

Explanation: Single trial, observed=mean=50 so observed deviation=0; the one count's deviation (0) is >= 0, giving 1/1 = 1.0.

Input: ([2, 8, 5, 5, 1, 9], 0, 10)

Expected Output: 0.0

Explanation: observed=0 is maximally extreme (deviation 5). No simulated count reaches deviation 5 (max is |1-5|=|9-5|=4), so 0/6 = 0.0 — the observed result was more extreme than anything simulated.

Input: ([3, 7, 4, 6, 5, 5, 2, 8], 7, 10)

Expected Output: 0.5

Explanation: mean=5, observed deviation=|7-5|=2. Counts with deviation>=2: 3,7 (dev 2), 2,8 (dev 3) = 4 of 8, giving 0.5.

Hints

The expected number of heads for a fair coin is n/2. 'Extreme' means far from this mean in EITHER direction, so measure abs(count - n/2).
The two-sided p-value is the proportion of simulated trials whose deviation from n/2 is greater than OR EQUAL to the observed deviation.
Use >= (not >) at the boundary so that simulated trials exactly as extreme as the observed count are included.
A single linear scan over simulated_counts is enough — no sorting or closed-form binomial formula is needed.

Quick Overview