Probability, Weighted Sampling, And Random Walks
Asked of: Machine Learning Engineer
Last updated
What's being tested
These problems test discrete probability reasoning (hitting/absorbing probabilities in 1D random walks) and practical weighted sampling implementations for both batch and streaming settings. Interviewers probe whether you can convert probabilistic recurrences to computable algorithms and choose numerically stable, performance-appropriate samplers for production ML pipelines.
Patterns & templates
-
Gambler's ruin / linear recurrence — write recurrence for hitting probability, solve closed-form when step distribution simple; reduce to tridiagonal linear system for general neighbors.
-
Martingale / optional stopping — use expected value invariants to derive absorption probabilities when applicable.
-
Prefix-sum CDF + binary search — build prefix sums
O(n)and sample withO(log n)per draw viabisect; watch inclusive/exclusive indices. -
Alias method — Alias method gives
O(1)draws afterO(n)preprocessing; memoryO(n)and numerically sensitive to tiny weights. -
Weighted reservoir (A-Chao) — streaming weighted sampling maintenance in one pass,
O(1)per item amortized, handles unbounded streams. -
Numerical stability — normalize using sums in double or operate in log-space for extreme weights; avoid subtractive cancellation.
-
Determinism — seed local RNGs (
random.Random(seed)) for reproducible generators; avoid globalrandom.seed()in multi-threaded services.
Common pitfalls
Pitfall: assuming finite state or symmetric steps without asking — leads to wrong closed-form for asymmetric dice or unbounded walks.
Pitfall: building CDF with floats and then using
random.random()without normalization — can sample out-of-range for tiny/huge totals.
Pitfall: returning first non-zero weight when all weights zero or negative — always validate and define behavior for zero/negative weights.
Practice these
the practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
Related concepts
- Randomized Sampling AlgorithmsCoding & Algorithms
- Statistical Inference, Regression, And ProbabilityStatistics & Math
- Probability, Bayes, And Base Rates
- Queueing Theory, Probability, And DistributionsStatistics & Math
- Central Limit Theorem, Sampling, And Heavy-Tailed Metrics
- Statistical Inference, Power, And Confidence IntervalsStatistics & Math