PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question tests a candidate's ability to implement weighted random sampling using prefix sums and binary search, evaluating practical algorithm design for probability distributions. It assesses proficiency in translating mathematical probability constraints into efficient data structures, a skill commonly probed in machine learning engineering roles where non-uniform sampling is fundamental to model training and inference pipelines.

  • medium
  • Tubitv
  • Coding & Algorithms
  • Machine Learning Engineer

Weighted Random Sampling by Index

Company: Tubitv

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

# Weighted Random Sampling by Index You are given an array of positive integer weights `weights`, where `weights[i]` is the weight assigned to index `i`. Implement a sampler that, on each call, returns a random index `i` with probability proportional to `weights[i]`. Concretely, build a class `WeightedSampler`: - The constructor `WeightedSampler(weights)` receives the array of weights and may do `O(n)` preprocessing. - The method `pick_index()` returns a single random integer in the range `[0, n - 1]`. The probability of returning index `i` must equal `weights[i] / sum(weights)`. For example, with `weights = [1, 3]`, `pick_index()` should return `0` about 25% of the time and `1` about 75% of the time. With `weights = [1, 2, 3, 4]`, index `3` should be returned about 40% of the time. Aim for each `pick_index()` call to run in `O(log n)` time after `O(n)` construction. ### Constraints & Assumptions - `1 <= n <= 10^5`, where `n = len(weights)`. - `1 <= weights[i] <= 10^5`. - The total sum of weights fits in a 64-bit integer. - `pick_index()` may be called up to `10^4` times. - You have access to a uniform random source over `[0, 1)` (or uniform integers over a range). ### Output / Expected Behavior - The constructor performs preprocessing only and returns nothing. - Each `pick_index()` call returns one integer index. Correctness is judged statistically: over many calls, the empirical frequency of each index must converge to `weights[i] / sum(weights)`.

Quick Answer: This question tests a candidate's ability to implement weighted random sampling using prefix sums and binary search, evaluating practical algorithm design for probability distributions. It assesses proficiency in translating mathematical probability constraints into efficient data structures, a skill commonly probed in machine learning engineering roles where non-uniform sampling is fundamental to model training and inference pipelines.

Weighted random sampling (LeetCode 528) returns an index `i` with probability `weights[i] / sum(weights)`. The standard `O(log n)`-per-pick approach is: precompute the prefix (cumulative) sums of `weights`, draw a uniform random `target` in `[0, sum(weights))`, then binary-search for the index of the bucket that `target` falls into. The randomness itself can't be unit-tested, so this challenge isolates the deterministic core that the lookup depends on — the part where a binary search bug silently breaks the distribution. Implement: ``` solution(weights, target) ``` Given an array of positive integer `weights` and an integer `target` with `0 <= target < sum(weights)`, return the index `i` of the bucket containing `target`. Formally, with prefix sums `P[i] = weights[0] + ... + weights[i]`, return the smallest `i` such that `P[i] > target` (equivalently, the unique `i` with `P[i-1] <= target < P[i]`, treating `P[-1] = 0`). This is the value a correct `pick_index()` would return for that drawn `target`. Aim for `O(n)` preprocessing of the prefix array and `O(log n)` per lookup. Example: `weights = [1, 2, 3, 4]` gives prefix sums `[1, 3, 6, 10]`. `target = 6` lands in the last bucket `[6, 10)`, so the answer is `3`. `target = 3` lands in the bucket `[3, 6)` (index 2), so the answer is `2`.

Constraints

  • 1 <= n = len(weights) <= 10^5
  • 1 <= weights[i] <= 10^5
  • 0 <= target < sum(weights)
  • The total sum of weights fits in a 64-bit integer.

Examples

Input: ([1, 3], 0)

Expected Output: 0

Explanation: Prefix sums [1, 4]. target=0 < 1, so it lands in bucket 0 (range [0,1)).

Input: ([1, 3], 1)

Expected Output: 1

Explanation: Prefix sums [1, 4]. target=1 is the start of bucket 1 (range [1,4)), so index 1.

Input: ([1, 3], 3)

Expected Output: 1

Explanation: target=3 is still within bucket 1's range [1,4), so index 1. (sum is 4, valid targets are 0..3.)

Input: ([1, 2, 3, 4], 0)

Expected Output: 0

Explanation: Prefix sums [1,3,6,10]. target=0 is in bucket 0 (range [0,1)).

Input: ([1, 2, 3, 4], 2)

Expected Output: 1

Explanation: target=2 falls in bucket 1's range [1,3), so index 1.

Input: ([1, 2, 3, 4], 3)

Expected Output: 2

Explanation: target=3 is the start of bucket 2's range [3,6) — the boundary case the <= comparison must handle, so index 2.

Input: ([1, 2, 3, 4], 6)

Expected Output: 3

Explanation: target=6 is the start of bucket 3's range [6,10), so index 3 (which should be picked ~40% of the time).

Input: ([1, 2, 3, 4], 9)

Expected Output: 3

Explanation: target=9 is the largest valid target (sum=10), still in bucket 3's range [6,10), so index 3.

Input: ([5], 0)

Expected Output: 0

Explanation: Single weight: every target maps to index 0.

Input: ([5], 4)

Expected Output: 0

Explanation: Single weight, target=4 < 5: still index 0. Exercises the lo==hi base case of the binary search.

Input: ([2, 2, 2], 0)

Expected Output: 0

Explanation: Equal weights, prefix [2,4,6]. target=0 in bucket 0 ([0,2)).

Input: ([2, 2, 2], 2)

Expected Output: 1

Explanation: target=2 is the boundary into bucket 1 ([2,4)), so index 1.

Input: ([2, 2, 2], 5)

Expected Output: 2

Explanation: target=5 in bucket 2 ([4,6)), so index 2.

Hints

  1. Precompute prefix sums P where P[i] = weights[0] + ... + weights[i]. The index i 'owns' the half-open range [P[i-1], P[i]) of target values (with P[-1] = 0).
  2. You want the smallest index i with P[i] > target. This is a lower_bound / first-true binary search, not an exact-match search — the classic off-by-one trap is using `<` vs `<=` in the comparison.
  3. Use the invariant `while lo < hi: mid = (lo+hi)//2; if P[mid] <= target: lo = mid+1 else: hi = mid`, then return lo. This converges lo to the first bucket strictly above target without needing a separate found/not-found case.
Last updated: Jun 24, 2026

Related Coding Questions

  • Implement K-Means and compare with GMM - Tubitv (medium)

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.