How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Tubitv.

What role is this question designed for?

This question is commonly asked for Machine Learning Engineer candidates at Tubitv during technical interviews.

Weighted Random Sampling by Index | Tubitv Coding Question

Quick Overview

This question tests a candidate's ability to implement weighted random sampling using prefix sums and binary search, evaluating practical algorithm design for probability distributions. It assesses proficiency in translating mathematical probability constraints into efficient data structures, a skill commonly probed in machine learning engineering roles where non-uniform sampling is fundamental to model training and inference pipelines.

Weighted Random Sampling by Index

Company: Tubitv

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

# Weighted Random Sampling by Index You are given an array of positive integer weights `weights`, where `weights[i]` is the weight assigned to index `i`. Implement a sampler that, on each call, returns a random index `i` with probability proportional to `weights[i]`. Concretely, build a class `WeightedSampler`: - The constructor `WeightedSampler(weights)` receives the array of weights and may do `O(n)` preprocessing. - The method `pick_index()` returns a single random integer in the range `[0, n - 1]`. The probability of returning index `i` must equal `weights[i] / sum(weights)`. For example, with `weights = [1, 3]`, `pick_index()` should return `0` about 25% of the time and `1` about 75% of the time. With `weights = [1, 2, 3, 4]`, index `3` should be returned about 40% of the time. Aim for each `pick_index()` call to run in `O(log n)` time after `O(n)` construction. ### Constraints & Assumptions - `1 <= n <= 10^5`, where `n = len(weights)`. - `1 <= weights[i] <= 10^5`. - The total sum of weights fits in a 64-bit integer. - `pick_index()` may be called up to `10^4` times. - You have access to a uniform random source over `[0, 1)` (or uniform integers over a range). ### Output / Expected Behavior - The constructor performs preprocessing only and returns nothing. - Each `pick_index()` call returns one integer index. Correctness is judged statistically: over many calls, the empirical frequency of each index must converge to `weights[i] / sum(weights)`.

Quick Answer: This question tests a candidate's ability to implement weighted random sampling using prefix sums and binary search, evaluating practical algorithm design for probability distributions. It assesses proficiency in translating mathematical probability constraints into efficient data structures, a skill commonly probed in machine learning engineering roles where non-uniform sampling is fundamental to model training and inference pipelines.

Weighted random sampling (LeetCode 528) returns an index `i` with probability `weights[i] / sum(weights)`. The standard `O(log n)`-per-pick approach is: precompute the prefix (cumulative) sums of `weights`, draw a uniform random `target` in `[0, sum(weights))`, then binary-search for the index of the bucket that `target` falls into. The randomness itself can't be unit-tested, so this challenge isolates the deterministic core that the lookup depends on — the part where a binary search bug silently breaks the distribution. Implement: ``` solution(weights, target) ``` Given an array of positive integer `weights` and an integer `target` with `0 <= target < sum(weights)`, return the index `i` of the bucket containing `target`. Formally, with prefix sums `P[i] = weights[0] + ... + weights[i]`, return the smallest `i` such that `P[i] > target` (equivalently, the unique `i` with `P[i-1] <= target < P[i]`, treating `P[-1] = 0`). This is the value a correct `pick_index()` would return for that drawn `target`. Aim for `O(n)` preprocessing of the prefix array and `O(log n)` per lookup. Example: `weights = [1, 2, 3, 4]` gives prefix sums `[1, 3, 6, 10]`. `target = 6` lands in the last bucket `[6, 10)`, so the answer is `3`. `target = 3` lands in the bucket `[3, 6)` (index 2), so the answer is `2`.

Constraints

1 <= n = len(weights) <= 10^5
1 <= weights[i] <= 10^5
0 <= target < sum(weights)
The total sum of weights fits in a 64-bit integer.

Examples

Input: ([1, 3], 0)

Expected Output: 0

Explanation: Prefix sums [1, 4]. target=0 < 1, so it lands in bucket 0 (range [0,1)).

Input: ([1, 3], 1)

Expected Output: 1

Explanation: Prefix sums [1, 4]. target=1 is the start of bucket 1 (range [1,4)), so index 1.

Input: ([1, 3], 3)

Expected Output: 1

Explanation: target=3 is still within bucket 1's range [1,4), so index 1. (sum is 4, valid targets are 0..3.)

Input: ([1, 2, 3, 4], 0)

Expected Output: 0

Explanation: Prefix sums [1,3,6,10]. target=0 is in bucket 0 (range [0,1)).

Input: ([1, 2, 3, 4], 2)

Expected Output: 1

Explanation: target=2 falls in bucket 1's range [1,3), so index 1.

Input: ([1, 2, 3, 4], 3)

Expected Output: 2

Explanation: target=3 is the start of bucket 2's range [3,6) — the boundary case the <= comparison must handle, so index 2.

Input: ([1, 2, 3, 4], 6)

Expected Output: 3

Explanation: target=6 is the start of bucket 3's range [6,10), so index 3 (which should be picked ~40% of the time).

Input: ([1, 2, 3, 4], 9)

Expected Output: 3

Explanation: target=9 is the largest valid target (sum=10), still in bucket 3's range [6,10), so index 3.

Input: ([5], 0)

Expected Output: 0

Explanation: Single weight: every target maps to index 0.

Input: ([5], 4)

Expected Output: 0

Explanation: Single weight, target=4 < 5: still index 0. Exercises the lo==hi base case of the binary search.

Input: ([2, 2, 2], 0)

Expected Output: 0

Explanation: Equal weights, prefix [2,4,6]. target=0 in bucket 0 ([0,2)).

Input: ([2, 2, 2], 2)

Expected Output: 1

Explanation: target=2 is the boundary into bucket 1 ([2,4)), so index 1.

Input: ([2, 2, 2], 5)

Expected Output: 2

Explanation: target=5 in bucket 2 ([4,6)), so index 2.

Hints

Precompute prefix sums P where P[i] = weights[0] + ... + weights[i]. The index i 'owns' the half-open range [P[i-1], P[i]) of target values (with P[-1] = 0).
You want the smallest index i with P[i] > target. This is a lower_bound / first-true binary search, not an exact-match search — the classic off-by-one trap is using `<` vs `<=` in the comparison.
Use the invariant `while lo < hi: mid = (lo+hi)//2; if P[mid] <= target: lo = mid+1 else: hi = mid`, then return lo. This converges lo to the first bucket strictly above target without needing a separate found/not-found case.

Last updated: Jun 24, 2026

Loading coding console...

Quick Overview

Weighted Random Sampling by Index

Company: Tubitv

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

Constraints

1 <= n = len(weights) <= 10^5
1 <= weights[i] <= 10^5
0 <= target < sum(weights)
The total sum of weights fits in a 64-bit integer.

Examples

Input: ([1, 3], 0)

Expected Output: 0

Explanation: Prefix sums [1, 4]. target=0 < 1, so it lands in bucket 0 (range [0,1)).

Input: ([1, 3], 1)

Expected Output: 1

Explanation: Prefix sums [1, 4]. target=1 is the start of bucket 1 (range [1,4)), so index 1.

Input: ([1, 3], 3)

Expected Output: 1

Explanation: target=3 is still within bucket 1's range [1,4), so index 1. (sum is 4, valid targets are 0..3.)

Input: ([1, 2, 3, 4], 0)

Expected Output: 0

Explanation: Prefix sums [1,3,6,10]. target=0 is in bucket 0 (range [0,1)).

Input: ([1, 2, 3, 4], 2)

Expected Output: 1

Explanation: target=2 falls in bucket 1's range [1,3), so index 1.

Input: ([1, 2, 3, 4], 3)

Expected Output: 2

Explanation: target=3 is the start of bucket 2's range [3,6) — the boundary case the <= comparison must handle, so index 2.

Input: ([1, 2, 3, 4], 6)

Expected Output: 3

Explanation: target=6 is the start of bucket 3's range [6,10), so index 3 (which should be picked ~40% of the time).

Input: ([1, 2, 3, 4], 9)

Expected Output: 3

Explanation: target=9 is the largest valid target (sum=10), still in bucket 3's range [6,10), so index 3.

Input: ([5], 0)

Expected Output: 0

Explanation: Single weight: every target maps to index 0.

Input: ([5], 4)

Expected Output: 0

Explanation: Single weight, target=4 < 5: still index 0. Exercises the lo==hi base case of the binary search.

Input: ([2, 2, 2], 0)

Expected Output: 0

Explanation: Equal weights, prefix [2,4,6]. target=0 in bucket 0 ([0,2)).

Input: ([2, 2, 2], 2)

Expected Output: 1

Explanation: target=2 is the boundary into bucket 1 ([2,4)), so index 1.

Input: ([2, 2, 2], 5)

Expected Output: 2

Explanation: target=5 in bucket 2 ([4,6)), so index 2.

Hints

Precompute prefix sums P where P[i] = weights[0] + ... + weights[i]. The index i 'owns' the half-open range [P[i-1], P[i]) of target values (with P[-1] = 0).
You want the smallest index i with P[i] > target. This is a lower_bound / first-true binary search, not an exact-match search — the classic off-by-one trap is using `<` vs `<=` in the comparison.
Use the invariant `while lo < hi: mid = (lo+hi)//2; if P[mid] <= target: lo = mid+1 else: hi = mid`, then return lo. This converges lo to the first bucket strictly above target without needing a separate found/not-found case.

Last updated: Jun 24, 2026

Loading coding console...