How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a easy difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Pinterest.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Pinterest during technical interviews.

Implement np.random.choice | Pinterest Coding Question

Quick Overview

This question evaluates understanding of random sampling algorithms, probability distributions (uniform and weighted), handling of weights and edge-case validation, and engineering trade-offs in data-structure-aware implementation for both integer ranges and explicit collections.

Implement np.random.choice

Company: Pinterest

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: easy

Interview Round: Technical Screen

Implement a function similar to `numpy.random.choice`. ## Function signature `choice(a, size=None, replace=True, p=None) -> samples` ## Requirements - `a` can be: - an integer `n` meaning sample from `[0, 1, ..., n-1]`, or - a list/array of items to sample from. - `size` can be: - `None` (return a single item), or - an integer `k` (return a list of length `k`). - `replace`: - if `True`, sampling is with replacement. - if `False`, sampling is without replacement (no duplicates). Assume `k <= len(a)`. - `p`: - optional probability weights aligned with `a`. - if `None`, use uniform distribution. ## Tasks 1. Describe an algorithm to sample **with replacement** (uniform and weighted). 2. Describe an algorithm to sample **without replacement** (uniform and weighted). 3. Discuss time complexity and edge cases (invalid `p`, floating point sum != 1, `size=None`, etc.).

Quick Answer: This question evaluates understanding of random sampling algorithms, probability distributions (uniform and weighted), handling of weights and edge-case validation, and engineering trade-offs in data-structure-aware implementation for both integer ranges and explicit collections.

Part 1: Deterministic Choice With Replacement

Implement deterministic sampling with replacement for a numpy-like `choice`. Instead of generating random numbers inside the function, you are given `random_values`, a list of floats in `[0, 1)`. If `p` is `None`, sample uniformly. Otherwise, use `p` as weights. If `a` is an integer `n`, sample from `[0, 1, ..., n-1]`; if `a` is a list/tuple, sample from its items. If `size` is `None`, return one sampled item; otherwise return a list of `size` sampled items. For weighted sampling, a random value `r` selects the first cumulative probability strictly greater than `r`.

Constraints

If `a` is an integer, `1 <= a <= 10^5`.
If `a` is a list/tuple, `1 <= len(a) <= 10^5`.
`size` is `None` or `0 <= size <= 10^5`.
`random_values` contains at least 1 value when `size is None`, otherwise at least `size` values.
Each random value satisfies `0 <= r < 1`.
If `p` is provided, it has the same length as the population, all weights are non-negative, and the total weight is positive.

Examples

Input: (5, 4, None, [0.02, 0.4, 0.99, 0.6])

Expected Output: [0, 2, 4, 3]

Explanation: Uniform sampling from `[0,1,2,3,4]` with indices `0,2,4,3`.

Input: (['a', 'b', 'c'], 5, [0.1, 0.3, 0.6], [0.05, 0.1, 0.39, 0.4, 0.95])

Expected Output: ['a', 'b', 'b', 'c', 'c']

Explanation: Cumulative probabilities are `[0.1, 0.4, 1.0]`; note that boundary values move to the next bucket because we choose the first cumulative value strictly greater than `r`.

Input: (['x'], None, None, [0.73])

Expected Output: 'x'

Explanation: Edge case: single-element population and `size=None`.

Input: (4, None, [0.25, 0.25, 0.25, 0.25], [0.74])

Expected Output: 2

Explanation: Weighted but uniform probabilities; `0.74` falls into the third bucket.

Hints

For uniform sampling with replacement, map `r` to an index with `int(r * n)`.
For weighted sampling, build cumulative probabilities once, then use binary search for each draw.

Part 2: Deterministic Choice Without Replacement

Implement deterministic sampling without replacement for a numpy-like `choice`. You are given `random_values`, a list of floats in `[0, 1)` that must be used in order. After each draw, the chosen item is removed from future draws. If `p` is `None`, sample uniformly from the remaining items. Otherwise, use the remaining weights, renormalizing them after each removal. If `a` is an integer `n`, sample from `[0, 1, ..., n-1]`; if `a` is a list/tuple, sample from its items. If `size` is `None`, return one sampled item; otherwise return a list of sampled items.

Constraints

If `a` is an integer, `1 <= a <= 2000`.
If `a` is a list/tuple, `1 <= len(a) <= 2000`.
`size` is `None` or `0 <= size <= n`, where `n` is the population size.
Each random value satisfies `0 <= r < 1`.
If `p` is provided, it has the same length as the population, all weights are non-negative, the total weight is positive, and there are enough positive-weight items to finish all draws.

Examples

Input: (5, 3, None, [0.1, 0.7, 0.2])

Expected Output: [0, 3, 1]

Explanation: Remaining items shrink after each draw: `[0,1,2,3,4] -> [1,2,3,4] -> [1,2,4]`.

Input: (['a', 'b', 'c'], 3, [0.2, 0.5, 0.3], [0.6, 0.1, 0.8])

Expected Output: ['b', 'a', 'c']

Explanation: Draw `b` first, then renormalize weights of `a` and `c`, then continue.

Input: (4, 2, [0.1, 0.2, 0.3, 0.4], [0.95, 0.4])

Expected Output: [3, 1]

Explanation: First draw picks item `3`; remaining weights become `[0.1, 0.2, 0.3]` and are renormalized.

Input: (['x'], None, None, [0.99])

Expected Output: 'x'

Explanation: Edge case: one element and a single draw.

Hints

For the uniform case, think of choosing an index from the remaining array, then removing that item.
For the weighted case, after removing an item you must also remove its weight and rebuild the cumulative distribution for the next draw.

Part 3: Validate and Analyze a Choice Call

Write a validator/analyzer for a numpy-like `choice` call. Implement `solution(a, size, replace, p)`. If the parameters are invalid, return one of these error code strings: `'invalid_a'`, `'invalid_size'`, `'invalid_p_length'`, `'invalid_p_values'`, or `'sample_too_large'`. Otherwise return a tuple `(n, k, mode, complexity, normalized_p)`, where `n` is the population size, `k` is the effective sample count (`1` if `size is None`), `mode` is one of `'uniform_with'`, `'weighted_with'`, `'uniform_without'`, `'weighted_without'`, `complexity` is the worst-case time complexity of the straightforward algorithms from Parts 1 and 2, and `normalized_p` is `None` or the normalized probability list rounded to 12 decimal places. If `p` is provided and sums to a positive value but not exactly 1, normalize it instead of rejecting it.

Constraints

`a` is either an integer or a list/tuple.
If `a` is an integer, it must be positive; if it is a list/tuple, it must be non-empty.
`size` must be `None` or a non-negative integer.
If `replace` is `False`, then `k <= n` must hold.
If `p` is provided, it must match the population length, contain only non-negative numbers, and have positive total sum.
For weighted sampling without replacement, there must be at least `k` positive-probability items.
Return normalized probabilities rounded to 12 decimal places.

Examples

Input: (5, None, True, None)

Expected Output: (5, 1, 'uniform_with', 'O(k)', None)

Explanation: Single uniform draw with replacement.

Input: (['a', 'b', 'c'], 2, True, [2, 1, 1])

Expected Output: (3, 2, 'weighted_with', 'O(n + k log n)', [0.5, 0.25, 0.25])

Explanation: The weights are normalized before analysis.

Input: ([10, 20], 3, False, None)

Expected Output: 'sample_too_large'

Explanation: Cannot draw 3 items without replacement from a population of 2.

Input: (['x', 'y'], 1, False, [0.3])

Expected Output: 'invalid_p_length'

Explanation: Probability list length must match the population size.

Input: (4, 2, False, [0, 0, 0, 0])

Expected Output: 'invalid_p_values'

Explanation: Probability weights must have positive total sum.

Hints

First normalize `a` into a population size `n`, and convert `size=None` into an effective sample count `k=1`.
Handle validation before deciding the mode and the associated complexity string.

Quick Overview