PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates understanding of random sampling algorithms, probability distributions (uniform and weighted), handling of weights and edge-case validation, and engineering trade-offs in data-structure-aware implementation for both integer ranges and explicit collections.

  • easy
  • Pinterest
  • Coding & Algorithms
  • Data Scientist

Implement np.random.choice

Company: Pinterest

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: easy

Interview Round: Technical Screen

Implement a function similar to `numpy.random.choice`. ## Function signature `choice(a, size=None, replace=True, p=None) -> samples` ## Requirements - `a` can be: - an integer `n` meaning sample from `[0, 1, ..., n-1]`, or - a list/array of items to sample from. - `size` can be: - `None` (return a single item), or - an integer `k` (return a list of length `k`). - `replace`: - if `True`, sampling is with replacement. - if `False`, sampling is without replacement (no duplicates). Assume `k <= len(a)`. - `p`: - optional probability weights aligned with `a`. - if `None`, use uniform distribution. ## Tasks 1. Describe an algorithm to sample **with replacement** (uniform and weighted). 2. Describe an algorithm to sample **without replacement** (uniform and weighted). 3. Discuss time complexity and edge cases (invalid `p`, floating point sum != 1, `size=None`, etc.).

Quick Answer: This question evaluates understanding of random sampling algorithms, probability distributions (uniform and weighted), handling of weights and edge-case validation, and engineering trade-offs in data-structure-aware implementation for both integer ranges and explicit collections.

Part 1: Deterministic Choice With Replacement

Implement deterministic sampling with replacement for a numpy-like `choice`. Instead of generating random numbers inside the function, you are given `random_values`, a list of floats in `[0, 1)`. If `p` is `None`, sample uniformly. Otherwise, use `p` as weights. If `a` is an integer `n`, sample from `[0, 1, ..., n-1]`; if `a` is a list/tuple, sample from its items. If `size` is `None`, return one sampled item; otherwise return a list of `size` sampled items. For weighted sampling, a random value `r` selects the first cumulative probability strictly greater than `r`.

Constraints

  • If `a` is an integer, `1 <= a <= 10^5`.
  • If `a` is a list/tuple, `1 <= len(a) <= 10^5`.
  • `size` is `None` or `0 <= size <= 10^5`.
  • `random_values` contains at least 1 value when `size is None`, otherwise at least `size` values.
  • Each random value satisfies `0 <= r < 1`.
  • If `p` is provided, it has the same length as the population, all weights are non-negative, and the total weight is positive.

Examples

Input: (5, 4, None, [0.02, 0.4, 0.99, 0.6])

Expected Output: [0, 2, 4, 3]

Explanation: Uniform sampling from `[0,1,2,3,4]` with indices `0,2,4,3`.

Input: (['a', 'b', 'c'], 5, [0.1, 0.3, 0.6], [0.05, 0.1, 0.39, 0.4, 0.95])

Expected Output: ['a', 'b', 'b', 'c', 'c']

Explanation: Cumulative probabilities are `[0.1, 0.4, 1.0]`; note that boundary values move to the next bucket because we choose the first cumulative value strictly greater than `r`.

Input: (['x'], None, None, [0.73])

Expected Output: 'x'

Explanation: Edge case: single-element population and `size=None`.

Input: (4, None, [0.25, 0.25, 0.25, 0.25], [0.74])

Expected Output: 2

Explanation: Weighted but uniform probabilities; `0.74` falls into the third bucket.

Hints

  1. For uniform sampling with replacement, map `r` to an index with `int(r * n)`.
  2. For weighted sampling, build cumulative probabilities once, then use binary search for each draw.

Part 2: Deterministic Choice Without Replacement

Implement deterministic sampling without replacement for a numpy-like `choice`. You are given `random_values`, a list of floats in `[0, 1)` that must be used in order. After each draw, the chosen item is removed from future draws. If `p` is `None`, sample uniformly from the remaining items. Otherwise, use the remaining weights, renormalizing them after each removal. If `a` is an integer `n`, sample from `[0, 1, ..., n-1]`; if `a` is a list/tuple, sample from its items. If `size` is `None`, return one sampled item; otherwise return a list of sampled items.

Constraints

  • If `a` is an integer, `1 <= a <= 2000`.
  • If `a` is a list/tuple, `1 <= len(a) <= 2000`.
  • `size` is `None` or `0 <= size <= n`, where `n` is the population size.
  • Each random value satisfies `0 <= r < 1`.
  • If `p` is provided, it has the same length as the population, all weights are non-negative, the total weight is positive, and there are enough positive-weight items to finish all draws.

Examples

Input: (5, 3, None, [0.1, 0.7, 0.2])

Expected Output: [0, 3, 1]

Explanation: Remaining items shrink after each draw: `[0,1,2,3,4] -> [1,2,3,4] -> [1,2,4]`.

Input: (['a', 'b', 'c'], 3, [0.2, 0.5, 0.3], [0.6, 0.1, 0.8])

Expected Output: ['b', 'a', 'c']

Explanation: Draw `b` first, then renormalize weights of `a` and `c`, then continue.

Input: (4, 2, [0.1, 0.2, 0.3, 0.4], [0.95, 0.4])

Expected Output: [3, 1]

Explanation: First draw picks item `3`; remaining weights become `[0.1, 0.2, 0.3]` and are renormalized.

Input: (['x'], None, None, [0.99])

Expected Output: 'x'

Explanation: Edge case: one element and a single draw.

Hints

  1. For the uniform case, think of choosing an index from the remaining array, then removing that item.
  2. For the weighted case, after removing an item you must also remove its weight and rebuild the cumulative distribution for the next draw.

Part 3: Validate and Analyze a Choice Call

Write a validator/analyzer for a numpy-like `choice` call. Implement `solution(a, size, replace, p)`. If the parameters are invalid, return one of these error code strings: `'invalid_a'`, `'invalid_size'`, `'invalid_p_length'`, `'invalid_p_values'`, or `'sample_too_large'`. Otherwise return a tuple `(n, k, mode, complexity, normalized_p)`, where `n` is the population size, `k` is the effective sample count (`1` if `size is None`), `mode` is one of `'uniform_with'`, `'weighted_with'`, `'uniform_without'`, `'weighted_without'`, `complexity` is the worst-case time complexity of the straightforward algorithms from Parts 1 and 2, and `normalized_p` is `None` or the normalized probability list rounded to 12 decimal places. If `p` is provided and sums to a positive value but not exactly 1, normalize it instead of rejecting it.

Constraints

  • `a` is either an integer or a list/tuple.
  • If `a` is an integer, it must be positive; if it is a list/tuple, it must be non-empty.
  • `size` must be `None` or a non-negative integer.
  • If `replace` is `False`, then `k <= n` must hold.
  • If `p` is provided, it must match the population length, contain only non-negative numbers, and have positive total sum.
  • For weighted sampling without replacement, there must be at least `k` positive-probability items.
  • Return normalized probabilities rounded to 12 decimal places.

Examples

Input: (5, None, True, None)

Expected Output: (5, 1, 'uniform_with', 'O(k)', None)

Explanation: Single uniform draw with replacement.

Input: (['a', 'b', 'c'], 2, True, [2, 1, 1])

Expected Output: (3, 2, 'weighted_with', 'O(n + k log n)', [0.5, 0.25, 0.25])

Explanation: The weights are normalized before analysis.

Input: ([10, 20], 3, False, None)

Expected Output: 'sample_too_large'

Explanation: Cannot draw 3 items without replacement from a population of 2.

Input: (['x', 'y'], 1, False, [0.3])

Expected Output: 'invalid_p_length'

Explanation: Probability list length must match the population size.

Input: (4, 2, False, [0, 0, 0, 0])

Expected Output: 'invalid_p_values'

Explanation: Probability weights must have positive total sum.

Hints

  1. First normalize `a` into a population size `n`, and convert `size=None` into an effective sample count `k=1`.
  2. Handle validation before deciding the mode and the associated complexity string.
Last updated: Apr 21, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Hierarchical Access Control for an Advertising Platform - Pinterest (medium)
  • First Word Matching Each Prefix Query - Pinterest (medium)
  • Maximize Boxes Stored Through One Entrance - Pinterest (medium)
  • Solve Multiple Coding Interview Problems - Pinterest (medium)
  • Implement a Sparse Matrix Class - Pinterest (medium)