Implement np.random.choice
Company: Pinterest
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: easy
Interview Round: Technical Screen
Implement a function similar to `numpy.random.choice`.
## Function signature
`choice(a, size=None, replace=True, p=None) -> samples`
## Requirements
- `a` can be:
- an integer `n` meaning sample from `[0, 1, ..., n-1]`, or
- a list/array of items to sample from.
- `size` can be:
- `None` (return a single item), or
- an integer `k` (return a list of length `k`).
- `replace`:
- if `True`, sampling is with replacement.
- if `False`, sampling is without replacement (no duplicates). Assume `k <= len(a)`.
- `p`:
- optional probability weights aligned with `a`.
- if `None`, use uniform distribution.
## Tasks
1. Describe an algorithm to sample **with replacement** (uniform and weighted).
2. Describe an algorithm to sample **without replacement** (uniform and weighted).
3. Discuss time complexity and edge cases (invalid `p`, floating point sum != 1, `size=None`, etc.).
Quick Answer: This question evaluates understanding of random sampling algorithms, probability distributions (uniform and weighted), handling of weights and edge-case validation, and engineering trade-offs in data-structure-aware implementation for both integer ranges and explicit collections.
Part 1: Deterministic Choice With Replacement
Implement deterministic sampling with replacement for a numpy-like `choice`. Instead of generating random numbers inside the function, you are given `random_values`, a list of floats in `[0, 1)`. If `p` is `None`, sample uniformly. Otherwise, use `p` as weights. If `a` is an integer `n`, sample from `[0, 1, ..., n-1]`; if `a` is a list/tuple, sample from its items. If `size` is `None`, return one sampled item; otherwise return a list of `size` sampled items. For weighted sampling, a random value `r` selects the first cumulative probability strictly greater than `r`.
Constraints
- If `a` is an integer, `1 <= a <= 10^5`.
- If `a` is a list/tuple, `1 <= len(a) <= 10^5`.
- `size` is `None` or `0 <= size <= 10^5`.
- `random_values` contains at least 1 value when `size is None`, otherwise at least `size` values.
- Each random value satisfies `0 <= r < 1`.
- If `p` is provided, it has the same length as the population, all weights are non-negative, and the total weight is positive.
Examples
Input: (5, 4, None, [0.02, 0.4, 0.99, 0.6])
Expected Output: [0, 2, 4, 3]
Explanation: Uniform sampling from `[0,1,2,3,4]` with indices `0,2,4,3`.