PracHub

Quick Overview

This question evaluates a data scientist's competence in statistical inference using bootstrap resampling, proficiency with numerical computing for large sample operations, and attention to performance optimization.

Calculate 95% Bootstrap Confidence Interval for Order Values

Company: Pinterest

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

##### Scenario

An e-commerce firm wants a 95% confidence interval for the average order value but only has a single historical sample of order amounts.

##### Question

Given an array of past order values, write efficient Python code to return the 95% bootstrap confidence interval using 10,000 resamples. Explain your approach and any performance optimizations.

##### Hints

Use vectorized resampling (np.random.choice) and percentile bounds; avoid Python loops.

Given a non-empty list of historical order values (floats), compute a two-sided 95% bootstrap confidence interval for the mean using exactly 10,000 resamples with replacement. Use NumPy's Generator-based RNG for reproducibility: numpy.random.default_rng(seed).choice. Return the 2.5th and 97.5th percentile bounds of the bootstrap sample means as a list [low, high], rounded to 6 decimal places. If the list has one unique value, the interval is that value for both bounds.
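As a small illustration of the percentile step described above, the sketch below applies np.percentile to a stand-in array. The values 0.0 through 100.0 are made up for the example and are not real bootstrap means; they are chosen only so the bounds are easy to read off.

```python
import numpy as np

# Stand-in for an array of bootstrap sample means (illustrative values only).
means = np.arange(101, dtype=float)  # 0.0, 1.0, ..., 100.0

# Two-sided 95% interval: cut 2.5% from each tail of the distribution.
low, high = np.percentile(means, [2.5, 97.5])

# Round to 6 decimals, matching the required return format [low, high].
interval = [round(float(low), 6), round(float(high), 6)]
print(interval)
```

With NumPy's default linear interpolation, the bounds land at roughly 2.5 and 97.5 for this array, so about 95% of the stand-in values fall between them.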

Constraints

  • 1 <= len(order_values) <= 5000
  • Order values are finite, non-negative floats
  • Use exactly B = 10,000 bootstrap resamples with replacement
  • RNG must be numpy.random.default_rng(seed) for determinism
  • Percentile bounds are [2.5, 97.5]
  • Return a list of two floats rounded to 6 decimals

Solution

def bootstrap_ci_95(order_values: list[float], seed: int = 42) -> list[float]:
    import numpy as np

    arr = np.asarray(order_values, dtype=float)
    if arr.size == 0:
        raise ValueError("order_values must be non-empty")

    B = 10000
    n = arr.size
    rng = np.random.default_rng(seed)

    # Choose a batch size to balance memory and speed
    # Ensures batch * n is bounded to keep memory reasonable
    max_draws = 5_000_000  # adjust as needed for environment
    batch = max(1, min(B, int(max_draws // max(1, n))))

    means = np.empty(B, dtype=float)
    start = 0
    while start < B:
        bs = min(batch, B - start)
        samples = rng.choice(arr, size=(bs, n), replace=True)
        means[start:start + bs] = samples.mean(axis=1)
        start += bs

    low, high = np.percentile(means, [2.5, 97.5])
    return [round(float(low), 6), round(float(high), 6)]

Explanation

Convert the input to a NumPy array. Use default_rng(seed) for reproducible sampling. Generate bootstrap resamples with replacement in vectorized batches to manage memory. For each batch, compute row-wise means and store them. After collecting 10,000 bootstrap means, compute the 2.5th and 97.5th percentiles to form the two-sided 95% confidence interval. Round both bounds to 6 decimals before returning.

Time complexity: O(B * n). Space complexity: O(B + batch * n).
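To make the space bound concrete, the following sketch repeats the batch-size rule from the solution and reports approximate memory use. The helper name batch_plan is hypothetical, and the figures assume 8-byte float64 values and count only the resample matrix, not temporary arrays.

```python
def batch_plan(n: int, B: int = 10_000, max_draws: int = 5_000_000) -> dict:
    """Mirror the solution's batch-size rule and estimate float64 memory."""
    batch = max(1, min(B, max_draws // max(1, n)))
    bytes_per_float = 8  # float64
    return {
        "batch": batch,
        "num_batches": -(-B // batch),  # ceiling division
        "batch_mib": batch * n * bytes_per_float / 2**20,
        "full_mib": B * n * bytes_per_float / 2**20,
    }

# Smallest, mid-range, and largest input sizes allowed by the constraints.
for n in (1, 500, 5000):
    print(n, batch_plan(n))
```

At the largest allowed input (n = 5000), the rule yields 10 batches of 1,000 resamples, roughly 38 MiB per batch instead of about 381 MiB for the full 10,000 x 5,000 matrix. For n up to 500, all 10,000 resamples fit in a single batch.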

Hints

  1. Use numpy.random.default_rng(seed).choice to generate resamples in a vectorized way.
  2. Compute means along axis=1 and then np.percentile at [2.5, 97.5].
  3. To limit memory, generate resamples in batches (e.g., 1000 at a time) while keeping vectorization within each batch.
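The hints can be combined into a compact variant, sketched below under a few assumptions. It draws integer indices with rng.integers instead of calling rng.choice, uses a fixed batch size of 1,000, and runs on made-up order values. Because it uses a different RNG call than the constraints require, it is not guaranteed to reproduce the reference solution's exact numbers; treat it as an illustration of the technique, not a drop-in submission. The name bootstrap_ci_sketch is hypothetical.

```python
import numpy as np

def bootstrap_ci_sketch(values, seed=42, B=10_000, batch=1_000):
    """Percentile bootstrap CI for the mean, drawing indices in batches."""
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        raise ValueError("values must be non-empty")
    rng = np.random.default_rng(seed)
    n = arr.size
    means = np.empty(B, dtype=float)
    for start in range(0, B, batch):
        stop = min(start + batch, B)
        # Each row of idx is one resample of n indices, drawn with replacement.
        idx = rng.integers(0, n, size=(stop - start, n))
        means[start:stop] = arr[idx].mean(axis=1)
    low, high = np.percentile(means, [2.5, 97.5])
    return [round(float(low), 6), round(float(high), 6)]

orders = [12.5, 40.0, 7.25, 99.0, 15.0, 22.75, 60.0, 18.5]  # made-up data
ci = bootstrap_ci_sketch(orders, seed=0)
print(ci)

# A list with a single unique value collapses to a zero-width interval.
print(bootstrap_ci_sketch([5.0] * 10, seed=0))
```

The loop runs only B / batch times (10 here), so the per-resample work stays vectorized inside NumPy while the peak memory is bounded by one batch.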

Last updated: Mar 29, 2026

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Assign Pins to Shortest Columns - Pinterest (medium)
  • Design Hierarchical Permission Checks - Pinterest (medium)
  • Implement weighted random choice - Pinterest (medium)
  • Solve five hard algorithm problems - Pinterest
  • Sample a string by real-valued scores - Pinterest (hard)