Implement scalable prime generator
Company: Pinterest
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: This question evaluates a candidate's ability to implement scalable prime generation and related algorithmic optimization, including algorithm selection, time and memory complexity reasoning, edge-case handling, and verification via unit tests.
Constraints
- 0 ≤ n ≤ 100000
- Use a sieve-based approach; repeated trial division per candidate is not efficient enough
- Aim for near-linear memory usage and about O(L log log L) time, where L is the sieve limit
Examples
Input: (0, False)
Expected Output: []
Explanation: There are no primes to return when n is 0.
Input: (1, False)
Expected Output: [2]
Explanation: The first prime number is 2.
Input: (5, False)
Expected Output: [2, 3, 5, 7, 11]
Explanation: The first five primes in ascending order are 2, 3, 5, 7, and 11.
Input: (10, False)
Expected Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
Explanation: These are the first ten prime numbers.
Input: (100, False)
Expected Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541]
Explanation: The 100th prime is 541, and the full returned list contains the first 100 primes.
Input: (100000, True)
Expected Output: [1299709]
Explanation: In compact mode, return only the 100000th prime. The known value is 1299709.
Solution
def solution(n, return_last_only=False):
import math
if n < 0:
raise ValueError("n must be non-negative")
if n == 0:
return []
if n == 1:
return [2]
def upper_bound(k):
if k < 6:
return 15
return int(k * (math.log(k) + math.log(math.log(k)))) + 10
limit = upper_bound(n)
while True:
size = (limit - 1) // 2
sieve = bytearray(b"\x01") * size
root = math.isqrt(limit)
max_idx = (root - 3) // 2
for i in range(max_idx + 1):
if sieve[i]:
p = 2 * i + 3
start = (p * p - 3) // 2
sieve[start::p] = b"\x00" * (((size - 1 - start) // p) + 1)
if return_last_only:
count = 1
for i, is_prime in enumerate(sieve):
if is_prime:
count += 1
if count == n:
return [2 * i + 3]
else:
primes = [2]
primes.extend(2 * i + 3 for i, is_prime in enumerate(sieve) if is_prime)
if len(primes) >= n:
return primes[:n]
limit *= 2Time complexity: O(L log log L), where L is the sieve bound and L is about n(log n + log log n). In practice, the biggest constants come from marking composite multiples in the sieve and scanning the sieve to build the answer.. Space complexity: O(L), using an odd-only sieve so the main array stores about L/2 candidate slots..
Hints
- For n ≥ 6, the nth prime is less than about n(log n + log log n), which is a good starting bound for your sieve.
- You can cut memory roughly in half by storing only odd numbers in the sieve.