How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Onsite rounds at Citadel.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Citadel during technical interviews.

Design dynamic weighted random sampling with updates

Quick Overview

This question evaluates a candidate's ability to design and analyze dynamic data structures that support weighted random sampling with inserts and deletes, testing skills in algorithm design, complexity analysis, and randomized sampling under large numeric constraints.

Company: Citadel

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

## Problem: Support weighted random sampling with insert/delete Design a data structure that maintains a dynamic set of items, each with a **positive integer weight**, and supports: 1. `insert(id, weight)` - Adds a new item `id` with the given `weight`. - If `id` already exists, you may either reject or treat it as an update (state your choice). 2. `delete(id)` - Removes the item if it exists. 3. `sample() -> id` - Returns an `id` chosen randomly such that: \[ P(id) = \frac{weight(id)}{\sum_j weight(j)} \] ### Requirements - Target time complexity: **O(log n)** per operation. - Must handle up to ~10^5 operations. - Weights can be large (e.g., up to 10^9); total weight may exceed 32-bit. - `sample()` should be unbiased given an ideal RNG. ### Notes - You may assume access to a function `randInt(1, totalWeight)` that returns a uniform integer in that range. - Clearly define how you map item IDs to internal indices as the set changes over time.

Quick Answer: This question evaluates a candidate's ability to design and analyze dynamic data structures that support weighted random sampling with inserts and deletes, testing skills in algorithm design, complexity analysis, and randomized sampling under large numeric constraints.

You are given a sequence of operations on a dynamic set of integer item IDs. Each active item has a positive integer weight. Implement a function that processes all operations and returns the result of every sample query. Operations: - ("insert", id, weight): Insert a new item or update the existing item's weight to the new positive value. - ("delete", id): Remove the item if it exists. If it does not exist, do nothing. - ("sample", r): Simulate weighted random sampling using a pre-generated random integer r. To make sampling deterministic for testing, active items are considered in increasing ID order. If their current weights are w1, w2, ..., then they occupy cumulative intervals [1, w1], [w1+1, w1+w2], and so on. A sample query returns the ID whose interval contains r. If the set is empty, or r is outside the range [1, totalWeight], return -1 for that sample. Your goal is to support up to about 10^5 operations efficiently.

Constraints

1 <= len(operations) <= 100000
1 <= weight <= 10^9 for insert operations
IDs are integers and may be sparse or large in magnitude
The total active weight can exceed 32-bit integer range
Insert on an existing ID must be treated as a weight update
Delete on a missing ID is a no-op

Examples

Input: [("insert", 10, 4), ("insert", 20, 6), ("sample", 1), ("sample", 4), ("sample", 5), ("sample", 10)]

Expected Output: [10, 10, 20, 20]

Explanation: In increasing ID order, 10 owns [1,4] and 20 owns [5,10].

Input: [("insert", 5, 3), ("insert", 2, 2), ("sample", 4), ("insert", 5, 1), ("sample", 2), ("delete", 2), ("sample", 1), ("delete", 5), ("sample", 1)]

Expected Output: [5, 2, 5, -1]

Explanation: Insert on existing ID updates its weight. After all deletions, sampling from an empty set returns -1.

Input: [("delete", 7), ("insert", 7, 1000000000), ("sample", 1), ("sample", 1000000000), ("insert", 3, 1000000000), ("sample", 1000000000), ("sample", 1000000001)]

Expected Output: [7, 7, 3, 7]

Explanation: This checks large weights and boundary values. After inserting 3, IDs are ordered as 3 then 7.

Input: [("insert", 1, 5), ("insert", 4, 2), ("insert", 3, 4), ("delete", 4), ("sample", 8), ("insert", 4, 10), ("sample", 9), ("insert", 1, 1), ("sample", 1), ("sample", 5)]

Expected Output: [3, 3, 1, 3]

Explanation: This case mixes inserts, updates, deletes, and boundary prefix sums.

Input: [("sample", 1), ("sample", 2)]

Expected Output: [-1, -1]

Explanation: Sampling from an always-empty structure returns -1 each time.

Solution

def solution(operations):
    ids = sorted({op[1] for op in operations if op[0] in ("insert", "delete")})
    n = len(ids)
    index = {item_id: i + 1 for i, item_id in enumerate(ids)}

    bit = [0] * (n + 1)
    current = {}
    total_weight = 0

    def add(i, delta):
        nonlocal total_weight
        total_weight += delta
        while i <= n:
            bit[i] += delta
            i += i & -i

    def find_by_prefix(k):
        idx = 0
        step = 1 << (n.bit_length() - 1)
        while step:
            nxt = idx + step
            if nxt <= n and bit[nxt] < k:
                idx = nxt
                k -= bit[nxt]
            step >>= 1
        return idx + 1

    answer = []

    for op in operations:
        typ = op[0]

        if typ == "insert":
            _, item_id, weight = op
            old = current.get(item_id, 0)
            current[item_id] = weight
            add(index[item_id], weight - old)

        elif typ == "delete":
            _, item_id = op
            old = current.pop(item_id, 0)
            if old:
                add(index[item_id], -old)

        else:  # sample
            _, r = op
            if total_weight <= 0 or r < 1 or r > total_weight:
                answer.append(-1)
            else:
                pos = find_by_prefix(r)
                answer.append(ids[pos - 1])

    return answer

Time complexity: O(m log m) overall for m operations, with O(log m) per update/sample after coordinate compression. Space complexity: O(m).

Hints

Compress item IDs into fixed indices, then store current weights in a Fenwick tree or segment tree.
A sample query is equivalent to finding the first index whose prefix-sum weight is at least r.

Quick Overview