PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question tests a candidate's ability to implement numerically stable online algorithms for computing Shannon entropy over streaming data. It evaluates mastery of floating-point precision, O(1) incremental state maintenance, and information-theoretic concepts applied to unbounded sequences — skills central to machine learning engineering roles.

  • hard
  • OpenAI
  • Coding & Algorithms
  • Machine Learning Engineer

Streaming Entropy with Numerical Stability

Company: OpenAI

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

## Streaming Entropy with Numerical Stability You are processing a very large, possibly unbounded, stream of category labels (for example, token IDs emitted by a language model, or event types in a telemetry pipeline). You cannot hold the full stream in memory, and the number of distinct categories may be large but is far smaller than the total number of observations. Your task is to maintain, in an online fashion, the **Shannon entropy** of the empirical distribution of the labels seen so far. After observing $n$ labels where category $i$ has been seen $c_i$ times (so $\sum_i c_i = n$), the empirical probability of category $i$ is $p_i = c_i / n$, and the Shannon entropy in **nats** (natural log) is: $$H = -\sum_{i} p_i \ln p_i = -\sum_{i:\,c_i>0} \frac{c_i}{n} \ln \frac{c_i}{n}.$$ Implement a `StreamingEntropy` class (or equivalent) that supports: 1. `update(label)` — incorporate one observed label into the running statistics. 2. `entropy()` — return the current Shannon entropy in nats of all labels observed so far. Before any label is observed, the entropy is defined to be $0.0$. **Complexity requirements.** Each `update` must run in $O(1)$ amortized time independent of the number of distinct categories. Each `entropy()` must run in $O(1)$ time — it must not re-scan all category counts. **Numerical-stability requirements.** A naive implementation that recomputes $-\sum_i (c_i/n)\ln(c_i/n)$ by iterating over all counts on every `entropy()` call is too slow and is the baseline you must improve on. The representation you maintain must be robust to: - Categories whose count is $0$ must never contribute a $0 \cdot \ln 0$ term (which is undefined); the limit of $p \ln p$ as $p \to 0$ is $0$ and must be used. - Computing $\ln(c_i / n)$ as the log of a very small floating-point number should be avoided where it loses precision; prefer algebraically equivalent forms. - The running state used to derive entropy must not suffer catastrophic precision loss as $n$ grows into the billions and many tiny per-category contributions accumulate. Your solution must pass tests at $n \approx 10^9$ to an absolute tolerance of $10^{-6}$ nats; a plain double-precision accumulation that is not numerically compensated will not meet this bar. ### Part 1 — Single-label updates Design and implement `StreamingEntropy` meeting the constraints above for single-label ingestion. You may maintain whatever internal state you choose, but the per-`update` and `entropy()` time bounds must hold. ### Part 2 — Block-wise (batched) updates In the second variant, labels do not arrive one at a time. Instead they arrive in **blocks**: each call delivers a batch of (label, count) pairs — for example, a block summary `{"a": 1000, "b": 3, "c": 50000}` meaning category `a` was seen 1000 times, `b` 3 times, and `c` 50000 times in that block. Add `update_block(block)` where `block` is a collection of (label, count) pairs with positive integer counts. A block with $k$ distinct categories must be incorporated in $O(k)$ time — you may not re-scan the full set of distinct categories accumulated across all prior blocks. The entropy returned after a sequence of blocks must equal the entropy of the merged multiset of all labels, computed as if every label had been streamed one at a time. ### Input / Output Implement the class with the methods above. The harness will issue a sequence of operations and assert that `entropy()` matches the reference value (in nats) to within $10^{-6}$ absolute, including the empty-stream case ($H = 0.0$) and single-category cases ($H = 0.0$ when all mass is on one label).

Quick Answer: This question tests a candidate's ability to implement numerically stable online algorithms for computing Shannon entropy over streaming data. It evaluates mastery of floating-point precision, O(1) incremental state maintenance, and information-theoretic concepts applied to unbounded sequences — skills central to machine learning engineering roles.

Streaming Entropy (Single-Label Updates)

Maintain the Shannon entropy (in **nats**, natural log) of the empirical distribution of category labels seen so far in an online stream you cannot fully hold in memory.\n\nYou are given a list `operations`, each one of:\n- `["update", label]` — ingest one observed label.\n- `["entropy"]` — record the current entropy of everything seen so far (defined as `0.0` before any label).\n\nReturn the list of recorded entropy values, one per `"entropy"` op.\n\nAfter `n` labels with category `i` seen `c_i` times, `p_i = c_i / n` and `H = -sum_i p_i ln p_i`.\n\n**Constraints.** Each `update` must be O(1) amortized (independent of the number of distinct categories) and each `entropy()` must be O(1) — it may not re-scan all counts. A 0-count category contributes 0 (the p->0 limit of p ln p). Avoid taking the log of a tiny `c_i/n` directly; prefer the algebraically equivalent `H = ln(n) - (sum_i c_i ln c_i)/n`. The running state must stay accurate to 1e-6 nats even for `n` up to ~1e9, so a plain uncompensated double accumulation will not pass — use compensated (Kahan/Neumaier) summation.

Constraints

  • Entropy of the empty stream is 0.0.
  • Entropy is 0.0 whenever all mass is on a single category.
  • Each update() is O(1) amortized, independent of the number of distinct categories.
  • Each entropy() is O(1) and must not re-scan the per-category counts.
  • A category with count 0 contributes 0 (limit of p ln p as p -> 0).
  • Results must be within 1e-6 nats of the reference, including at n ~ 1e9.

Examples

Input: ([['entropy']],)

Expected Output: [0.0]

Explanation: Empty stream: entropy of nothing is defined as 0.0.

Input: ([['update', 'x'], ['update', 'x'], ['update', 'x'], ['entropy']],)

Expected Output: [0.0]

Explanation: All mass on one category -> entropy 0 (ln 1 = 0).

Input: ([['update', 'a'], ['update', 'b'], ['entropy']],)

Expected Output: [0.693147181]

Explanation: Two categories each with count 1 -> H = ln 2 = 0.6931... nats.

Input: ([['entropy'], ['update', 'a'], ['entropy'], ['update', 'b'], ['update', 'b'], ['entropy'], ['update', 'c'], ['entropy']],)

Expected Output: [0.0, 0.0, 0.636514168, 1.039720771]

Explanation: Progressive updates; entropy queried at several points along the stream.

Input: ([['update', 'red'], ['update', 'red'], ['update', 'blue'], ['update', 'green'], ['update', 'green'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'red'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'green'], ['update', 'red'], ['update', 'green'], ['update', 'yellow'], ['update', 'green'], ['update', 'yellow'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'yellow'], ['update', 'blue'], ['update', 'blue'], ['update', 'green'], ['update', 'green'], ['update', 'blue'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'red'], ['update', 'blue'], ['update', 'blue'], ['update', 'blue'], ['update', 'red'], ['update', 'yellow'], ['update', 'red'], ['update', 'yellow'], ['update', 'red'], ['update', 'blue'], ['update', 'blue'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'red'], ['update', 'yellow'], ['update', 'blue'], ['update', 'yellow'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'green'], ['update', 'green'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'red'], ['update', 'blue'], ['update', 'yellow'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'blue'], ['update', 'green'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'green'], ['update', 'blue'], ['update', 'green'], ['update', 'green'], ['update', 'blue'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'blue'], ['update', 'green'], ['update', 'green'], ['update', 'yellow'], ['update', 'red'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'green'], ['update', 'yellow'], ['update', 'red'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'blue'], ['update', 'red'], ['update', 'red'], ['update', 'blue'], ['update', 'blue'], ['update', 'red'], ['update', 'blue'], ['update', 'yellow'], ['update', 'green'], ['update', 'yellow'], ['update', 'red'], ['update', 'blue'], ['update', 'green'], ['update', 'red'], ['update', 'blue'], ['update', 'green'], ['update', 'green'], ['update', 'blue'], ['update', 'green'], ['update', 'red'], ['update', 'blue'], ['update', 'yellow'], ['update', 'red'], ['update', 'red'], ['update', 'blue'], ['update', 'blue'], ['update', 'green'], ['update', 'red'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'red'], ['update', 'green'], ['update', 'green'], ['update', 'yellow'], ['update', 'green'], ['update', 'blue'], ['update', 'yellow'], ['update', 'green'], ['update', 'green'], ['update', 'blue'], ['update', 'yellow'], ['update', 'blue'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'red'], ['update', 'green'], ['update', 'green'], ['update', 'red'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'blue'], ['update', 'yellow'], ['update', 'green'], ['update', 'green'], ['update', 'yellow'], ['update', 'green'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'blue'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'red'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'green'], ['update', 'green'], ['update', 'yellow'], ['update', 'green'], ['update', 'yellow'], ['update', 'green'], ['update', 'blue'], ['update', 'yellow'], ['update', 'green'], ['update', 'red'], ['update', 'yellow'], ['update', 'red'], ['update', 'red'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'green'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'green'], ['update', 'yellow'], ['update', 'red'], ['update', 'green'], ['update', 'yellow'], ['update', 'red'], ['update', 'yellow'], ['update', 'blue'], ['update', 'yellow'], ['update', 'blue'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'green'], ['update', 'green'], ['update', 'blue'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'blue'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'yellow'], ['update', 'red'], ['update', 'green'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'blue'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'yellow'], ['update', 'green'], ['update', 'blue'], ['update', 'yellow'], ['update', 'blue'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'blue'], ['update', 'blue'], ['update', 'green'], ['update', 'yellow'], ['update', 'blue'], ['update', 'red'], ['update', 'blue'], ['update', 'red'], ['update', 'green'], ['update', 'blue'], ['update', 'red'], ['update', 'red'], ['update', 'green'], ['update', 'blue'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'yellow'], ['update', 'blue'], ['update', 'red'], ['update', 'red'], ['update', 'yellow'], ['update', 'blue'], ['update', 'red'], ['update', 'red'], ['update', 'blue'], ['update', 'green'], ['update', 'blue'], ['update', 'green'], ['update', 'yellow'], ['update', 'yellow'], ['update', 'red'], ['update', 'red'], ['entropy']],)

Expected Output: [1.378517711]

Explanation: 300 updates over 4 labels; final empirical entropy near ln 4.

Hints

  1. Rewrite H = -sum (c_i/n) ln(c_i/n) = ln(n) - (1/n) sum c_i ln(c_i). Maintain only S = sum c_i ln c_i and the total n; then entropy is ln(n) - S/n in O(1).
  2. On update of a category from count c to c+1, S changes by (c+1)ln(c+1) - c ln c — an O(1) delta. No need to touch any other category.
  3. Naive double accumulation of S drifts past 1e-6 by n~1e9. Use Neumaier/Kahan compensated summation (keep a running compensation term) so the many small +/- deltas don't lose low-order bits.

Streaming Entropy (Block-Wise / Batched Updates)

Extend the streaming-entropy tracker so that labels can also arrive in **blocks**. `operations` now contains, in addition to the Part-1 ops, `["update_block", block]` where `block` is a mapping of `{label: count}` with positive integer counts — e.g. `{"a": 1000, "b": 3, "c": 50000}` means `a` was seen 1000 times, `b` 3 times, `c` 50000 times in that block.\n\nSupported ops:\n- `["update", label]` — ingest one label (as in Part 1).\n- `["update_block", {label: count, ...}]` — ingest a whole block.\n- `["entropy"]` — record the current entropy (0.0 before any label).\n\nReturn the list of recorded entropy values, one per `"entropy"` op.\n\nA block with `k` distinct categories must be incorporated in O(k) time — you may not re-scan the full set of categories accumulated across all prior blocks. The entropy after a sequence of blocks must equal the entropy of the merged multiset of all labels, exactly as if every label had been streamed one at a time. The same numerical-stability bar applies: accurate to 1e-6 nats with totals up to ~1e9, using compensated accumulation.

Constraints

  • update_block(block) incorporates a k-category block in O(k) time.
  • Entropy after any sequence of blocks equals the entropy of the merged multiset of all labels.
  • Single update() stays O(1); entropy() stays O(1) and must not re-scan counts.
  • Empty stream -> 0.0; all mass on one label -> 0.0.
  • Results within 1e-6 nats of the reference, including totals up to ~1e9.

Examples

Input: ([['entropy']],)

Expected Output: [0.0]

Explanation: Empty stream -> 0.0, even with block API available.

Input: ([['update_block', {'x': 1000000}], ['entropy']],)

Expected Output: [0.0]

Explanation: A single block puts all mass on one label -> entropy 0.

Input: ([['entropy'], ['update_block', {'a': 1000, 'b': 3, 'c': 50000}], ['entropy'], ['update', 'a'], ['update_block', {'d': 2, 'a': 7}], ['entropy'], ['update_block', {'b': 1}], ['entropy']],)

Expected Output: [0.0, 0.09713507, 0.098169038, 0.098355111]

Explanation: Interleaved blocks and a single update; entropy equals that of the merged multiset.

Input: ([['update', 'a'], ['update', 'a'], ['update_block', {'a': 1, 'b': 3}], ['entropy']],)

Expected Output: [0.693147181]

Explanation: Single updates then a block touching the same label; result matches streaming all labels one at a time.

Input: ([['update_block', {'cat0': 1179126, 'cat1': 816353, 'cat2': 1328004, 'cat3': 1865108, 'cat4': 601263, 'cat5': 651909, 'cat6': 1623826, 'cat7': 697405, 'cat8': 1266905, 'cat9': 1722195, 'cat10': 621632, 'cat11': 1564169, 'cat12': 950254, 'cat13': 578634, 'cat14': 680244, 'cat15': 1409420, 'cat16': 1376970, 'cat17': 646497, 'cat18': 1004706, 'cat19': 690238, 'cat20': 1655629, 'cat21': 1390281, 'cat22': 623963, 'cat23': 1685842, 'cat24': 759631, 'cat25': 968166, 'cat26': 1822518, 'cat27': 1815822, 'cat28': 1722633, 'cat29': 629734, 'cat30': 1710272, 'cat31': 1727969, 'cat32': 1331899, 'cat33': 603996, 'cat34': 963642, 'cat35': 597690, 'cat36': 1667410, 'cat37': 779287, 'cat38': 1107354, 'cat39': 1378998, 'cat40': 802524, 'cat41': 1633900, 'cat42': 747028, 'cat43': 1697292, 'cat44': 1146933, 'cat45': 1674944, 'cat46': 1930263, 'cat47': 879010, 'cat48': 716123, 'cat49': 1719703, 'cat50': 1697902, 'cat51': 1839898, 'cat52': 893994, 'cat53': 1280974, 'cat54': 704326, 'cat55': 1648703, 'cat56': 1993404, 'cat57': 631678, 'cat58': 1683566, 'cat59': 624992, 'cat60': 1798157, 'cat61': 931926, 'cat62': 1541056, 'cat63': 1926902, 'cat64': 1615098, 'cat65': 1396726, 'cat66': 1158814, 'cat67': 1476437, 'cat68': 1728012, 'cat69': 1450396, 'cat70': 1258293, 'cat71': 1128656, 'cat72': 1020988, 'cat73': 876998, 'cat74': 1965897, 'cat75': 1011907, 'cat76': 671662, 'cat77': 1704653, 'cat78': 1129668, 'cat79': 1601416, 'cat80': 1538334, 'cat81': 1220320, 'cat82': 1441273, 'cat83': 1103849, 'cat84': 1777079, 'cat85': 653513, 'cat86': 747601, 'cat87': 1573600, 'cat88': 1376867, 'cat89': 845950, 'cat90': 1217343, 'cat91': 818734, 'cat92': 1525429, 'cat93': 1384365, 'cat94': 582223, 'cat95': 1901350, 'cat96': 662781, 'cat97': 1670369, 'cat98': 1701722, 'cat99': 1157976, 'cat100': 1213288, 'cat101': 1958140, 'cat102': 1234377, 'cat103': 1746483, 'cat104': 1541602, 'cat105': 1716128, 'cat106': 1456731, 'cat107': 644206, 'cat108': 696285, 'cat109': 1066103, 'cat110': 1494256, 'cat111': 1961803, 'cat112': 1892828, 'cat113': 636314, 'cat114': 627233, 'cat115': 1971135, 'cat116': 1149293, 'cat117': 1857127, 'cat118': 1712041, 'cat119': 1928657, 'cat120': 1434576, 'cat121': 1096840, 'cat122': 1309063, 'cat123': 1902266, 'cat124': 1227722, 'cat125': 547317, 'cat126': 1468245, 'cat127': 1245462, 'cat128': 852422, 'cat129': 1781191, 'cat130': 745567, 'cat131': 1535349, 'cat132': 623636, 'cat133': 957614, 'cat134': 1102788, 'cat135': 771246, 'cat136': 1019285, 'cat137': 1334451, 'cat138': 1319880, 'cat139': 1541250, 'cat140': 668991, 'cat141': 848895, 'cat142': 1442014, 'cat143': 1342309, 'cat144': 1652259, 'cat145': 1082670, 'cat146': 787154, 'cat147': 1402869, 'cat148': 1653894, 'cat149': 1083891, 'cat150': 1981421, 'cat151': 1370939, 'cat152': 1252397, 'cat153': 1931774, 'cat154': 1297843, 'cat155': 983920, 'cat156': 816504, 'cat157': 674031, 'cat158': 869555, 'cat159': 817295, 'cat160': 986448, 'cat161': 1881009, 'cat162': 989341, 'cat163': 525298, 'cat164': 1517040, 'cat165': 1735481, 'cat166': 882400, 'cat167': 1051019, 'cat168': 1091251, 'cat169': 508584, 'cat170': 805505, 'cat171': 1378594, 'cat172': 1621118, 'cat173': 1274380, 'cat174': 1778869, 'cat175': 1687703, 'cat176': 1168177, 'cat177': 763174, 'cat178': 1948070, 'cat179': 1581063, 'cat180': 1795185, 'cat181': 1873564, 'cat182': 1918094, 'cat183': 613231, 'cat184': 1457651, 'cat185': 1927269, 'cat186': 1672877, 'cat187': 1322878, 'cat188': 1334812, 'cat189': 1336719, 'cat190': 1326529, 'cat191': 717133, 'cat192': 1509826, 'cat193': 1830201, 'cat194': 1339789, 'cat195': 630543, 'cat196': 899737, 'cat197': 641238, 'cat198': 937808, 'cat199': 1424061, 'cat200': 840374, 'cat201': 730536, 'cat202': 1213144, 'cat203': 1759816, 'cat204': 610259, 'cat205': 714705, 'cat206': 500489, 'cat207': 1688631, 'cat208': 817225, 'cat209': 1625370, 'cat210': 712786, 'cat211': 1262545, 'cat212': 1787100, 'cat213': 553479, 'cat214': 647462, 'cat215': 936108, 'cat216': 1787796, 'cat217': 1289010, 'cat218': 811532, 'cat219': 1830453, 'cat220': 1029022, 'cat221': 1228528, 'cat222': 1763071, 'cat223': 1263706, 'cat224': 1494367, 'cat225': 757618, 'cat226': 741913, 'cat227': 1523552, 'cat228': 1477250, 'cat229': 1507461, 'cat230': 1514674, 'cat231': 1154001, 'cat232': 680113, 'cat233': 802236, 'cat234': 714302, 'cat235': 1218559, 'cat236': 1055235, 'cat237': 1503742, 'cat238': 1951348, 'cat239': 838561, 'cat240': 1582831, 'cat241': 548435, 'cat242': 930367, 'cat243': 1607836, 'cat244': 1258649, 'cat245': 807447, 'cat246': 1947176, 'cat247': 1639115, 'cat248': 556712, 'cat249': 1607525, 'cat250': 1125139, 'cat251': 1848294, 'cat252': 690862, 'cat253': 1960030, 'cat254': 1047598, 'cat255': 1587157, 'cat256': 1269025, 'cat257': 850312, 'cat258': 1245948, 'cat259': 967230, 'cat260': 1616927, 'cat261': 1635748, 'cat262': 1554232, 'cat263': 1191357, 'cat264': 1834715, 'cat265': 967752, 'cat266': 1786032, 'cat267': 909250, 'cat268': 1002032, 'cat269': 1340296, 'cat270': 975507, 'cat271': 919258, 'cat272': 1585567, 'cat273': 1533438, 'cat274': 1245668, 'cat275': 560775, 'cat276': 558588, 'cat277': 1085983, 'cat278': 1490359, 'cat279': 1043528, 'cat280': 906102, 'cat281': 1952323, 'cat282': 1769068, 'cat283': 1222009, 'cat284': 1437904, 'cat285': 1232995, 'cat286': 1264696, 'cat287': 668900, 'cat288': 962343, 'cat289': 714239, 'cat290': 975730, 'cat291': 1485829, 'cat292': 912522, 'cat293': 1208286, 'cat294': 928602, 'cat295': 1512197, 'cat296': 1808762, 'cat297': 1779812, 'cat298': 504002, 'cat299': 1505528, 'cat300': 1869394, 'cat301': 1221434, 'cat302': 1848747, 'cat303': 677793, 'cat304': 1885348, 'cat305': 751456, 'cat306': 1314818, 'cat307': 1992108, 'cat308': 918003, 'cat309': 1502507, 'cat310': 874387, 'cat311': 1410006, 'cat312': 1833457, 'cat313': 1197339, 'cat314': 681927, 'cat315': 1330133, 'cat316': 1471318, 'cat317': 1341769, 'cat318': 678088, 'cat319': 833145, 'cat320': 856523, 'cat321': 766418, 'cat322': 557774, 'cat323': 816985, 'cat324': 1739023, 'cat325': 1475917, 'cat326': 1875434, 'cat327': 806549, 'cat328': 1782562, 'cat329': 1749630, 'cat330': 1494798, 'cat331': 1878391, 'cat332': 1234857, 'cat333': 826972, 'cat334': 1650623, 'cat335': 1649838, 'cat336': 774693, 'cat337': 544872, 'cat338': 529869, 'cat339': 1862466, 'cat340': 715528, 'cat341': 1604320, 'cat342': 792029, 'cat343': 1409764, 'cat344': 908536, 'cat345': 942587, 'cat346': 558707, 'cat347': 1028135, 'cat348': 946231, 'cat349': 1114395, 'cat350': 1551012, 'cat351': 1004447, 'cat352': 1729847, 'cat353': 1183649, 'cat354': 1043927, 'cat355': 1641590, 'cat356': 1378733, 'cat357': 774881, 'cat358': 627726, 'cat359': 1241938, 'cat360': 1460833, 'cat361': 1889310, 'cat362': 1723371, 'cat363': 1583726, 'cat364': 1382121, 'cat365': 1552034, 'cat366': 774230, 'cat367': 1615317, 'cat368': 818423, 'cat369': 1597872, 'cat370': 1570694, 'cat371': 539226, 'cat372': 1423008, 'cat373': 884005, 'cat374': 1776231, 'cat375': 508247, 'cat376': 814158, 'cat377': 861437, 'cat378': 796870, 'cat379': 1492986, 'cat380': 1798349, 'cat381': 752364, 'cat382': 1667013, 'cat383': 629510, 'cat384': 1183634, 'cat385': 1930952, 'cat386': 1587056, 'cat387': 1613013, 'cat388': 1664846, 'cat389': 1511848, 'cat390': 722527, 'cat391': 1675026, 'cat392': 619165, 'cat393': 1021130, 'cat394': 901198, 'cat395': 1080737, 'cat396': 588497, 'cat397': 704986, 'cat398': 1564753, 'cat399': 1448281, 'cat400': 1678031, 'cat401': 558438, 'cat402': 632894, 'cat403': 1429559, 'cat404': 1182861, 'cat405': 1784564, 'cat406': 1560221, 'cat407': 1771162, 'cat408': 1574080, 'cat409': 918178, 'cat410': 1952762, 'cat411': 1081300, 'cat412': 1448637, 'cat413': 1565680, 'cat414': 1618380, 'cat415': 1502514, 'cat416': 1564832, 'cat417': 1019371, 'cat418': 1966366, 'cat419': 1597250, 'cat420': 1044404, 'cat421': 1673385, 'cat422': 924858, 'cat423': 1438534, 'cat424': 787591, 'cat425': 1373751, 'cat426': 755059, 'cat427': 1322847, 'cat428': 1427188, 'cat429': 1162657, 'cat430': 652140, 'cat431': 1907514, 'cat432': 1004656, 'cat433': 1398291, 'cat434': 653345, 'cat435': 946042, 'cat436': 1903984, 'cat437': 1134975, 'cat438': 756586, 'cat439': 823898, 'cat440': 1849429, 'cat441': 1884658, 'cat442': 1267942, 'cat443': 799848, 'cat444': 1030805, 'cat445': 787843, 'cat446': 1480913, 'cat447': 960509, 'cat448': 697395, 'cat449': 1335205, 'cat450': 1521859, 'cat451': 841406, 'cat452': 1900547, 'cat453': 969158, 'cat454': 838618, 'cat455': 1981267, 'cat456': 1404966, 'cat457': 1581302, 'cat458': 1346850, 'cat459': 1211178, 'cat460': 1383480, 'cat461': 910506, 'cat462': 1247875, 'cat463': 1167996, 'cat464': 693344, 'cat465': 1267458, 'cat466': 540858, 'cat467': 1208794, 'cat468': 1661927, 'cat469': 1461902, 'cat470': 1423707, 'cat471': 1974615, 'cat472': 537920, 'cat473': 1306028, 'cat474': 1195201, 'cat475': 1585137, 'cat476': 1808469, 'cat477': 1119612, 'cat478': 1574291, 'cat479': 634827, 'cat480': 736663, 'cat481': 979312, 'cat482': 719738, 'cat483': 676289, 'cat484': 1056928, 'cat485': 1070258, 'cat486': 583022, 'cat487': 880740, 'cat488': 1067166, 'cat489': 771697, 'cat490': 1385531, 'cat491': 1917619, 'cat492': 1042342, 'cat493': 1351334, 'cat494': 813247, 'cat495': 1625329, 'cat496': 1579577, 'cat497': 1696624, 'cat498': 1537276, 'cat499': 1968881, 'cat500': 1185870, 'cat501': 687615, 'cat502': 1085236, 'cat503': 620641, 'cat504': 1943271, 'cat505': 884500, 'cat506': 1391954, 'cat507': 651863, 'cat508': 1063973, 'cat509': 535298, 'cat510': 1830516, 'cat511': 685736, 'cat512': 1046416, 'cat513': 675620, 'cat514': 1775440, 'cat515': 966423, 'cat516': 639717, 'cat517': 1054593, 'cat518': 755176, 'cat519': 1451632, 'cat520': 524214, 'cat521': 1211252, 'cat522': 1659859, 'cat523': 1376106, 'cat524': 1061742, 'cat525': 1803806, 'cat526': 771004, 'cat527': 590608, 'cat528': 1605020, 'cat529': 1988006, 'cat530': 1000036, 'cat531': 729536, 'cat532': 838583, 'cat533': 1049234, 'cat534': 605653, 'cat535': 879890, 'cat536': 923138, 'cat537': 1154295, 'cat538': 1818418, 'cat539': 1139642, 'cat540': 1613767, 'cat541': 931743, 'cat542': 1108091, 'cat543': 1434673, 'cat544': 1548761, 'cat545': 1909615, 'cat546': 873083, 'cat547': 1067326, 'cat548': 1227713, 'cat549': 538090, 'cat550': 1025229, 'cat551': 577488, 'cat552': 532183, 'cat553': 538658, 'cat554': 1560433, 'cat555': 1655633, 'cat556': 897319, 'cat557': 1578428, 'cat558': 1495645, 'cat559': 1015227, 'cat560': 1437543, 'cat561': 722888, 'cat562': 1880597, 'cat563': 1863371, 'cat564': 1406342, 'cat565': 1876800, 'cat566': 1538093, 'cat567': 1644848, 'cat568': 1324361, 'cat569': 1562597, 'cat570': 1145467, 'cat571': 1942298, 'cat572': 951267, 'cat573': 981435, 'cat574': 1218703, 'cat575': 916545, 'cat576': 1982110, 'cat577': 1833740, 'cat578': 793011, 'cat579': 1348712, 'cat580': 1228869, 'cat581': 614061, 'cat582': 772249, 'cat583': 529895, 'cat584': 648316, 'cat585': 1811660, 'cat586': 1036019, 'cat587': 1403328, 'cat588': 842352, 'cat589': 616184, 'cat590': 677177, 'cat591': 1895083, 'cat592': 1298766, 'cat593': 1561039, 'cat594': 1906231, 'cat595': 1091256, 'cat596': 1755729, 'cat597': 1007957, 'cat598': 1952666, 'cat599': 1114588, 'cat600': 594869, 'cat601': 1463542, 'cat602': 888711, 'cat603': 830370, 'cat604': 1064210, 'cat605': 1434961, 'cat606': 507597, 'cat607': 1052060, 'cat608': 1263659, 'cat609': 1189808, 'cat610': 1647296, 'cat611': 1178499, 'cat612': 1012641, 'cat613': 572240, 'cat614': 1149169, 'cat615': 956897, 'cat616': 1247810, 'cat617': 883690, 'cat618': 502241, 'cat619': 1203243, 'cat620': 1300329, 'cat621': 675931, 'cat622': 1495399, 'cat623': 1084956, 'cat624': 1554372, 'cat625': 1875769, 'cat626': 921485, 'cat627': 1020469, 'cat628': 1558507, 'cat629': 510382, 'cat630': 690529, 'cat631': 1054001, 'cat632': 688226, 'cat633': 801707, 'cat634': 1337835, 'cat635': 1730610, 'cat636': 587381, 'cat637': 1326233, 'cat638': 547173, 'cat639': 1128403, 'cat640': 1138047, 'cat641': 1820513, 'cat642': 988237, 'cat643': 677173, 'cat644': 1728057, 'cat645': 1609790, 'cat646': 825587, 'cat647': 1878969, 'cat648': 1751075, 'cat649': 1316875, 'cat650': 1183954, 'cat651': 1536393, 'cat652': 813446, 'cat653': 1095961, 'cat654': 1797522, 'cat655': 1848928, 'cat656': 803567, 'cat657': 591830, 'cat658': 1999486, 'cat659': 1575799, 'cat660': 1815610, 'cat661': 1400191, 'cat662': 1970215, 'cat663': 1560196, 'cat664': 792149, 'cat665': 1598398, 'cat666': 1557742, 'cat667': 1692187, 'cat668': 533721, 'cat669': 1939635, 'cat670': 1724865, 'cat671': 1991465, 'cat672': 1932135, 'cat673': 1954010, 'cat674': 1848237, 'cat675': 982220, 'cat676': 678450, 'cat677': 565348, 'cat678': 587791, 'cat679': 779116, 'cat680': 1836137, 'cat681': 1256458, 'cat682': 720025, 'cat683': 1289825, 'cat684': 1446625, 'cat685': 1671316, 'cat686': 606494, 'cat687': 1816523, 'cat688': 539511, 'cat689': 1813292, 'cat690': 1614518, 'cat691': 1927456, 'cat692': 1012878, 'cat693': 1526124, 'cat694': 1053213, 'cat695': 506950, 'cat696': 1458290, 'cat697': 647034, 'cat698': 1554806, 'cat699': 1622395, 'cat700': 692816, 'cat701': 1882651, 'cat702': 1603081, 'cat703': 638517, 'cat704': 1493753, 'cat705': 1028888, 'cat706': 656132, 'cat707': 1056915, 'cat708': 992381, 'cat709': 930372, 'cat710': 983888, 'cat711': 1863006, 'cat712': 1465403, 'cat713': 1535885, 'cat714': 1302287, 'cat715': 660935, 'cat716': 1504557, 'cat717': 1933815, 'cat718': 1102551, 'cat719': 598036, 'cat720': 1793888, 'cat721': 1827062, 'cat722': 1847971, 'cat723': 915844, 'cat724': 662470, 'cat725': 1757673, 'cat726': 809172, 'cat727': 1195778, 'cat728': 1032551, 'cat729': 1866366, 'cat730': 1953088, 'cat731': 1138409, 'cat732': 1802647, 'cat733': 1690682, 'cat734': 779846, 'cat735': 526149, 'cat736': 1511709, 'cat737': 627215, 'cat738': 1518792, 'cat739': 1063657, 'cat740': 1909289, 'cat741': 708706, 'cat742': 1951617, 'cat743': 956536, 'cat744': 1917061, 'cat745': 1526795, 'cat746': 1109970, 'cat747': 1986610, 'cat748': 1583253, 'cat749': 1098828, 'cat750': 1474468, 'cat751': 1477058, 'cat752': 1477984, 'cat753': 748518, 'cat754': 1651496, 'cat755': 917857, 'cat756': 1153629, 'cat757': 680049, 'cat758': 1491837, 'cat759': 536709, 'cat760': 1107311, 'cat761': 1462531, 'cat762': 660357, 'cat763': 1562456, 'cat764': 1442566, 'cat765': 1063415, 'cat766': 1311279, 'cat767': 940060, 'cat768': 941888, 'cat769': 656474, 'cat770': 1719435, 'cat771': 689379, 'cat772': 797251, 'cat773': 1599045, 'cat774': 1049053, 'cat775': 1254038, 'cat776': 778092, 'cat777': 1765349, 'cat778': 1824704, 'cat779': 1566914, 'cat780': 1086296, 'cat781': 736301, 'cat782': 1975004, 'cat783': 1265855, 'cat784': 985246, 'cat785': 1544146, 'cat786': 1519510, 'cat787': 1326446, 'cat788': 552081, 'cat789': 833584, 'cat790': 507529, 'cat791': 1531161, 'cat792': 1929393, 'cat793': 1445312, 'cat794': 1350225, 'cat795': 1133237, 'cat796': 795084, 'cat797': 1372795, 'cat798': 1221336, 'cat799': 1288751, 'cat800': 1162863, 'cat801': 753565, 'cat802': 1194837, 'cat803': 503651, 'cat804': 1180625, 'cat805': 1209409, 'cat806': 1335210, 'cat807': 751744, 'cat808': 910498, 'cat809': 1995318, 'cat810': 524582, 'cat811': 1107823, 'cat812': 1031024, 'cat813': 1280606, 'cat814': 636267, 'cat815': 1323969, 'cat816': 1318227, 'cat817': 1735593, 'cat818': 660223, 'cat819': 1256462, 'cat820': 1397691, 'cat821': 1077042, 'cat822': 601225, 'cat823': 1088539, 'cat824': 713301, 'cat825': 608249, 'cat826': 1888268, 'cat827': 1098995, 'cat828': 1831614, 'cat829': 812296, 'cat830': 1022871, 'cat831': 1057272, 'cat832': 1414863, 'cat833': 1571567, 'cat834': 1161864, 'cat835': 898142, 'cat836': 1282970, 'cat837': 1397051, 'cat838': 560841, 'cat839': 1823084, 'cat840': 1338948, 'cat841': 1662143, 'cat842': 1651814, 'cat843': 926635, 'cat844': 668982, 'cat845': 603758, 'cat846': 1361690, 'cat847': 1445522, 'cat848': 1789568, 'cat849': 790607, 'cat850': 1851594, 'cat851': 1100222, 'cat852': 1518324, 'cat853': 602712, 'cat854': 1653660, 'cat855': 766991, 'cat856': 858115, 'cat857': 1490241, 'cat858': 1370038, 'cat859': 1220713, 'cat860': 1090864, 'cat861': 1124472, 'cat862': 1036331, 'cat863': 1869058, 'cat864': 1045614, 'cat865': 1351882, 'cat866': 1875721, 'cat867': 1000516, 'cat868': 1130899, 'cat869': 1513306, 'cat870': 1668788, 'cat871': 1902734, 'cat872': 1327049, 'cat873': 751118, 'cat874': 850921, 'cat875': 1848898, 'cat876': 839019, 'cat877': 657644, 'cat878': 935940, 'cat879': 1549844, 'cat880': 1542443, 'cat881': 1654244, 'cat882': 961426, 'cat883': 1449980, 'cat884': 1198004, 'cat885': 1443635, 'cat886': 1396371, 'cat887': 792754, 'cat888': 1648789, 'cat889': 903506, 'cat890': 1011884, 'cat891': 690242, 'cat892': 866362, 'cat893': 1217132, 'cat894': 1665752, 'cat895': 691038, 'cat896': 1169594, 'cat897': 1001484, 'cat898': 1272392, 'cat899': 1041815, 'cat900': 1694574, 'cat901': 923922, 'cat902': 542114, 'cat903': 1365665, 'cat904': 1302869, 'cat905': 1367976, 'cat906': 1599260, 'cat907': 940412, 'cat908': 1290344, 'cat909': 1066734, 'cat910': 1209263, 'cat911': 630148, 'cat912': 1544687, 'cat913': 1081993, 'cat914': 1704354, 'cat915': 1255279, 'cat916': 763976, 'cat917': 1940225, 'cat918': 1555697, 'cat919': 1609866, 'cat920': 1820423, 'cat921': 952907, 'cat922': 694192, 'cat923': 1068371, 'cat924': 1021045, 'cat925': 1306482, 'cat926': 1338351, 'cat927': 1854323, 'cat928': 1435032, 'cat929': 1405627, 'cat930': 1154345, 'cat931': 545739, 'cat932': 766856, 'cat933': 567619, 'cat934': 1391708, 'cat935': 1987954, 'cat936': 1492515, 'cat937': 1731398, 'cat938': 1527236, 'cat939': 500374, 'cat940': 653381, 'cat941': 1321079, 'cat942': 1607005, 'cat943': 1481784, 'cat944': 1441517, 'cat945': 1021069, 'cat946': 728686, 'cat947': 969343, 'cat948': 823754, 'cat949': 818910, 'cat950': 1595480, 'cat951': 1930415, 'cat952': 728359, 'cat953': 1970111, 'cat954': 1857587, 'cat955': 1459080, 'cat956': 678265, 'cat957': 1656581, 'cat958': 582934, 'cat959': 502864, 'cat960': 763510, 'cat961': 987748, 'cat962': 1694081, 'cat963': 578835, 'cat964': 1853723, 'cat965': 1999509, 'cat966': 1137077, 'cat967': 768365, 'cat968': 1813808, 'cat969': 1028050, 'cat970': 1607827, 'cat971': 1834399, 'cat972': 1417358, 'cat973': 1965032, 'cat974': 735159, 'cat975': 708550, 'cat976': 647538, 'cat977': 1129878, 'cat978': 1599823, 'cat979': 1722410, 'cat980': 902027, 'cat981': 1313866, 'cat982': 1047108, 'cat983': 968887, 'cat984': 1760516, 'cat985': 502415, 'cat986': 521939, 'cat987': 1627169, 'cat988': 1132335, 'cat989': 1466138, 'cat990': 1084274, 'cat991': 1163448, 'cat992': 1851773, 'cat993': 1008260, 'cat994': 1496785, 'cat995': 1603684, 'cat996': 992344, 'cat997': 1647146, 'cat998': 1018118, 'cat999': 561406}], ['entropy']],)

Expected Output: [6.844374138]

Explanation: One block of ~1000 categories totalling ~1.23e9 labels; entropy stable to 1e-6 via compensated accumulation.

Hints

  1. Reuse Part 1's invariant S = sum_i c_i ln c_i with total n; entropy is still ln(n) - S/n.
  2. Folding (label, k) in a block is the same delta as a single update generalized: S += (c+k)ln(c+k) - c ln c (only the old term if c>0), and n += k. That's O(1) per (label, k) pair, so O(k) per block — no rescan of other categories.
  3. Keep the same Neumaier-compensated accumulator for S across both single and block updates so a billion-label total still matches the reference to 1e-6.
Last updated: Jun 24, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Infection Spread Simulation with Death Threshold - OpenAI (medium)
  • Spreading Contagion on a Grid - OpenAI (medium)
  • Implement a Distributed Rate Limiter - OpenAI (medium)
  • Compute Plant Infection Time - OpenAI (medium)
  • Implement IP Address Arithmetic - OpenAI (hard)