PracHub
QuestionsPremiumLearningGuidesCheatsheetNEW
|Home/Coding & Algorithms/Two Sigma

Implement binary encode/decode with frequency tree

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of frequency-based binary encoding and decoding, including deterministic tree construction, tie-breaking rules, and traversal, and tests knowledge of trees, priority queues, greedy algorithms, and bit-level string manipulation in the Coding & Algorithms domain.

  • hard
  • Two Sigma
  • Coding & Algorithms
  • Software Engineer

Implement binary encode/decode with frequency tree

Company: Two Sigma

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

## Problem Implement a **binary encoding** and **decoding** scheme for strings based on character frequencies (Huffman-like coding). You must implement two functions: 1. `encode(s)` → returns `(bitstring, tree)` 2. `decode(bitstring, tree)` → returns the original string ### Encoding rules Given an input string `s`: 1. Count the frequency of each character. 2. Create a leaf node for each distinct character with weight = its frequency. 3. Build a binary tree by repeatedly combining the **two nodes with the smallest weights** into a new parent node: - Parent weight = sum of the two child weights. - The two chosen nodes become the parent’s left and right children. 4. Assign edge labels: - left edge = `0` - right edge = `1` 5. The **binary representation** (code) of a character is the sequence of edge labels along the path from the root to that character’s leaf. 6. The encoded output `bitstring` is the concatenation of the codes for each character in `s`. ### Determinism / tie-breaking To make the encoding deterministic when multiple nodes have equal weight, use this tie-break rule: - When choosing the two minimum-weight nodes, sort candidates by `(weight, key)`. - For a **leaf**, `key = character` (lexicographic order). - For an **internal node**, `key = the smallest character contained in its subtree`. - When attaching the two chosen nodes as children, the smaller `(weight, key)` becomes the **left** child (gets edge `0`). ### Decoding rules Given `(bitstring, tree)`, decode by traversing the tree from the root: - Read bits left-to-right. - Bit `0` means go to the left child; bit `1` means go to the right child. - When you reach a leaf, output its character and reset to the root. ### Inputs / outputs - Input string `s` consists of arbitrary ASCII characters. - Return the encoded bitstring as a string of `'0'` and `'1'`. ### Edge cases - If `s` is empty, encoding returns `("", null)` and decoding returns `""`. - If `s` contains only one distinct character (e.g., `"aaaa"`), assign that character the code `"0"` (so encoding is `"0000"`). ### Constraints - `0 ≤ |s| ≤ 2 * 10^5` - Number of distinct characters `σ ≤ 256` - Aim for `O(|s| + σ log σ)` time and `O(σ)` extra space (excluding output).

Quick Answer: This question evaluates understanding of frequency-based binary encoding and decoding, including deterministic tree construction, tie-breaking rules, and traversal, and tests knowledge of trees, priority queues, greedy algorithms, and bit-level string manipulation in the Coding & Algorithms domain.

Related Interview Questions

  • Compute Piecewise Linear Interpolation - Two Sigma (medium)
  • Implement an In-Memory Database - Two Sigma (hard)
  • Merge two sorted linked lists - Two Sigma (hard)
  • Merge Two Sorted Lists - Two Sigma (hard)
  • Evaluate Piecewise Linear Function - Two Sigma (medium)
Two Sigma logo
Two Sigma
Dec 15, 2025, 12:00 AM
Software Engineer
Technical Screen
Coding & Algorithms
5
0

Problem

Implement a binary encoding and decoding scheme for strings based on character frequencies (Huffman-like coding).

You must implement two functions:

  1. encode(s) → returns (bitstring, tree)
  2. decode(bitstring, tree) → returns the original string

Encoding rules

Given an input string s:

  1. Count the frequency of each character.
  2. Create a leaf node for each distinct character with weight = its frequency.
  3. Build a binary tree by repeatedly combining the two nodes with the smallest weights into a new parent node:
    • Parent weight = sum of the two child weights.
    • The two chosen nodes become the parent’s left and right children.
  4. Assign edge labels:
    • left edge = 0
    • right edge = 1
  5. The binary representation (code) of a character is the sequence of edge labels along the path from the root to that character’s leaf.
  6. The encoded output bitstring is the concatenation of the codes for each character in s .

Determinism / tie-breaking

To make the encoding deterministic when multiple nodes have equal weight, use this tie-break rule:

  • When choosing the two minimum-weight nodes, sort candidates by (weight, key) .
  • For a leaf , key = character (lexicographic order).
  • For an internal node , key = the smallest character contained in its subtree .
  • When attaching the two chosen nodes as children, the smaller (weight, key) becomes the left child (gets edge 0 ).

Decoding rules

Given (bitstring, tree), decode by traversing the tree from the root:

  • Read bits left-to-right.
  • Bit 0 means go to the left child; bit 1 means go to the right child.
  • When you reach a leaf, output its character and reset to the root.

Inputs / outputs

  • Input string s consists of arbitrary ASCII characters.
  • Return the encoded bitstring as a string of '0' and '1' .

Edge cases

  • If s is empty, encoding returns ("", null) and decoding returns "" .
  • If s contains only one distinct character (e.g., "aaaa" ), assign that character the code "0" (so encoding is "0000" ).

Constraints

  • 0 ≤ |s| ≤ 2 * 10^5
  • Number of distinct characters σ ≤ 256
  • Aim for O(|s| + σ log σ) time and O(σ) extra space (excluding output).

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Two Sigma•More Software Engineer•Two Sigma Software Engineer•Two Sigma Coding & Algorithms•Software Engineer Coding & Algorithms
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.