PracHub
QuestionsCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates understanding of tree and graph concepts, strict string parsing and validation, duplicate and cycle detection, and the construction of canonical S-expressions within the Coding & Algorithms category.

  • medium
  • Optiver
  • Coding & Algorithms
  • Software Engineer

Build and Validate a Binary Tree from Parent-Child Pairs

Company: Optiver

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Take-home Project

## Build and Validate a Binary Tree from Parent-Child Pairs You are given a single input string that is supposed to describe a binary tree as a sequence of parent-child pairs, for example: ``` (A,B) (B,C) (A,D) ``` Each pair has the form `(P,C)`, meaning node `P` is the parent of node `C`. Node values are single uppercase English letters `A`–`Z`. Your task is to validate the input and, if it describes a valid binary tree, print the tree's S-expression. If the input is invalid, print the highest-priority error code instead. ### Input format (strict) A *well-formed* input string consists of one or more pairs, where: - Each pair is exactly `(` + one uppercase letter + `,` + one uppercase letter + `)` — no spaces inside a pair. - Consecutive pairs are separated by exactly one space. - There is no leading or trailing whitespace, and no other characters appear anywhere. The input may be **any** arbitrary string, so it must be checked carefully against this format. Any deviation from the format above (wrong characters, lowercase letters, multi-character node names, missing parentheses or commas, extra/missing spaces, empty string, etc.) is a format error. ### Errors After format validation, check the described graph. Report exactly one error code — if several errors are present simultaneously, report the one with the highest priority. `E1` is the highest priority and `E5` is the lowest: - **E1 — Invalid Input String**: the input string does not match the strict format described above. - **E2 — Duplicate Pair**: the exact same pair `(P,C)` appears more than once. - **E3 — Parent Has More than 2 Children**: some node appears as the parent in more than two (distinct) pairs. - **E4 — Multiple Roots**: more than one node has no parent. - **E5 — Cycle In The Tree**: the edges do not form a valid tree because some node can be reached in more than one way — there is a directed cycle (including a self-pair such as `(A,A)`), or some node has more than one parent. If any error applies, output only its code, e.g.: ``` E3 ``` ### Output for valid input If the input passes all checks, it describes a single binary tree. Print its S-expression, defined recursively as: ``` S(node) = "(" + node value + S(first child) + S(second child) + ")" ``` where a missing child contributes the empty string, and the children of each node are ordered so that the smaller letter (lexicographically) comes first. There are no spaces in the output. ### Examples **Example 1** ``` Input: (A,B) (B,C) (A,D) Output: (A(B(C))(D)) ``` `A` is the root with children `B` and `D` (`B` printed first); `B` has one child `C`. **Example 2** ``` Input: (A,B) (A,C) (B,D) (D,C) Output: E5 ``` `C` has two parents (`A` and `D`), so the structure is not a valid tree. **Example 3** ``` Input: (A,B) (A,C) (B,D) (A,E) Output: E3 ``` `A` has three children. **Example 4** ``` Input: (A,B) (b,C) Output: E1 ``` Lowercase `b` violates the input format. **Example 5** ``` Input: (A,B) (A,B) (B,C) Output: E2 ``` The pair `(A,B)` appears twice. ### Constraints - The input string length is at most $10^4$ characters. - Node values are limited to the 26 uppercase letters, so a valid tree has at most 26 nodes. - Your solution must return the error checks in strict priority order: `E1 > E2 > E3 > E4 > E5`.

Quick Answer: This question evaluates understanding of tree and graph concepts, strict string parsing and validation, duplicate and cycle detection, and the construction of canonical S-expressions within the Coding & Algorithms category.

You are given a single input string that is supposed to describe a binary tree as a sequence of parent-child pairs, for example `(A,B) (B,C) (A,D)`. Each pair `(P,C)` means node `P` is the parent of node `C`; node values are single uppercase letters `A`-`Z`. Return the tree's S-expression if the input is a valid binary tree, otherwise return the highest-priority error code. Strict input format: one or more pairs, each exactly `(` + uppercase letter + `,` + uppercase letter + `)`, consecutive pairs separated by exactly one space, no leading/trailing whitespace, and no other characters anywhere. Any deviation is a format error. Report exactly one error code, in strict priority order E1 > E2 > E3 > E4 > E5: - E1 - Invalid Input String: the input does not match the strict format. - E2 - Duplicate Pair: the exact same pair `(P,C)` appears more than once. - E3 - Parent Has More than 2 Children: some node is the parent in more than two distinct pairs. - E4 - Multiple Roots: more than one node has no parent. - E5 - Cycle In The Tree: some node has more than one parent, or the edges contain a directed cycle (including a self-pair such as `(A,A)`). For valid input, return the S-expression S(node) = "(" + node value + S(first child) + S(second child) + ")", where a missing child contributes the empty string and children are ordered so the smaller letter comes first. There are no spaces in the output. Example: `(A,B) (B,C) (A,D)` -> `(A(B(C))(D))`.

Constraints

  • The input string length is at most 10^4 characters.
  • Node values are limited to the 26 uppercase letters A-Z, so a valid tree has at most 26 nodes.
  • The input may be any arbitrary string and must be validated strictly against the format.
  • Exactly one output is produced: either the S-expression or a single error code.
  • Errors are reported in strict priority order E1 > E2 > E3 > E4 > E5.

Examples

Input: ('(A,B) (B,C) (A,D)',)

Expected Output: '(A(B(C))(D))'

Explanation: Valid tree: root A has children B and D (B first, smaller letter); B has child C.

Input: ('(A,B) (A,C) (B,D) (D,C)',)

Expected Output: 'E5'

Explanation: C has two parents (A and D), so the graph is not a valid tree.

Input: ('(A,B) (A,C) (B,D) (A,E)',)

Expected Output: 'E3'

Explanation: A appears as parent of B, C and E - more than two children.

Input: ('(A,B) (b,C)',)

Expected Output: 'E1'

Explanation: Lowercase 'b' violates the strict input format.

Input: ('(A,B) (A,B) (B,C)',)

Expected Output: 'E2'

Explanation: The pair (A,B) appears twice - a duplicate pair.

Input: ('(A,B) (C,D)',)

Expected Output: 'E4'

Explanation: Two disjoint trees: both A and C have no parent, so there are multiple roots.

Input: ('(A,A)',)

Expected Output: 'E5'

Explanation: A self-pair (A,A) is a directed cycle.

Input: ('(A,B) (B,A)',)

Expected Output: 'E5'

Explanation: A->B->A is a directed cycle; no node has in-degree 0.

Input: ('(A,B)',)

Expected Output: '(A(B))'

Explanation: Single edge: root A with one child B.

Input: ('',)

Expected Output: 'E1'

Explanation: Empty string has no pairs - invalid format.

Input: ('(A,B) (B,C) (C,D) (D,E) (E,B)',)

Expected Output: 'E5'

Explanation: B has parents A and E - more than one parent (and a cycle B->C->D->E->B).

Input: ('(A,B) (A,C)',)

Expected Output: '(A(B)(C))'

Explanation: Root A has exactly two children B and C, printed in lexicographic order.

Hints

  1. Validate the whole string first with a strict pattern (one or more `(X,Y)` pairs separated by single spaces) before parsing anything - any mismatch is E1.
  2. Process the checks strictly in priority order and return on the first that applies: duplicate pair (E2), then out-degree > 2 (E3), then count of parent-less nodes (E4), then multi-parent or cycle (E5).
  3. A node has no parent when its in-degree is 0. More than one such node means multiple roots (E4); zero such nodes means every node has an incoming edge, which forces a cycle (E5).
  4. With exactly one root and every in-degree <= 1, a directed cycle would leave some nodes unreachable from the root - traverse from the root and compare the visited count to the total node count.
  5. To print the S-expression, sort each node's children so the smaller letter is emitted first, then recurse: `"(" + value + build(child1) + build(child2) + ")"`.
Last updated: Jul 2, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Days Between Two Calendar Dates - Optiver (medium)
  • Thread-Safe Stock Inventory: Buy and Sell Without Overselling - Optiver (medium)
  • Find missing numbers in sequences - Optiver (hard)
  • Design a circular queue data structure - Optiver (medium)
  • Optimize flight and cargo bookings for profit - Optiver (hard)