PracHub

Implement XML tokenizer and parser with operations

Last updated: Mar 29, 2026

Quick Overview

This interview question tests algorithm design, data structures, correctness, complexity analysis, edge-case handling, and implementation detail in a realistic interview setting. A strong answer to "Implement XML tokenizer and parser with operations" states its assumptions, handles edge cases, explains trade-offs, and shows how to validate the result.


Company: Snapchat

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

You are given either (a) a raw XML-like string such as <catalog><book><author>Gambardella, Matthew</author></book></catalog> or (b) its tokenized form as a list of dictionaries like [{'text': 'catalog', 'token_type': 'open_tag'}, {'text': 'book', 'token_type': 'open_tag'}, {'text': 'author', 'token_type': 'open_tag'}, {'text': 'Gambardella, Matthew', 'token_type': 'raw_text'}, {'text': 'author', 'token_type': 'close_tag'}, {'text': 'book', 'token_type': 'close_tag'}, {'text': 'catalog', 'token_type': 'close_tag'}]. Implement:

  1. tokenize(xml_str) -> list[dict] that emits tokens where token_type ∈ {open_tag, close_tag, raw_text};
  2. class XMLParser with __init__(tokens: list[dict]) that validates the structure and raises an exception for malformed input;
  3. to_string()/__str__() that reconstructs the original XML;
  4. add_element(path: list[str], tag: str, text: str|null, index: int|null) to insert a new element under the node identified by path;
  5. remove_element(path: list[str]) to delete a node;
  6. traverse_iterative() that performs an iterative DFS (no recursion) and yields nodes in preorder.

Constraints and requirements:

  • use a linear scan with a stack for validation;
  • overall validation should be O(n) time and O(h) space, where h is the tree height;
  • tags have no attributes, and text may contain any characters except '<' and '>';
  • handle edge cases such as mismatched or out-of-order closing tags, unclosed tags at EOF, empty token lists, and extraneous raw_text between sibling tags.

Describe your data structures, algorithms, and time/space complexity for each method.
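As an illustration of item 1, here is a minimal sketch of tokenize() as a single left-to-right scan. Some choices are assumptions that the prompt does not settle: malformed input raises ValueError, self-closing tags such as <a/> are rejected, and whitespace-only text is emitted as raw_text rather than dropped.

```python
def tokenize(xml_str: str) -> list[dict]:
    """Split an XML-like string into open_tag / close_tag / raw_text tokens."""
    tokens = []
    i, n = 0, len(xml_str)
    while i < n:
        if xml_str[i] == "<":
            end = xml_str.find(">", i + 1)
            if end == -1:
                raise ValueError(f"unterminated tag at index {i}")
            body = xml_str[i + 1:end]
            if body.startswith("/"):
                kind, name = "close_tag", body[1:]
            else:
                kind, name = "open_tag", body
            # Assumption: empty names, nested '<', and '/' inside a name are errors.
            if not name or "<" in name or "/" in name:
                raise ValueError(f"bad tag at index {i}: <{body}>")
            tokens.append({"text": name, "token_type": kind})
            i = end + 1
        else:
            end = xml_str.find("<", i)
            if end == -1:
                end = n
            text = xml_str[i:end]
            if ">" in text:
                raise ValueError(f"stray '>' in text at index {i}")
            tokens.append({"text": text, "token_type": "raw_text"})
            i = end
    return tokens
```

Each character is visited a constant number of times, so the scan is O(n). Run on the example string from the prompt, this sketch produces the seven tokens listed there.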


Solution

# Solution Alignment

The prompt asks for an implementation-level answer. The safest way to present it is to define the state, maintain clear invariants, then walk through complexity and tests.

## Problem Restatement

Given a raw XML-like string or its tokenized form (a list of dictionaries with text and token_type fields), implement tokenize(xml_str), an XMLParser class whose constructor validates the token structure and raises an exception for malformed input, to_string()/__str__(), add_element(), remove_element(), and traverse_iterative().

## Recommended Approach

Tokenize with a single left-to-right scan: the characters between '<' and '>' form a tag name (a leading '/' marks a close_tag), and everything else is raw_text. Validate with a stack: push on every open_tag, and on every close_tag require that the top of the stack carries the same name before popping it. The tree can be built in the same pass by attaching each new node to the node on top of the stack. DFS is the natural traversal for reconstruction and for traverse_iterative(), because preorder DFS visits nodes in document order; use an explicit stack instead of recursion, as the prompt requires.

## Correctness

The invariant is that after each token the stack holds exactly the currently open tags, outermost first. A close_tag is valid only if it matches the top of the stack, and the input is well formed only if the stack is empty at the end. For the mutating operations, add_element() and remove_element() must leave the tree in a state that to_string() serializes to well-formed XML.

## Complexity

Tokenization and validation are O(n) time. Validation alone needs O(h) space for the stack, where h is the tree height; keeping the parsed tree takes O(n) space. to_string() and traverse_iterative() are O(n) time. Their explicit stack is O(n) in the worst case if all children of a node are pushed at once, and O(h) if each stack entry is a child iterator instead. add_element() and remove_element() cost the time to walk the path plus a list insertion or removal among the siblings.

## Edge Cases and Tests

Mismatched or out-of-order closing tags, unclosed tags at EOF, empty token lists, extraneous raw_text between sibling tags, a document with a single element, and a deeply nested (skewed) document that would exhaust the call stack in a recursive traversal.
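One way the XMLParser operations could fit together is sketched below. It is not a reference solution, and several details are assumptions that the prompt leaves open: the Node class and the _find helper are hypothetical names, a document has exactly one root, an empty token list is rejected, path starts at the root tag and picks the first matching child at each level, index counts positions among all children (text included), removing the root is refused, and text between sibling tags is kept as a text child so that to_string() round-trips.

```python
class Node:
    """Element node; children are Node objects or raw-text strings."""
    def __init__(self, tag):
        self.tag = tag
        self.children = []


class XMLParser:
    def __init__(self, tokens):
        self.root = None
        stack = []  # currently open elements, outermost first
        for tok in tokens:
            kind, text = tok["token_type"], tok["text"]
            if kind == "open_tag":
                node = Node(text)
                if stack:
                    stack[-1].children.append(node)
                elif self.root is None:
                    self.root = node
                else:
                    raise ValueError("more than one root element")
                stack.append(node)
            elif kind == "close_tag":
                if not stack or stack[-1].tag != text:
                    raise ValueError(f"unexpected </{text}>")
                stack.pop()
            elif kind == "raw_text":
                if not stack:
                    raise ValueError("text outside the root element")
                stack[-1].children.append(text)
            else:
                raise ValueError(f"unknown token_type: {kind!r}")
        if stack:
            raise ValueError(f"unclosed <{stack[-1].tag}>")
        if self.root is None:
            raise ValueError("empty token list")  # assumption: treated as malformed

    def _find(self, path):
        """Return (parent, node) for path; the first matching child wins."""
        if not path or path[0] != self.root.tag:
            raise KeyError(f"path not found: {path}")
        parent, node = None, self.root
        for tag in path[1:]:
            parent = node
            node = next((c for c in parent.children
                         if isinstance(c, Node) and c.tag == tag), None)
            if node is None:
                raise KeyError(f"path not found: {path}")
        return parent, node

    def add_element(self, path, tag, text=None, index=None):
        _, target = self._find(path)
        new = Node(tag)
        if text is not None:
            new.children.append(text)
        pos = len(target.children) if index is None else index
        target.children.insert(pos, new)

    def remove_element(self, path):
        parent, node = self._find(path)
        if parent is None:
            raise ValueError("removing the root is not supported in this sketch")
        parent.children.remove(node)

    def traverse_iterative(self):
        stack = [self.root]
        while stack:
            node = stack.pop()
            yield node
            # Push children right-to-left so the leftmost is popped first.
            stack.extend(c for c in reversed(node.children)
                         if isinstance(c, Node))

    def to_string(self):
        out, stack = [], [self.root]
        while stack:
            item = stack.pop()
            if isinstance(item, Node):
                out.append(f"<{item.tag}>")
                stack.append(f"</{item.tag}>")  # emitted after the children
                stack.extend(reversed(item.children))
            else:
                out.append(item)  # raw text or a pending closing tag
        return "".join(out)

    __str__ = to_string
```

Pushing the closing tag as a plain string before the children lets to_string() serialize the tree without recursion. Because this version keeps the whole tree, it uses O(n) space; the O(h) bound in the prompt applies to validation on its own.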

Related Interview Questions

  • Determine Whether Courses Can Be Completed - Snapchat (medium)
  • Solve Decimal Coin Change - Snapchat (medium)
  • Find Maximum Island Perimeter - Snapchat (medium)
  • Solve Three Algorithmic Tasks - Snapchat (hard)
  • Implement a Timestamped Counter - Snapchat (medium)
Aug 10, 2025, 12:00 AM

Constraints & Assumptions

  • Preserve the scope, facts, inputs, and requested outputs from the prompt above.
  • If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
  • Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

  • Clarify input sizes, value ranges, mutability, return format, and tie-breaking.
  • State the target time and space complexity before coding.
  • Call out edge cases such as empty inputs, duplicates, invalid values, overflow, and boundary sizes.

What a Strong Answer Covers

  • A clear algorithm with the right data structures and enough pseudocode or code-level detail to implement it.
  • A correctness argument that explains why the algorithm covers all required cases.
  • Time and space complexity, plus at least one alternative approach when relevant.
  • Focused tests for normal cases, edge cases, and failure modes.
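For this question, the correctness and testing points above can be made concrete with a validation-only check and a small table of cases. The sketch below is one possible shape, not a prescribed one: is_well_formed and the toks helper are hypothetical names, and the verdicts for an empty token list (rejected) and for text between sibling tags (accepted) are assumptions to confirm with the interviewer.

```python
def is_well_formed(tokens):
    """Stack-based check: O(n) time, O(h) extra space, no tree is built."""
    stack = []          # names of the currently open tags
    seen_root = False
    for tok in tokens:
        kind, text = tok["token_type"], tok["text"]
        if kind == "open_tag":
            if not stack and seen_root:
                return False        # second root element
            seen_root = True
            stack.append(text)
        elif kind == "close_tag":
            if not stack or stack.pop() != text:
                return False        # stray or mismatched closing tag
        elif kind == "raw_text":
            if not stack:
                return False        # text outside the root
        else:
            return False            # unknown token type
    return seen_root and not stack  # empty input and unclosed tags both fail


def toks(spec):
    """Hypothetical test helper: '+a' opens a, '-a' closes a, anything else is text."""
    kinds = {"+": "open_tag", "-": "close_tag"}
    return [{"text": s[1:], "token_type": kinds[s[0]]} if s[0] in kinds
            else {"text": s, "token_type": "raw_text"} for s in spec]


cases = [
    (["+a", "hi", "-a"], True),                          # normal case
    (["+a", "+b", "-b", "x", "+c", "-c", "-a"], True),   # text between siblings
    (["+a", "-b"], False),                               # mismatched close
    (["+a", "+b", "-a", "-b"], False),                   # out-of-order close
    (["+a", "+b", "-b"], False),                         # unclosed tag at EOF
    ([], False),                                         # empty token list
    (["-a"], False),                                     # close with nothing open
    (["+a", "-a", "+b", "-b"], False),                   # two roots
]
for spec, expected in cases:
    assert is_well_formed(toks(spec)) is expected, spec
```

Storing only tag names on the stack is what keeps the extra space at O(h); the stack never grows beyond the current nesting depth.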

Follow-up Questions

  • How would the approach change if the input were streaming or too large for memory?
  • What invariants would you assert in production code?
  • Which tests would catch off-by-one, duplicate, or tie-breaking bugs?
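For the streaming follow-up, one possible direction is a generator that tokenizes chunk by chunk and buffers only the unfinished token. The sketch below is an illustration under assumptions: tokenize_stream is a hypothetical name, chunks is any iterable of strings (for example, blocks read from a file), and malformed input raises ValueError.

```python
from typing import Iterable, Iterator


def tokenize_stream(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield tokens as soon as they are complete; buffer only the partial one."""
    buf = ""
    for chunk in chunks:
        buf += chunk
        while buf:
            if buf[0] == "<":
                end = buf.find(">")
                if end == -1:
                    break               # tag is incomplete, wait for more input
                body = buf[1:end]
                kind = "close_tag" if body.startswith("/") else "open_tag"
                name = body[1:] if kind == "close_tag" else body
                if not name or "<" in name or "/" in name:
                    raise ValueError(f"bad tag: <{body}>")
                yield {"text": name, "token_type": kind}
                buf = buf[end + 1:]
            else:
                end = buf.find("<")
                if end == -1:
                    break               # text may continue in the next chunk
                text = buf[:end]
                if ">" in text:
                    raise ValueError("stray '>' in text")
                yield {"text": text, "token_type": "raw_text"}
                buf = buf[end:]
    if buf:                             # leftover after the last chunk
        if buf[0] == "<":
            raise ValueError("unterminated tag at end of input")
        if ">" in buf:
            raise ValueError("stray '>' in text")
        yield {"text": buf, "token_type": "raw_text"}
```

Fed into a stack-based validator, this keeps memory proportional to the largest single token plus the nesting depth rather than to the whole document. For example, list(tokenize_stream(["<a>he", "llo</", "a>"])) yields an open_tag for a, one raw_text token "hello", and a close_tag for a.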
