PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question tests practical knowledge of parsing theory, including lexical analysis and recursive-descent parsing applied to a well-defined grammar. It evaluates a software engineer's ability to implement low-level string processing, handle edge cases in formal grammars, and produce robust error handling — skills commonly assessed in coding interviews at the senior level.

  • hard
  • Snowflake
  • Coding & Algorithms
  • Software Engineer

Implement a JSON Parser

Company: Snowflake

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

## Implement a JSON Parser Write a parser that takes a single string containing one JSON value and returns the corresponding in-memory data structure. You must implement the parsing yourself — do **not** call a built-in JSON library (`json.loads`, `JSON.parse`, `Gson`, etc.). Your parser must support the full JSON grammar: - **Objects** — `{ "key": value, ... }`, where keys are strings and values are any JSON value. The empty object `{}` is valid. - **Arrays** — `[ value, value, ... ]`, where each element is any JSON value. The empty array `[]` is valid. - **Strings** — double-quoted, with escape sequences: `\"`, `\\`, `\/`, `\b`, `\f`, `\n`, `\r`, `\t`, and 4-hex-digit Unicode escapes `\uXXXX`. - **Numbers** — integers and floating point, including an optional leading minus sign, a fractional part, and an exponent (e.g. `0`, `-12`, `3.14`, `1e10`, `-2.5E-3`). JSON does not allow leading zeros (e.g. `01`) or a leading `+`. - **Literals** — `true`, `false`, and `null`. - **Whitespace** — spaces, tabs, newlines, and carriage returns may appear between any tokens and must be ignored. Map the parsed result to the host language's natural types (for example, in Python: object → `dict`, array → `list`, string → `str`, number → `int`/`float`, `true`/`false` → `bool`, `null` → `None`). ### Validation The input is **not** guaranteed to be valid JSON. If the input is malformed, your parser must reject it by raising/throwing an error (rather than returning a partial or incorrect result). Cases that must be rejected include, but are not limited to: - Trailing commas: `[1, 2, ]` or `{"a": 1,}`. - Missing commas, colons, or brackets: `{"a" 1}`, `[1 2]`, `{"a": 1`. - Unquoted object keys: `{a: 1}`. - Unterminated strings or invalid escape sequences. - Numbers with leading zeros (`012`), a leading `+`, or a bare `.` / trailing `.` (`.5`, `5.`). - Extra non-whitespace characters after a complete, valid JSON value: `{"a": 1} garbage`. - Empty input or input that is only whitespace. ### Examples Input: ``` {"name": "Ada", "age": 36, "langs": ["Pascal", "Ada"], "retired": false, "spouse": null} ``` Output (as a structured value): ``` { name: "Ada", age: 36, langs: ["Pascal", "Ada"], retired: false, spouse: null } ``` Input: ``` [1, -2.5, 3e2, true, null] ``` Output: ``` [1, -2.5, 300.0, true, null] ``` Input: ``` {"a": 1,} ``` Output: an error (trailing comma is invalid). Input: ``` [1 2] ``` Output: an error (missing comma between elements). ### Constraints - The input is a single string of up to roughly $10^5$ characters. - Nesting (objects/arrays inside objects/arrays) may be up to a few hundred levels deep; aim for $O(n)$ time in the input length. - A clean, idiomatic implementation is expected: a tokenizer/scanner plus a recursive-descent (or equivalent) parser, with clear handling of each value type and explicit error reporting on malformed input.

Quick Answer: This question tests practical knowledge of parsing theory, including lexical analysis and recursive-descent parsing applied to a well-defined grammar. It evaluates a software engineer's ability to implement low-level string processing, handle edge cases in formal grammars, and produce robust error handling — skills commonly assessed in coding interviews at the senior level.

Write a function that takes a single string containing one JSON value and returns the corresponding in-memory data structure, **without** calling any built-in JSON library (`json.loads`, `JSON.parse`, `Gson`, etc.). Implement a tokenizer plus a recursive-descent parser. Your parser must support the full JSON grammar: - **Objects** — `{ "key": value, ... }` (keys are strings; `{}` is valid). - **Arrays** — `[ value, value, ... ]` (`[]` is valid). - **Strings** — double-quoted, with escapes `\"`, `\\`, `\/`, `\b`, `\f`, `\n`, `\r`, `\t`, and 4-hex-digit `\uXXXX`. - **Numbers** — optional leading `-`, integer part (no leading zeros, no leading `+`), optional fraction, optional exponent (`0`, `-12`, `3.14`, `1e10`, `-2.5E-3`). - **Literals** — `true`, `false`, `null`. - **Whitespace** — spaces, tabs, newlines, carriage returns between tokens are ignored. Map to natural types (object→map/dict, array→list, string→str, number→int/float, `true`/`false`→bool, `null`→null/None). ### Validation Input is **not** guaranteed valid. On malformed input the parser must **not** return a partial result. **For this console, signal an error by returning the sentinel string `"<<PARSE_ERROR>>"`** (instead of throwing), so the result is directly comparable. Internally you should still detect the error precisely — e.g. raise an exception in your parser and convert it to the sentinel at the top level. Cases that must be rejected (→ `"<<PARSE_ERROR>>"`) include: trailing commas (`[1, 2, ]`, `{"a": 1,}`); missing commas/colons/brackets (`{"a" 1}`, `[1 2]`, `{"a": 1`); unquoted keys (`{a: 1}`); unterminated strings or invalid escapes; numbers with leading zeros (`012`), a leading `+`, or a bare/trailing dot (`.5`, `5.`); extra non-whitespace after a complete value (`{"a": 1} garbage`); and empty / whitespace-only input. ### Examples - `solution('[1, -2.5, 3e2, true, null]')` → `[1, -2.5, 300.0, True, None]` - `solution('{"a": 1,}')` → `'<<PARSE_ERROR>>'` (trailing comma) - `solution('[1 2]')` → `'<<PARSE_ERROR>>'` (missing comma) ### Constraints - Input is a single string up to ~10^5 characters. - Nesting may be a few hundred levels deep; aim for O(n) time in the input length.

Constraints

  • Input is a single string up to ~10^5 characters.
  • Nesting may be a few hundred levels deep; aim for O(n) time.
  • Do not call a built-in JSON library (json.loads / JSON.parse / Gson, etc.).
  • On malformed input, return the sentinel string "<<PARSE_ERROR>>" rather than a partial result.
  • Numbers reject leading zeros, a leading '+', and a bare/trailing dot; JSON allows only a leading '-'.

Examples

Input: ('{"name": "Ada", "age": 36, "langs": ["Pascal", "Ada"], "retired": false, "spouse": null}',)

Expected Output: {'name': 'Ada', 'age': 36, 'langs': ['Pascal', 'Ada'], 'retired': False, 'spouse': None}

Explanation: A full object with string, int, nested array, bool, and null values parses to the natural Python types.

Input: ('[1, -2.5, 3e2, true, null]',)

Expected Output: [1, -2.5, 300.0, True, None]

Explanation: Mixed array: integer stays int, fraction and exponent become floats (3e2 → 300.0), literals map to True/None.

Input: ('{"a": 1,}',)

Expected Output: '<<PARSE_ERROR>>'

Explanation: Trailing comma after the last object member is invalid JSON, so the parser returns the error sentinel.

Input: ('[1 2]',)

Expected Output: '<<PARSE_ERROR>>'

Explanation: Missing comma between array elements — after parsing 1 the next non-space char is neither ',' nor ']'.

Input: (' {} ',)

Expected Output: {}

Explanation: Surrounding whitespace is ignored and the empty object is valid.

Input: ('[]',)

Expected Output: []

Explanation: The empty array is valid.

Input: ('"a\\nb\\t\\u0041"',)

Expected Output: 'a\nb\tA'

Explanation: String escapes: \n and \t become control characters and \u0041 decodes to 'A'.

Input: ('012',)

Expected Output: '<<PARSE_ERROR>>'

Explanation: Leading zeros are not allowed in JSON numbers; '012' is rejected.

Input: ('5.',)

Expected Output: '<<PARSE_ERROR>>'

Explanation: A trailing dot with no fractional digits is invalid.

Input: (' ',)

Expected Output: '<<PARSE_ERROR>>'

Explanation: Whitespace-only (or empty) input contains no JSON value and is rejected.

Input: ('{"a": 1} garbage',)

Expected Output: '<<PARSE_ERROR>>'

Explanation: Extra non-whitespace characters after a complete value make the input malformed.

Input: ('-0.5E-3',)

Expected Output: -0.0005

Explanation: Negative number with a fraction and a negative exponent parses to a float.

Input: ('{"nested": {"x": [true, {"y": null}], "z": -7}}',)

Expected Output: {'nested': {'x': [True, {'y': None}], 'z': -7}}

Explanation: Deeply nested objects and arrays parse recursively into nested dicts/lists.

Input: ('{a: 1}',)

Expected Output: '<<PARSE_ERROR>>'

Explanation: Object keys must be double-quoted strings; an unquoted key is rejected.

Hints

  1. Use a single position cursor over the string and a set of mutually-recursive helpers: parse_value, parse_object, parse_array, parse_string, parse_number. Skip whitespace before each value and around structural characters.
  2. parse_value dispatches on the first non-whitespace character: '{' → object, '[' → array, '"' → string, '-' or a digit → number, and the keywords true/false/null. Anything else is an error.
  3. For numbers, validate the grammar yourself: optional '-', then either a single '0' or a non-zero digit followed by digits (this rejects leading zeros like 012), then an optional '.' with at least one digit, then an optional exponent. Reject a trailing dot like '5.'.
  4. After parsing the top-level value, skip trailing whitespace and require the cursor to be at the end of the string — otherwise input like '{"a": 1} garbage' is malformed.
  5. Detect errors precisely by raising an internal exception wherever the grammar is violated, then catch it at the top level and return "<<PARSE_ERROR>>". Also catch RecursionError for pathologically deep input.
Last updated: Jun 24, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Solve Matrix and Array Distance Problems - Snowflake (medium)
  • Solve Array Distance and Wiki Navigation - Snowflake (medium)
  • Implement Document Predicate APIs - Snowflake (medium)
  • Find Shortest Wiki Click Path - Snowflake (medium)
  • Compute Inherited Role Privileges - Snowflake (hard)