Parse Markdown Links Without Regex
Company: Samsara
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates proficiency in string parsing, tokenization, escape-sequence handling, and input validation required to convert simplified Markdown-style link syntax into HTML anchors without using regular expressions.
Constraints
- 0 <= len(text) <= 200000
- Do not use regular expressions.
- Only exact unescaped patterns of the form `[label]("url")` should be converted; malformed patterns must remain unchanged.
Examples
Input: ('Click [Go to Link]("whateverlink.com") now.',)
Expected Output: 'Click <a href="whateverlink.com">Go to Link</a> now.'
Explanation: A well-formed link is converted and surrounding text is preserved.
Input: ('[A]("u1") and [B]("u2")',)
Expected Output: '<a href="u1">A</a> and <a href="u2">B</a>'
Explanation: The parser must handle multiple valid links in the same string.
Input: ('Broken [label](url) stays',)
Expected Output: 'Broken [label](url) stays'
Explanation: The URL is not wrapped in double quotes, so the pattern is malformed and must stay unchanged.
Input: ('Line 1: [A]("u1")\nLine 2: [B]("u2").',)
Expected Output: 'Line 1: <a href="u1">A</a>\nLine 2: <a href="u2">B</a>.'
Explanation: Links should be parsed across newline characters as well.
Input: ('Ignore \\[not a link]("u") but parse [yes]("ok")',)
Expected Output: 'Ignore \\[not a link]("u") but parse <a href="ok">yes</a>'
Explanation: The escaped `[` is literal, so that pattern is ignored, while the later valid link is converted.
Input: ('[A]("he said \\\"hi\\\"")',)
Expected Output: '<a href="he said \\\"hi\\\"">A</a>'
Explanation: Escaped quotes inside the URL are treated as literal characters, not as the closing quote.
Input: ('[a [b] c]("u")',)
Expected Output: '<a href="u">a [b] c</a>'
Explanation: Nested brackets inside the label are handled with stack matching, so the outer pair becomes the link label.
Input: ('',)
Expected Output: ''
Explanation: An empty input should return an empty output.
Input: ('[A]("u)',)
Expected Output: '[A]("u)'
Explanation: The closing quote and parenthesis are missing, so the malformed pattern remains unchanged.
Input: ('Start [L1]("u1"), [L2]("u2"), [L3]("u3"), [L4]("u4"), [L5]("u5"), [L6]("u6"), [L7]("u7"), [L8]("u8"), [L9]("u9"), [L10]("u10") end.',)
Expected Output: 'Start <a href="u1">L1</a>, <a href="u2">L2</a>, <a href="u3">L3</a>, <a href="u4">L4</a>, <a href="u5">L5</a>, <a href="u6">L6</a>, <a href="u7">L7</a>, <a href="u8">L8</a>, <a href="u9">L9</a>, <a href="u10">L10</a> end.'
Explanation: A longer input with many replacements checks that the parser handles large strings efficiently.
Hints
- Use a stack to track positions of unescaped `[` characters. Each unescaped `]` closes the most recent unmatched `[`.
- To keep the parser linear, precompute which characters are escaped and where the next unescaped `"` appears.