Implement version string comparator with extensions
Company: Nextdoor
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates string parsing and comparison algorithms, handling of numeric versus lexical ordering and versioning semantics (including pre-release identifiers and ignored build metadata), within the Coding & Algorithms domain.
Part 1: Compare Dot-Separated Numeric Version Strings
Constraints
- 0 <= len(a), len(b) <= 200000
- Each string contains only digits `0-9` and `.`
- The empty string is allowed and represents version `0`
- Empty components are allowed and should be treated as `0`
- A numeric component may be much longer than standard integer ranges
Examples
Input: ('1.02.3', '1.2.3.0')
Expected Output:
Explanation: Leading zeros are ignored, and the missing trailing component in the first version is treated as 0.
Input: ('1.0.5', '1.0.12')
Expected Output:
Explanation: Compare numerically: 5 < 12.
Input: ('1..2', '1.0.2.0')
Expected Output:
Explanation: The empty component between the dots is treated as 0.
Input: ('', '0.0')
Expected Output:
Explanation: An empty string represents version 0, which is equal to 0.0.
Input: ('999999999999999999999999.1', '1000000000000000000000000.0')
Expected Output:
Explanation: The first component of the second version has more significant digits, so it is larger.
Solution
def solution(a, b):
def normalize(comp):
if comp == '':
return '0'
i = 0
while i < len(comp) and comp[i] == '0':
i += 1
return comp[i:] if i < len(comp) else '0'
def cmp_num(x, y):
x = normalize(x)
y = normalize(y)
if len(x) < len(y):
return -1
if len(x) > len(y):
return 1
if x < y:
return -1
if x > y:
return 1
return 0
parts_a = a.split('.') if a != '' else ['']
parts_b = b.split('.') if b != '' else ['']
for i in range(max(len(parts_a), len(parts_b))):
comp_a = parts_a[i] if i < len(parts_a) else '0'
comp_b = parts_b[i] if i < len(parts_b) else '0'
result = cmp_num(comp_a, comp_b)
if result != 0:
return result
return 0Time complexity: O(len(a) + len(b)). Space complexity: O(len(a) + len(b)).
Hints
- Compare the versions component by component. If one version runs out of components, use `0` for the missing parts.
- To compare very long numeric components safely, strip leading zeros, then compare by length and finally lexicographically.
Part 2: Compare Version Strings with Pre-Release Identifiers and Build Metadata
Constraints
- 0 <= len(a), len(b) <= 200000
- Core version components contain digits and dots; empty core components should be treated as `0`
- A pre-release part, if present, is dot-separated and may contain letters, digits, or hyphens
- Build metadata starts after the first `+` and must be ignored
- Numeric core components and numeric pre-release tokens may be very long
Examples
Input: ('1.0.0-alpha', '1.0.0')
Expected Output:
Explanation: A pre-release version is smaller than the corresponding final release.
Input: ('1.0.0-alpha.1', '1.0.0-alpha.beta')
Expected Output:
Explanation: After matching 'alpha', numeric token 1 is smaller than non-numeric token 'beta'.
Input: ('1.0.0-rc.1+build.5', '1.0.0-rc.1+build.99')
Expected Output:
Explanation: Build metadata is ignored entirely.
Input: ('01..0-alpha', '1.0.0-alpha.1')
Expected Output:
Explanation: The cores are equal after normalizing leading zeros and empty segments; then 'alpha' is smaller than 'alpha.1' because it is the shorter equal prefix.
Input: ('', '0+build.7')
Expected Output:
Explanation: An empty string represents core version 0, and build metadata does not affect ordering.
Solution
def solution(a, b):
def normalize_num(s):
if s == '':
return '0'
i = 0
while i < len(s) and s[i] == '0':
i += 1
return s[i:] if i < len(s) else '0'
def cmp_num(x, y):
x = normalize_num(x)
y = normalize_num(y)
if len(x) < len(y):
return -1
if len(x) > len(y):
return 1
if x < y:
return -1
if x > y:
return 1
return 0
def parse(version):
main = version.split('+', 1)[0]
if '-' in main:
core, pre = main.split('-', 1)
pre_parts = pre.split('.') if pre != '' else ['']
else:
core = main
pre_parts = None
core_parts = core.split('.') if core != '' else ['']
return core_parts, pre_parts
def cmp_core(parts_a, parts_b):
for i in range(max(len(parts_a), len(parts_b))):
comp_a = parts_a[i] if i < len(parts_a) else '0'
comp_b = parts_b[i] if i < len(parts_b) else '0'
result = cmp_num(comp_a, comp_b)
if result != 0:
return result
return 0
def cmp_pre(pre_a, pre_b):
if pre_a is None and pre_b is None:
return 0
if pre_a is None:
return 1
if pre_b is None:
return -1
limit = min(len(pre_a), len(pre_b))
for i in range(limit):
ta = pre_a[i]
tb = pre_b[i]
is_num_a = ta.isdigit()
is_num_b = tb.isdigit()
if is_num_a and is_num_b:
result = cmp_num(ta, tb)
if result != 0:
return result
elif is_num_a and not is_num_b:
return -1
elif not is_num_a and is_num_b:
return 1
else:
if ta < tb:
return -1
if ta > tb:
return 1
if len(pre_a) < len(pre_b):
return -1
if len(pre_a) > len(pre_b):
return 1
return 0
core_a, pre_a = parse(a)
core_b, pre_b = parse(b)
result = cmp_core(core_a, core_b)
if result != 0:
return result
return cmp_pre(pre_a, pre_b)Time complexity: O(len(a) + len(b)). Space complexity: O(len(a) + len(b)).
Hints
- First strip off build metadata, then separate the core part from the optional pre-release part.
- After the core versions compare equal, remember that a release version beats any pre-release of the same core.