Implement string-based candlestick classifier
Company: Robinhood
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Technical Screen
Quick Answer: This question evaluates string parsing, streaming input processing, numeric validation, and attention to time/space complexity in the Coding & Algorithms category, focusing on efficient single-pass parsing and O(1) extra space constraints.
Constraints
- 0 <= N <= 200000
- Let L be the total length of the input string; the solution should run in O(L) time
- Use O(1) extra space beyond the output string
- Do not convert the whole input into arrays of records or fields
- Each numeric token, if valid, has at most 30 characters including sign and decimal point
Examples
Input: ("10,12,9,11;8,9,5,6;5,6,4,5",)
Expected Output: "BSD"
Explanation: Record 1 is bullish, record 2 is bearish, and record 3 is a doji. All three records are valid.
Input: ("1,3,0,3;1,a,0,1;4,5,3,2;7,8,,7",)
Expected Output: "BXXX"
Explanation: The first candle is valid and bullish. The second contains a non-numeric field, the third has low > min(open, close), and the fourth is missing a field.
Input: ("-1.5,-1.0,-2.0,-1.5;+.5,1.0,0.5,1.0;3.,4.,2.,1.",)
Expected Output: "DBX"
Explanation: The first is a valid doji, the second is a valid bullish candle, and the third is invalid because low = 2 is greater than min(open, close) = 1.
Input: ("",)
Expected Output: ""
Explanation: An empty input encodes zero records, so the output is an empty string.
Input: ("1,2,0,1;;2,3,1,2;",)
Expected Output: "DXDX"
Explanation: The first record is a valid doji. The empty segment between `;;` is an invalid record. The third record is a valid doji. The trailing semicolon creates one more empty invalid record.
Solution
def solution(s):
if s == "":
return ""
out = []
def cmp_num(a, b):
va, sa = a
vb, sb = b
if sa == sb:
return (va > vb) - (va < vb)
if sa < sb:
va *= 10 ** (sb - sa)
else:
vb *= 10 ** (sa - sb)
return (va > vb) - (va < vb)
fields = 0
record_invalid = False
o = h = l = c = None
sign = 1
value = 0
frac = 0
digits = 0
dot = False
field_invalid = False
sign_allowed = True
def reset_field():
nonlocal sign, value, frac, digits, dot, field_invalid, sign_allowed
sign = 1
value = 0
frac = 0
digits = 0
dot = False
field_invalid = False
sign_allowed = True
def finalize_field():
nonlocal fields, record_invalid, o, h, l, c
if field_invalid or digits == 0:
num = None
record_invalid = True
else:
num = (sign * value, frac)
if fields == 0:
o = num
elif fields == 1:
h = num
elif fields == 2:
l = num
elif fields == 3:
c = num
else:
record_invalid = True
fields += 1
reset_field()
n = len(s)
for i in range(n + 1):
ch = s[i] if i < n else ';'
if ch == ',' or ch == ';':
finalize_field()
if ch == ';':
if fields != 4:
record_invalid = True
if record_invalid or o is None or h is None or l is None or c is None:
out.append('X')
else:
mx = o if cmp_num(o, c) >= 0 else c
mn = o if cmp_num(o, c) <= 0 else c
if cmp_num(h, mx) < 0 or cmp_num(l, mn) > 0:
out.append('X')
else:
oc = cmp_num(c, o)
if oc > 0:
out.append('B')
elif oc < 0:
out.append('S')
else:
out.append('D')
fields = 0
record_invalid = False
o = h = l = c = None
reset_field()
else:
if field_invalid:
continue
if ch == '+' or ch == '-':
if sign_allowed:
sign = -1 if ch == '-' else 1
sign_allowed = False
else:
field_invalid = True
elif ch == '.':
if not dot:
dot = True
sign_allowed = False
else:
field_invalid = True
elif '0' <= ch <= '9':
value = value * 10 + (ord(ch) - 48)
digits += 1
if dot:
frac += 1
sign_allowed = False
else:
field_invalid = True
return ''.join(out)
Time complexity: O(L), where L is the total length of the input string. Space complexity: O(1) extra space beyond the output string.
Hints
- Think of the parser as a small state machine: current record field count, current number state, and current record validity.
- To avoid floating-point issues, parse each number exactly as `(signed_integer_without_decimal_point, digits_after_decimal)` and compare by aligning scales.