Implement firewall matching with CIDR rules
Company: Databricks
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Onsite
Quick Answer: This question evaluates understanding of IPv4 addressing and CIDR prefix matching, including bitwise reasoning, robust parsing, handling of edge cases, and algorithmic efficiency for large rule sets.
Part 1: Implement isAllowed for IPv4 firewall matching with CIDR rules
Constraints
- 0 <= len(rules) <= 10^5
- Each IPv4 octet must be an integer in [0, 255]
- Each prefix length must be an integer in [0, 32]
- Rule order matters: the first matching rule decides the result
- If any rule or the queried IP is malformed, return False
Examples
Input: ([('ALLOW', '192.168.1.0/24'), ('DENY', '0.0.0.0/0')], '192.168.1.99')
Expected Output: True
Explanation: Both rules match, but the first match is ALLOW, so the IP is allowed.
Input: ([('DENY', '10.0.0.0/8'), ('ALLOW', '10.0.0.0/16')], '10.0.5.6')
Expected Output: False
Explanation: The IP matches both CIDRs, but the earlier DENY rule wins.
Input: ([('ALLOW', '10.0.0.0/24')], '10.0.1.1')
Expected Output: False
Explanation: No rule matches, so the default result is DENY.
Input: ([('ALLOW', '1.2.3.4/32'), ('DENY', '0.0.0.0/0')], '1.2.3.4')
Expected Output: True
Explanation: A /32 rule matches exactly one IP address.
Input: ([('ALLOW', '10.0.0.0/8'), ('DENY', '300.0.0.0/8')], '10.1.2.3')
Expected Output: False
Explanation: Any malformed rule makes the whole input invalid for this problem, so return False.
Solution
def solution(rules, ip_address):
def parse_ip(text):
if not isinstance(text, str):
return None
parts = text.split('.')
if len(parts) != 4:
return None
value = 0
for part in parts:
if not part.isdigit():
return None
num = int(part)
if num < 0 or num > 255:
return None
value = (value << 8) | num
return value
def parse_cidr(text):
if not isinstance(text, str) or text.count('/') != 1:
return None
ip_text, prefix_text = text.split('/')
if not prefix_text.isdigit():
return None
prefix = int(prefix_text)
if prefix < 0 or prefix > 32:
return None
ip_value = parse_ip(ip_text)
if ip_value is None:
return None
return ip_value, prefix
ip_value = parse_ip(ip_address)
if ip_value is None or not isinstance(rules, (list, tuple)):
return False
parsed_rules = []
for rule in rules:
if not isinstance(rule, (list, tuple)) or len(rule) != 2:
return False
action, cidr = rule
if action not in ('ALLOW', 'DENY'):
return False
parsed = parse_cidr(cidr)
if parsed is None:
return False
network, prefix = parsed
parsed_rules.append((action, network, prefix))
for action, network, prefix in parsed_rules:
mask = 0 if prefix == 0 else ((0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF)
if (ip_value & mask) == (network & mask):
return action == 'ALLOW'
return False
Time complexity: O(n), where n is the number of rules. Space complexity: O(n) for the validated rule list.
Hints
- Convert dotted IPv4 strings into a single 32-bit integer so comparisons become bit operations.
- For a prefix length L, build a mask with the top L bits set and compare masked values.
Part 2: Validate coverage of a firewall matcher test suite
Constraints
- 0 <= len(test_suite) <= 10^4
- The total number of rules across all cases is at most 2 * 10^5
- Only valid cases may cover categories other than INVALID_INPUT
- IPv4 and CIDR validation rules are the same as in the firewall problem
Examples
Input: ([([('ALLOW', '10.0.0.0/24')], '10.0.0.5'), ([('ALLOW', '10.0.0.0/24')], '11.0.0.1'), ([('DENY', '10.0.0.0/8'), ('ALLOW', '10.0.0.0/16')], '10.0.1.1'), ([('ALLOW', '192.168.1.0/24')], '192.168.1.255'), ([('ALLOW', '0.0.0.0/0')], '8.8.8.8'), ([('DENY', '1.2.3.4/32')], '1.2.3.4'), ([('ALLOW', '300.1.1.0/24')], '1.1.1.1')],)
Expected Output: []
Explanation: This suite covers every required category, including invalid input.
Input: ([([('ALLOW', '10.0.0.0/24')], '10.0.0.5'), ([('ALLOW', '10.0.0.0/24')], '11.0.0.1'), ([('DENY', '10.0.0.0/8'), ('ALLOW', '10.0.0.0/16')], '10.0.1.1'), ([('ALLOW', '0.0.0.0/0')], '8.8.8.8')],)
Expected Output: ['BOUNDARY', 'EDGE_PREFIX_32', 'INVALID_INPUT']
Explanation: The suite has a simple match, a non-match, precedence, and /0, but no boundary case, no /32 case, and no invalid input case.
Input: ([],)
Expected Output: ['SIMPLE_MATCH', 'SIMPLE_NON_MATCH', 'FIRST_RULE_WINS', 'BOUNDARY', 'EDGE_PREFIX_0', 'EDGE_PREFIX_32', 'INVALID_INPUT']
Explanation: An empty proposed suite covers nothing.
Input: ([([('ALLOW', '10.0.0.0/24')], '999.1.1.1')],)
Expected Output: ['SIMPLE_MATCH', 'SIMPLE_NON_MATCH', 'FIRST_RULE_WINS', 'BOUNDARY', 'EDGE_PREFIX_0', 'EDGE_PREFIX_32']
Explanation: This suite covers only INVALID_INPUT because the queried IP is malformed.
Solution
def solution(test_suite):
required = [
'SIMPLE_MATCH',
'SIMPLE_NON_MATCH',
'FIRST_RULE_WINS',
'BOUNDARY',
'EDGE_PREFIX_0',
'EDGE_PREFIX_32',
'INVALID_INPUT',
]
covered = set()
def parse_ip(text):
if not isinstance(text, str):
return None
parts = text.split('.')
if len(parts) != 4:
return None
value = 0
for part in parts:
if not part.isdigit():
return None
num = int(part)
if num < 0 or num > 255:
return None
value = (value << 8) | num
return value
def parse_cidr(text):
if not isinstance(text, str) or text.count('/') != 1:
return None
ip_text, prefix_text = text.split('/')
if not prefix_text.isdigit():
return None
prefix = int(prefix_text)
if prefix < 0 or prefix > 32:
return None
ip_value = parse_ip(ip_text)
if ip_value is None:
return None
mask = 0 if prefix == 0 else ((0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF)
start = ip_value & mask
end = start | (0xFFFFFFFF ^ mask)
return start, end, prefix
for case in test_suite:
if not isinstance(case, (list, tuple)) or len(case) != 2:
covered.add('INVALID_INPUT')
continue
rules, ip_address = case
if not isinstance(rules, (list, tuple)):
covered.add('INVALID_INPUT')
continue
ip_value = parse_ip(ip_address)
if ip_value is None:
covered.add('INVALID_INPUT')
continue
valid_rules = []
bad_case = False
for rule in rules:
if not isinstance(rule, (list, tuple)) or len(rule) != 2:
bad_case = True
break
action, cidr = rule
if action not in ('ALLOW', 'DENY'):
bad_case = True
break
parsed = parse_cidr(cidr)
if parsed is None:
bad_case = True
break
start, end, prefix = parsed
valid_rules.append((action, start, end, prefix))
if bad_case:
covered.add('INVALID_INPUT')
continue
for _, start, end, prefix in valid_rules:
if prefix == 0:
covered.add('EDGE_PREFIX_0')
if prefix == 32:
covered.add('EDGE_PREFIX_32')
if ip_value == start or ip_value == end:
covered.add('BOUNDARY')
matched_actions = []
for action, start, end, _ in valid_rules:
if start <= ip_value <= end:
matched_actions.append(action)
if len(matched_actions) == 0:
covered.add('SIMPLE_NON_MATCH')
elif len(matched_actions) == 1:
covered.add('SIMPLE_MATCH')
elif len(set(matched_actions)) > 1:
covered.add('FIRST_RULE_WINS')
return [name for name in required if name not in covered]
Time complexity: O(R), where R is the total number of rules across all proposed test cases. Space complexity: O(M) for the largest single case, where M is the number of rules in that case.
Hints
- Reuse the same IP-to-integer and CIDR parsing logic you would use in the firewall matcher itself.
- Track each category with a boolean flag; one valid case can cover several categories at once.
Part 3: Extend firewall matching to CIDR range queries
Constraints
- 0 <= len(rules) <= 5000
- Each action must be 'ALLOW' or 'DENY'
- Each IPv4 octet must be in [0, 255] and each prefix length in [0, 32]
- Rules are evaluated in the given order
- Malformed input should return 'INVALID'
Examples
Input: ([('ALLOW', '10.0.0.0/24')], '10.0.0.0/24')
Expected Output: 'ALLOW'
Explanation: The single rule covers the whole queried block, so every IP is allowed.
Input: ([('ALLOW', '10.0.0.0/25')], '10.0.0.0/24')
Expected Output: 'MIXED'
Explanation: The first half of the /24 is allowed, while the second half is denied by default.
Input: ([('DENY', '10.0.0.0/24'), ('ALLOW', '10.0.0.0/25')], '10.0.0.0/25')
Expected Output: 'DENY'
Explanation: Even though the later ALLOW overlaps, the earlier DENY already matches the whole query block.
Input: ([], '0.0.0.0/0')
Expected Output: 'DENY'
Explanation: With no rules at all, every IP is denied by default.
Input: ([('ALLOW', '10.0.0.0/24')], '10.0.0.0/33')
Expected Output: 'INVALID'
Explanation: A prefix length of 33 is invalid.
Solution
def solution(rules, cidr_range):
def parse_ip(text):
if not isinstance(text, str):
return None
parts = text.split('.')
if len(parts) != 4:
return None
value = 0
for part in parts:
if not part.isdigit():
return None
num = int(part)
if num < 0 or num > 255:
return None
value = (value << 8) | num
return value
def parse_cidr(text):
if not isinstance(text, str) or text.count('/') != 1:
return None
ip_text, prefix_text = text.split('/')
if not prefix_text.isdigit():
return None
prefix = int(prefix_text)
if prefix < 0 or prefix > 32:
return None
ip_value = parse_ip(ip_text)
if ip_value is None:
return None
mask = 0 if prefix == 0 else ((0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF)
start = ip_value & mask
end = start | (0xFFFFFFFF ^ mask)
return start, end, prefix
query = parse_cidr(cidr_range)
if query is None or not isinstance(rules, (list, tuple)):
return 'INVALID'
parsed_rules = []
for rule in rules:
if not isinstance(rule, (list, tuple)) or len(rule) != 2:
return 'INVALID'
action, cidr = rule
if action not in ('ALLOW', 'DENY'):
return 'INVALID'
parsed = parse_cidr(cidr)
if parsed is None:
return 'INVALID'
start, end, _ = parsed
parsed_rules.append((action, start, end))
query_start, query_end, _ = query
unresolved = [(query_start, query_end)]
seen_actions = set()
for action, rule_start, rule_end in parsed_rules:
next_unresolved = []
for seg_start, seg_end in unresolved:
if seg_end < rule_start or rule_end < seg_start:
next_unresolved.append((seg_start, seg_end))
continue
if seg_start < rule_start:
next_unresolved.append((seg_start, rule_start - 1))
seen_actions.add(action)
if len(seen_actions) > 1:
return 'MIXED'
if rule_end < seg_end:
next_unresolved.append((rule_end + 1, seg_end))
unresolved = next_unresolved
if not unresolved:
break
if unresolved:
seen_actions.add('DENY')
if len(seen_actions) > 1:
return 'MIXED'
return next(iter(seen_actions)) if seen_actions else 'DENY'
Time complexity: O(R * S), where R is the number of rules and S is the maximum number of unresolved sub-intervals; worst case O(R^2). Space complexity: O(S + R).
Hints
- Convert each CIDR to an integer interval [start, end].
- Process rules in order while keeping track of only the still-unresolved parts of the query range.
Part 4: Implement the exact IPv4-to-CIDR matching condition
Constraints
- Each IPv4 octet must be an integer in [0, 255]
- The prefix length must be an integer in [0, 32]
- Return False on malformed input
- You should use bit operations rather than string-by-string binary comparison
Examples
Input: ('192.168.1.10', '192.168.1.0/24')
Expected Output: True
Explanation: The first 24 bits match.
Input: ('192.168.2.10', '192.168.1.0/24')
Expected Output: False
Explanation: The third octet differs, so the first 24 bits do not match.
Input: ('8.8.8.8', '0.0.0.0/0')
Expected Output: True
Explanation: A /0 CIDR matches every IPv4 address.
Input: ('1.2.3.4', '1.2.3.4/32')
Expected Output: True
Explanation: A /32 CIDR matches exactly one IP.
Input: ('256.1.1.1', '1.2.3.0/24')
Expected Output: False
Explanation: The IP is malformed because 256 is outside the valid IPv4 octet range.
Solution
def solution(ip_address, cidr):
def parse_ip(text):
if not isinstance(text, str):
return None
parts = text.split('.')
if len(parts) != 4:
return None
value = 0
for part in parts:
if not part.isdigit():
return None
num = int(part)
if num < 0 or num > 255:
return None
value = (value << 8) | num
return value
def parse_cidr(text):
if not isinstance(text, str) or text.count('/') != 1:
return None
ip_text, prefix_text = text.split('/')
if not prefix_text.isdigit():
return None
prefix = int(prefix_text)
if prefix < 0 or prefix > 32:
return None
ip_value = parse_ip(ip_text)
if ip_value is None:
return None
return ip_value, prefix
ip_value = parse_ip(ip_address)
parsed = parse_cidr(cidr)
if ip_value is None or parsed is None:
return False
network, prefix = parsed
mask = 0 if prefix == 0 else ((0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF)
return (ip_value & mask) == (network & mask)
Time complexity: O(1). Space complexity: O(1).
Hints
- A prefix length of 0 means every IP matches.
- Use a 32-bit mask with the top L bits set, then compare masked values.