PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates understanding of IP addressing and CIDR notation, bit-level conversion between prefixes and integer ranges, and the design of scalable data structures and algorithms for prefix/range queries, rule priority handling, and rule-set operations (allow/deny semantics).

  • Medium
  • Databricks
  • Coding & Algorithms
  • Software Engineer

Design an IP filter using CIDR rules

Company: Databricks

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

Explain CIDR notation with a couple of concrete examples. Show how to convert a prefix like 192.168.0.0/16 into an inclusive 32-bit integer range and a subnet mask. Then, design a data structure for a firewall rule set: each rule is (prefix: CIDR string, action: 'allow' | 'deny', priority: integer). Implement addRule(rule), removeRule(ruleId), and query(ipAddress) -> action. Specify the matching semantics when prefixes overlap (support both first-match-by-priority and longest-prefix-match), target time/space complexity, and how you would handle up to 10^6 rules and 10^5 queries per second. Follow-up: the query is now a CIDR block instead of a single IP. Add operations overlap(queryPrefix) -> boolean, coveredByAllow(queryPrefix) -> boolean, coveredByDeny(queryPrefix) -> boolean, and listConflictingRules(queryPrefix) -> [ruleIds]. Which data structures would you choose to support efficient range and overlap queries, and how would your design extend to IPv6?

Quick Answer: This question evaluates understanding of IP addressing and CIDR notation, bit-level conversion between prefixes and integer ranges, and the design of scalable data structures and algorithms for prefix/range queries, rule priority handling, and rule-set operations (allow/deny semantics).

Part 1: Dynamic IPv4 firewall with CIDR matching

IPv4 CIDR notation a.b.c.d/p means the first p bits are fixed and the remaining 32-p bits may vary. For example, 192.168.0.0/16 has subnet mask 255.255.0.0 and covers every address from 192.168.0.0 to 192.168.255.255. Build a firewall that processes command strings over active rules. Each ADD command creates a rule (prefix, action, priority) and receives the next rule id starting from 1. Commands: - ADD <cidr> <action> <priority> - REMOVE <ruleId> - QUERY <ipAddress> The matching mode is given separately: - priority: among all matching active rules, choose the one with the smallest priority value; ties go to the smaller rule id. - lpm: longest-prefix-match; if several active rules have that same longest prefix length, choose the one with the smallest priority, then smaller rule id. Return the action for each QUERY command in order. If no active rule matches, return "none".

Constraints

  • 1 <= len(commands) <= 200000
  • 0 <= priority <= 10^9
  • All IP addresses are valid IPv4 addresses and all CIDR prefixes satisfy 0 <= p <= 32
  • REMOVE on a non-existent or already removed rule does nothing

Examples

Input: ("priority", ["ADD 10.0.0.0/8 allow 5", "ADD 10.1.0.0/16 deny 10", "QUERY 10.1.2.3", "ADD 10.1.2.0/24 deny 1", "QUERY 10.1.2.3", "REMOVE 3", "QUERY 10.1.2.3"])

Expected Output: ["allow", "deny", "allow"]

Explanation: In priority mode, the smallest priority wins among all matching prefixes. After removing rule 3, the /8 allow rule wins again.

Input: ("lpm", ["ADD 0.0.0.0/0 deny 100", "ADD 192.168.1.0/24 allow 50", "ADD 192.168.1.128/25 deny 1", "QUERY 192.168.1.10", "QUERY 192.168.1.200", "QUERY 8.8.8.8"])

Expected Output: ["allow", "deny", "deny"]

Explanation: In longest-prefix-match mode, the deepest matching prefix wins before priority is considered.

Input: ("priority", ["QUERY 1.1.1.1", "REMOVE 1", "ADD 255.255.255.255/32 allow 7", "QUERY 255.255.255.255", "QUERY 255.255.255.254"])

Expected Output: ["none", "allow", "none"]

Explanation: Edge case: querying with no active rules returns none. A /32 rule matches exactly one address.

Input: ("priority", ["ADD 1.1.1.0/24 deny 3", "ADD 1.1.1.0/24 allow 3", "QUERY 1.1.1.4", "REMOVE 1", "QUERY 1.1.1.4"])

Expected Output: ["deny", "allow"]

Explanation: When both priority and prefix length tie, the smaller rule id wins.

Solution

def solution(mode, commands):
    import heapq

    def ip_to_int(ip):
        a, b, c, d = map(int, ip.split('.'))
        return (a << 24) | (b << 16) | (c << 8) | d

    def cidr_to_network(cidr):
        ip, plen = cidr.split('/')
        plen = int(plen)
        x = ip_to_int(ip)
        if plen == 0:
            start = 0
        else:
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            start = x & mask
        return start, plen

    class Node:
        __slots__ = ('child', 'heap')

        def __init__(self):
            self.child = [None, None]
            self.heap = []

    root = Node()
    active = {}
    next_id = 1

    def top_active(node):
        h = node.heap
        while h and h[0][1] not in active:
            heapq.heappop(h)
        return h[0] if h else None

    answers = []

    for cmd in commands:
        parts = cmd.split()
        op = parts[0]

        if op == 'ADD':
            cidr = parts[1]
            action = parts[2]
            priority = int(parts[3])
            start, plen = cidr_to_network(cidr)

            node = root
            for i in range(plen):
                bit = (start >> (31 - i)) & 1
                if node.child[bit] is None:
                    node.child[bit] = Node()
                node = node.child[bit]

            heapq.heappush(node.heap, (priority, next_id, action))
            active[next_id] = (node, plen, priority, action)
            next_id += 1

        elif op == 'REMOVE':
            rid = int(parts[1])
            active.pop(rid, None)

        elif op == 'QUERY':
            ip = ip_to_int(parts[1])
            node = root

            if mode == 'priority':
                best = top_active(node)
            else:
                cand = top_active(node)
                best = (0, cand) if cand is not None else None

            depth = 0
            while depth < 32:
                bit = (ip >> (31 - depth)) & 1
                nxt = node.child[bit]
                if nxt is None:
                    break
                node = nxt
                depth += 1
                cand = top_active(node)
                if cand is None:
                    continue

                if mode == 'priority':
                    if best is None or (cand[0], cand[1]) < (best[0], best[1]):
                        best = cand
                else:
                    best = (depth, cand)

            if mode == 'priority':
                answers.append(best[2] if best is not None else 'none')
            else:
                answers.append(best[1][2] if best is not None else 'none')

    return answers
Explanation
The solution stores rules in a **binary trie keyed on IP bits, MSB first**. A rule with prefix length `p` is inserted by walking `p` bits down from the root, creating nodes as needed; the node at depth `p` therefore represents exactly that CIDR block. Each node carries a lazy min-heap of `(priority, ruleId, action)` for the rules ending there, plus a global `active` dict of live rule ids. `/0` rules live at the root. **Key idea:** the rules that match a query IP are precisely those stored on the root-to-leaf path traced by that IP's bits. So a QUERY walks the trie following each successive bit of the IP and inspects the heap at every node passed. **Lazy deletion:** REMOVE only erases the id from `active`. `top_active(node)` pops any heap entries whose id is no longer active until the top is live (or the heap empties), so stale rules cost nothing until they surface. **Mode handling:** - `priority`: keep the globally best `(priority, ruleId)` seen on the path — smaller priority wins, ties broken by smaller id (the tuple order does this automatically). - `lpm`: store `best = (depth, candidate)`; because the walk visits nodes by increasing depth, simply overwriting at each matching node guarantees the **longest** prefix wins, and within one node the heap top already yields smallest priority then smallest id. If no node on the path has an active rule, it returns `"none"`. Because IPv4 has 32 bits, every QUERY touches at most 33 nodes, making each operation effectively `O(log H)` where `H` is the heap size at touched nodes — correct, fast, and order-preserving.

Time complexity: ADD: O(32 + log H), REMOVE: O(1), QUERY: O(32 + 32 log H), where H is the heap size at visited trie nodes. Since IPv4 prefixes are capped at 32 bits, this is effectively O(log H) per operation with a small constant; lazy heap pops are amortized O(log H).. Space complexity: O(T + R), where T is the number of trie nodes created (bounded by 32 times the number of distinct CIDRs) and R is the number of rules retained in node heaps (including not-yet-purged lazily removed ones)..

Hints

  1. A binary trie over the 32 bits of an IPv4 address lets you inspect every matching prefix on one root-to-leaf path.
  2. For removals, lazy deletion works well: mark a rule inactive, and discard it later when it reaches the top of a heap.

Part 2: CIDR block overlap and coverage queries

Convert each IPv4 CIDR block into an inclusive integer interval [start, end]. For example, 192.168.0.0/16 becomes [3232235520, 3232301055]. Maintain a dynamic set of active firewall rules, where each ADD command creates a new rule id starting from 1 and each rule has only a prefix and an action. Commands: - ADD <cidr> <action> - REMOVE <ruleId> - OVERLAP <cidr> - COVERED_ALLOW <cidr> - COVERED_DENY <cidr> - CONFLICTS <cidr> Definitions: - OVERLAP: does the query block intersect at least one active rule? - COVERED_ALLOW: is every IP in the query block covered by the union of active allow rules? - COVERED_DENY: same for active deny rules. - CONFLICTS: look at active rules whose blocks overlap the query block. If both allow and deny appear among those overlapping rules, return all such overlapping rule ids in increasing order; otherwise return an empty list. These operations ignore priority and matching order; they are purely interval-based. Return one string per query command in order. Use "YES" or "NO" for boolean queries. For CONFLICTS, return the Python-list string form, such as "[]" or "[1, 3]". Implement IPv4 only. The same interval idea extends to IPv6 by replacing 32-bit integers with 128-bit integers.

Constraints

  • 1 <= len(commands) <= 20000
  • All addresses are valid IPv4 CIDR blocks with 0 <= p <= 32
  • REMOVE on a non-existent or already removed rule does nothing
  • The intended solution supports fast overlap and coverage queries; CONFLICTS may spend O(A) time scanning active rules to enumerate matching ids

Examples

Input: ["ADD 10.0.0.0/25 allow", "ADD 10.0.0.128/25 allow", "COVERED_ALLOW 10.0.0.0/24", "OVERLAP 10.0.1.0/24", "ADD 10.0.0.64/26 deny", "CONFLICTS 10.0.0.0/24", "COVERED_DENY 10.0.0.64/26", "REMOVE 3", "CONFLICTS 10.0.0.0/24"]

Expected Output: ["YES", "NO", "[1, 2, 3]", "YES", "[]"]

Explanation: Two adjacent /25 allow rules together cover the full /24. After adding a deny block inside that range, the overlapping rules on the /24 contain both actions, so they conflict.

Input: ["ADD 0.0.0.0/1 deny", "ADD 128.0.0.0/1 deny", "COVERED_DENY 0.0.0.0/0", "OVERLAP 192.168.1.0/24", "CONFLICTS 192.168.1.0/24"]

Expected Output: ["YES", "YES", "[]"]

Explanation: The two /1 deny rules partition the whole IPv4 space, so together they cover /0. There is overlap with the query block, but no allow rule, so no conflict.

Input: ["OVERLAP 1.2.3.0/24", "COVERED_ALLOW 1.2.3.0/24", "CONFLICTS 1.2.3.0/24"]

Expected Output: ["NO", "NO", "[]"]

Explanation: Edge case: with no rules, there is no overlap, no coverage, and no conflict.

Input: ["ADD 255.255.255.255/32 allow", "COVERED_ALLOW 255.255.255.255/32", "OVERLAP 255.255.255.254/31", "REMOVE 1", "OVERLAP 255.255.255.255/32"]

Expected Output: ["YES", "YES", "NO"]

Explanation: A /32 rule covers exactly one address. The /31 query overlaps it before removal and does not overlap after removal.

Input: ["ADD 192.168.1.0/25 allow", "COVERED_ALLOW 192.168.1.0/24"]

Expected Output: ["NO"]

Explanation: A single /25 covers only half of the /24, so full coverage fails.

Solution

def solution(commands):
    def ip_to_int(ip):
        a, b, c, d = map(int, ip.split('.'))
        return (a << 24) | (b << 16) | (c << 8) | d

    cidr_cache = {}

    def cidr_to_range(cidr):
        if cidr in cidr_cache:
            return cidr_cache[cidr]
        ip, plen = cidr.split('/')
        plen = int(plen)
        x = ip_to_int(ip)
        if plen == 0:
            start = 0
        else:
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            start = x & mask
        size = 1 << (32 - plen)
        end = start + size - 1
        cidr_cache[cidr] = (start, end)
        return start, end

    parsed = []
    coords = {0, 1 << 32}

    for cmd in commands:
        parts = cmd.split()
        op = parts[0]
        if op == 'ADD':
            cidr = parts[1]
            action = parts[2]
            s, e = cidr_to_range(cidr)
            coords.add(s)
            coords.add(e + 1)
            parsed.append((op, s, e, action))
        elif op == 'REMOVE':
            parsed.append((op, int(parts[1])))
        else:
            cidr = parts[1]
            s, e = cidr_to_range(cidr)
            coords.add(s)
            coords.add(e + 1)
            parsed.append((op, s, e))

    coords = sorted(coords)
    pos = {x: i for i, x in enumerate(coords)}
    seg_n = len(coords) - 1

    class SegTree:
        __slots__ = ('n', 'minv', 'maxv', 'lazy')

        def __init__(self, n):
            self.n = n
            self.minv = [0] * (4 * n)
            self.maxv = [0] * (4 * n)
            self.lazy = [0] * (4 * n)

        def _apply(self, idx, val):
            self.minv[idx] += val
            self.maxv[idx] += val
            self.lazy[idx] += val

        def _push(self, idx):
            val = self.lazy[idx]
            if val:
                self._apply(idx * 2, val)
                self._apply(idx * 2 + 1, val)
                self.lazy[idx] = 0

        def add(self, l, r, val, idx=1, tl=0, tr=None):
            if tr is None:
                tr = self.n - 1
            if l > r:
                return
            if l == tl and r == tr:
                self._apply(idx, val)
                return
            self._push(idx)
            tm = (tl + tr) // 2
            if r <= tm:
                self.add(l, r, val, idx * 2, tl, tm)
            elif l > tm:
                self.add(l, r, val, idx * 2 + 1, tm + 1, tr)
            else:
                self.add(l, tm, val, idx * 2, tl, tm)
                self.add(tm + 1, r, val, idx * 2 + 1, tm + 1, tr)
            self.minv[idx] = min(self.minv[idx * 2], self.minv[idx * 2 + 1])
            self.maxv[idx] = max(self.maxv[idx * 2], self.maxv[idx * 2 + 1])

        def query(self, l, r, idx=1, tl=0, tr=None):
            if tr is None:
                tr = self.n - 1
            if l == tl and r == tr:
                return self.minv[idx], self.maxv[idx]
            self._push(idx)
            tm = (tl + tr) // 2
            if r <= tm:
                return self.query(l, r, idx * 2, tl, tm)
            if l > tm:
                return self.query(l, r, idx * 2 + 1, tm + 1, tr)
            left = self.query(l, tm, idx * 2, tl, tm)
            right = self.query(tm + 1, r, idx * 2 + 1, tm + 1, tr)
            return min(left[0], right[0]), max(left[1], right[1])

    any_tree = SegTree(seg_n)
    allow_tree = SegTree(seg_n)
    deny_tree = SegTree(seg_n)

    active = {}
    next_id = 1
    answers = []

    for item in parsed:
        op = item[0]

        if op == 'ADD':
            _, s, e, action = item
            l = pos[s]
            r = pos[e + 1] - 1
            active[next_id] = (s, e, l, r, action)
            any_tree.add(l, r, 1)
            if action == 'allow':
                allow_tree.add(l, r, 1)
            else:
                deny_tree.add(l, r, 1)
            next_id += 1

        elif op == 'REMOVE':
            rid = item[1]
            info = active.pop(rid, None)
            if info is not None:
                _, _, l, r, action = info
                any_tree.add(l, r, -1)
                if action == 'allow':
                    allow_tree.add(l, r, -1)
                else:
                    deny_tree.add(l, r, -1)

        elif op == 'OVERLAP':
            _, s, e = item
            l = pos[s]
            r = pos[e + 1] - 1
            answers.append('YES' if any_tree.query(l, r)[1] > 0 else 'NO')

        elif op == 'COVERED_ALLOW':
            _, s, e = item
            l = pos[s]
            r = pos[e + 1] - 1
            answers.append('YES' if allow_tree.query(l, r)[0] > 0 else 'NO')

        elif op == 'COVERED_DENY':
            _, s, e = item
            l = pos[s]
            r = pos[e + 1] - 1
            answers.append('YES' if deny_tree.query(l, r)[0] > 0 else 'NO')

        elif op == 'CONFLICTS':
            _, s, e = item
            l = pos[s]
            r = pos[e + 1] - 1
            allow_max = allow_tree.query(l, r)[1]
            deny_max = deny_tree.query(l, r)[1]
            if allow_max == 0 or deny_max == 0:
                answers.append('[]')
                continue

            ids = []
            seen_allow = False
            seen_deny = False

            for rid, (rs, re, _, _, action) in active.items():
                if not (re < s or rs > e):
                    ids.append(rid)
                    if action == 'allow':
                        seen_allow = True
                    else:
                        seen_deny = True

            if seen_allow and seen_deny:
                answers.append(str(sorted(ids)))
            else:
                answers.append('[]')

    return answers
Explanation
**Approach: coordinate compression + three lazy segment trees.** Every IPv4 CIDR `ip/p` is turned into an inclusive integer interval `[start, end]` via `ip_to_int` and a `/p` mask (`cidr_to_range`, memoised). A `/0` block spans the whole `[0, 2^32-1]` space. **Coordinate compression.** In one pass we collect all `start` and `end+1` boundaries (plus sentinels `0` and `2^32`) into `coords`. Sorting them partitions the address space into elementary segments `[coords[i], coords[i+1])`, each represented by leaf `i`. Because both `s` and `e+1` of every block are in `coords`, any interval maps exactly to a contiguous leaf range `[l, r]` with `l = pos[s]`, `r = pos[e+1]-1`. **Three segment trees** (`any`, `allow`, `deny`) each support **range-add** with **lazy propagation** and store per-node `min`/`max` of segment counts. The count at a segment = how many active rules of that kind cover it. - **ADD**: assign the next id, range-`+1` on `any` and the matching `allow`/`deny` tree. - **REMOVE**: pop the rule, range-`-1` (no-op if absent — REMOVE on a stale id does nothing). - **OVERLAP**: `max` over `[l,r]` in `any` `> 0` ⇒ some rule intersects. - **COVERED_ALLOW / COVERED_DENY**: `min` over `[l,r]` `> 0` ⇒ *every* elementary segment in the block has ≥1 allow/deny rule, so the union fully covers it. - **CONFLICTS**: first prune with `allow_max==0 or deny_max==0` (no conflict possible); otherwise scan the `active` dict (O(A)) for rules whose interval overlaps `[s,e]`, and if both an allow and a deny appear, return their sorted ids. Correctness hinges on the compression: the elementary segments exactly tile each query block, so range min/max over leaves faithfully answers coverage and overlap.

Time complexity: Preprocessing O(C log C) for sorting C compressed coordinates (C = O(n) for n commands). ADD, REMOVE, OVERLAP, COVERED_ALLOW, COVERED_DENY are each O(log C). CONFLICTS is O(log C + A) where A is the number of currently active rules scanned to enumerate overlapping ids. Overall O((n + Q)·log C + sum of A over CONFLICTS queries).. Space complexity: O(C + A): the three segment trees use O(C) nodes each (4·segN arrays), and the active-rule dict and cidr cache hold O(A) and O(distinct CIDRs) entries respectively. C, A = O(n)..

Hints

  1. Turn each CIDR block into an integer interval [start, end]. Then overlap becomes a range problem.
  2. Because all commands are known before processing starts, coordinate compression plus a range-add segment tree can maintain coverage counts over elementary IP segments.
Last updated: Jun 2, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Implement a Snapshot Set Iterator - Databricks (medium)
  • Find the Best Commute Mode - Databricks (medium)
  • Partition a Target String by Source Substrings - Databricks (medium)
  • Design In-Memory QPS Counter - Databricks (medium)
  • Implement Random Connectivity and Grid Routing - Databricks