How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Snowflake.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Snowflake during technical interviews.

Implement Document Predicate APIs | Snowflake Coding Question

Implement Document Predicate APIs

Company: Snowflake

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

Design a simple document service that stores text documents identified by filename. Implement these APIs: - `InsertDoc(filename, content)`: store or overwrite a document. - `CheckContains(filename, predicate) -> bool`: return whether the document satisfies a boolean keyword expression. Rules for `predicate`: - A term is a word such as `a`, `b`, or `error`. - `term` is true if that word appears anywhere in the document. - Operators supported: `&&` and `||`. - `&&` has higher precedence than `||`. - You may also support parentheses for grouping. Examples: - `a || b || c` is true if the document contains at least one of `a`, `b`, or `c`. - `a && b || c` is true if the document contains both `a` and `b`, or contains `c`. Follow-up 1: - Implement `GetAllFiles(predicate) -> List[filename]`, which returns all documents matching the predicate. Follow-up 2: - Explain how you would scale this service in a distributed system, including a reasonable sharding or partitioning strategy and a replication policy. Assume you are starting from a blank editor or whiteboard and should define any necessary data structures and test cases.

Quick Answer: This question evaluates the ability to design a simple document storage and boolean keyword predicate evaluation, testing parsing of logical expressions with operator precedence and the selection of data structures for efficient term lookup.

Part 1: Implement InsertDoc and CheckContains for Boolean Document Predicates

Design a tiny in-memory document service. You are given a sequence of operations. An ('insert', filename, content) operation stores or overwrites a document. A ('check', filename, predicate) operation returns whether the named document satisfies a boolean predicate. Content is a space-separated list of words and a term is true iff that exact word appears in the document. Predicates contain terms, '&&', '||', and optional parentheses. '&&' has higher precedence than '||'. Return the results of all check operations in order. If a file does not exist, treat it as an empty document.

Constraints

1 <= len(operations) <= 10^4
Filenames are non-empty strings.
Document content may be empty; otherwise it is whitespace-separated words.
Predicates are valid and contain words, '&&', '||', parentheses, and optional spaces.
Word matching is exact on whitespace-separated tokens.

Examples

Input: ([('insert', 'doc1', 'a b'), ('check', 'doc1', 'a && b || c'), ('check', 'doc1', 'a && (b || c)'), ('check', 'doc1', 'c || d')],)

Expected Output: [True, True, False]

Explanation: '&&' binds tighter than '||', and parentheses must also be respected.

Input: ([('insert', 'doc1', 'a'), ('check', 'doc1', 'a && b'), ('insert', 'doc1', 'b c'), ('check', 'doc1', 'a || b'), ('check', 'missing', 'a')],)

Expected Output: [False, True, False]

Explanation: Insert overwrites the previous content. A missing file behaves like an empty document.

Input: ([('insert', 'doc2', ''), ('check', 'doc2', 'a||b'), ('insert', 'doc2', 'error'), ('check', 'doc2', '(a&&b)||error')],)

Expected Output: [False, True]

Explanation: This includes an empty-content edge case and a predicate without spaces.

Input: ([('insert', 'doc3', 'alpha beta gamma'), ('check', 'doc3', 'beta'), ('check', 'doc3', 'delta || gamma && alpha')],)

Expected Output: [True, True]

Explanation: A single term should work, and precedence makes 'gamma && alpha' evaluate before the final '||'.

Hints

Store each document as a set of words so checking a single term is O(1).
A recursive-descent parser is a clean way to enforce '&&' precedence over '||' and handle parentheses.

Part 2: Implement GetAllFiles(predicate) to Return Matching Filenames

Extend the document service with a query that returns every filename whose current content matches a boolean predicate. You are given operations of two kinds: ('insert', filename, content) to store or overwrite a document, and ('get', predicate) to return all matching filenames. Content is a space-separated list of words and a term is true iff that exact word appears in a document. Predicates contain terms, '&&', '||', and optional parentheses, with '&&' having higher precedence than '||'. Return the results of all get operations. Each result must be sorted lexicographically.

Constraints

1 <= len(operations) <= 10^4
Filenames are strings and inserts overwrite older content for the same filename.
Document content may be empty; otherwise it is whitespace-separated words.
Predicates are valid and contain words, '&&', '||', parentheses, and optional spaces.
Returned filenames for each query must be in ascending lexicographic order.

Examples

Input: ([('insert', 'a.txt', 'red blue'), ('insert', 'b.txt', 'blue green'), ('insert', 'c.txt', 'green'), ('get', 'red || green'), ('get', 'red && blue')],)

Expected Output: [['a.txt', 'b.txt', 'c.txt'], ['a.txt']]

Explanation: The first query matches any file containing red or green. The second matches only files containing both red and blue.

Input: ([('insert', 'x', 'a'), ('insert', 'y', 'b'), ('get', 'a || b'), ('insert', 'x', 'c'), ('get', 'a || b'), ('get', 'c && a')],)

Expected Output: [['x', 'y'], ['y'], []]

Explanation: Overwriting x removes it from the postings list for 'a' and adds it to the postings list for 'c'.

Input: ([('get', 'a'), ('insert', 'z', ''), ('get', 'a || b')],)

Expected Output: [[], []]

Explanation: An empty store and an empty document are both valid edge cases.

Input: ([('insert', 'd1', 'error warning'), ('insert', 'd2', 'warning'), ('insert', 'd3', 'error'), ('get', '(error&&warning)||missing'), ('get', 'error||warning')],)

Expected Output: [['d1'], ['d1', 'd2', 'd3']]

Explanation: Parentheses and predicates without spaces should still parse correctly.

Hints

An inverted index from word -> set of filenames lets a single term evaluate directly to matching documents.
Once terms become sets, '&&' is set intersection and '||' is set union.

Part 3: Simulate a Sharding and Replication Strategy for the Document Service

Model one reasonable distributed design for the document service. Given an ordered list of storage nodes, a number of logical shards, a replication factor, and a list of filenames, assign each file to a shard and then to replica nodes. Use this strategy: stable_hash(filename) = sum(ord(c) for c in filename); shard_id = stable_hash(filename) % num_shards; primary node = nodes[shard_id % len(nodes)]; additional replicas go to the next distinct nodes in circular order until you have replication_factor copies or run out of nodes. Also given a list of failed nodes and a list of filenames to read, route each read to the first non-failed replica for that file, or None if every replica is down or the file was never placed. Return both the placement plan and the chosen read routes.

Constraints

1 <= len(nodes) <= 10^4
1 <= num_shards <= 10^9
1 <= replication_factor
Filenames and node names are strings.
If replication_factor > len(nodes), use each node at most once.

Examples

Input: (['n1', 'n2', 'n3'], 4, 2, ['a', 'b', 'c'], ['n2'], ['a', 'b', 'c'])

Expected Output: {'placements': {'a': {'shard': 1, 'replicas': ['n2', 'n3']}, 'b': {'shard': 2, 'replicas': ['n3', 'n1']}, 'c': {'shard': 3, 'replicas': ['n1', 'n2']}}, 'routes': {'a': 'n3', 'b': 'n3', 'c': 'n1'}}

Explanation: File 'a' falls on shard 1 whose primary is n2, but n2 is failed so reads fall back to n3.

Input: (['x', 'y'], 3, 5, ['doc'], ['x', 'y'], ['doc', 'missing'])

Expected Output: {'placements': {'doc': {'shard': 1, 'replicas': ['y', 'x']}}, 'routes': {'doc': None, 'missing': None}}

Explanation: Replication factor exceeds node count, so only two distinct replicas are used. Both are down.

Input: (['solo'], 10, 3, ['aa', 'bb'], [], ['aa', 'bb'])

Expected Output: {'placements': {'aa': {'shard': 4, 'replicas': ['solo']}, 'bb': {'shard': 6, 'replicas': ['solo']}}, 'routes': {'aa': 'solo', 'bb': 'solo'}}

Explanation: With one node, every file maps to that node regardless of requested replication factor.

Input: (['nA', 'nB', 'nC', 'nD'], 5, 3, ['d'], ['nA'], ['d'])

Expected Output: {'placements': {'d': {'shard': 0, 'replicas': ['nA', 'nB', 'nC']}}, 'routes': {'d': 'nB'}}

Explanation: The primary replica is down, so the read is routed to the next healthy replica.

Hints

First compute the logical shard for a file, then map that shard to a primary node and walk forward circularly for replicas.
Cap the replica count at the number of nodes so you never place duplicate replicas on the same node.

Implement Document Predicate APIs

Company: Snowflake

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

Part 1: Implement InsertDoc and CheckContains for Boolean Document Predicates

Constraints

1 <= len(operations) <= 10^4
Filenames are non-empty strings.
Document content may be empty; otherwise it is whitespace-separated words.
Predicates are valid and contain words, '&&', '||', parentheses, and optional spaces.
Word matching is exact on whitespace-separated tokens.

Examples

Input: ([('insert', 'doc1', 'a b'), ('check', 'doc1', 'a && b || c'), ('check', 'doc1', 'a && (b || c)'), ('check', 'doc1', 'c || d')],)

Expected Output: [True, True, False]

Explanation: '&&' binds tighter than '||', and parentheses must also be respected.

Input: ([('insert', 'doc1', 'a'), ('check', 'doc1', 'a && b'), ('insert', 'doc1', 'b c'), ('check', 'doc1', 'a || b'), ('check', 'missing', 'a')],)

Expected Output: [False, True, False]

Explanation: Insert overwrites the previous content. A missing file behaves like an empty document.

Input: ([('insert', 'doc2', ''), ('check', 'doc2', 'a||b'), ('insert', 'doc2', 'error'), ('check', 'doc2', '(a&&b)||error')],)

Expected Output: [False, True]

Explanation: This includes an empty-content edge case and a predicate without spaces.

Input: ([('insert', 'doc3', 'alpha beta gamma'), ('check', 'doc3', 'beta'), ('check', 'doc3', 'delta || gamma && alpha')],)

Expected Output: [True, True]

Explanation: A single term should work, and precedence makes 'gamma && alpha' evaluate before the final '||'.

Hints

Store each document as a set of words so checking a single term is O(1).
A recursive-descent parser is a clean way to enforce '&&' precedence over '||' and handle parentheses.

Part 2: Implement GetAllFiles(predicate) to Return Matching Filenames

Constraints

1 <= len(operations) <= 10^4
Filenames are strings and inserts overwrite older content for the same filename.
Document content may be empty; otherwise it is whitespace-separated words.
Predicates are valid and contain words, '&&', '||', parentheses, and optional spaces.
Returned filenames for each query must be in ascending lexicographic order.

Examples

Input: ([('insert', 'a.txt', 'red blue'), ('insert', 'b.txt', 'blue green'), ('insert', 'c.txt', 'green'), ('get', 'red || green'), ('get', 'red && blue')],)

Expected Output: [['a.txt', 'b.txt', 'c.txt'], ['a.txt']]

Explanation: The first query matches any file containing red or green. The second matches only files containing both red and blue.

Input: ([('insert', 'x', 'a'), ('insert', 'y', 'b'), ('get', 'a || b'), ('insert', 'x', 'c'), ('get', 'a || b'), ('get', 'c && a')],)

Expected Output: [['x', 'y'], ['y'], []]

Explanation: Overwriting x removes it from the postings list for 'a' and adds it to the postings list for 'c'.

Input: ([('get', 'a'), ('insert', 'z', ''), ('get', 'a || b')],)

Expected Output: [[], []]

Explanation: An empty store and an empty document are both valid edge cases.

Input: ([('insert', 'd1', 'error warning'), ('insert', 'd2', 'warning'), ('insert', 'd3', 'error'), ('get', '(error&&warning)||missing'), ('get', 'error||warning')],)

Expected Output: [['d1'], ['d1', 'd2', 'd3']]

Explanation: Parentheses and predicates without spaces should still parse correctly.

Hints

An inverted index from word -> set of filenames lets a single term evaluate directly to matching documents.
Once terms become sets, '&&' is set intersection and '||' is set union.

Part 3: Simulate a Sharding and Replication Strategy for the Document Service

Constraints

1 <= len(nodes) <= 10^4
1 <= num_shards <= 10^9
1 <= replication_factor
Filenames and node names are strings.
If replication_factor > len(nodes), use each node at most once.

Examples

Input: (['n1', 'n2', 'n3'], 4, 2, ['a', 'b', 'c'], ['n2'], ['a', 'b', 'c'])

Explanation: File 'a' falls on shard 1 whose primary is n2, but n2 is failed so reads fall back to n3.

Input: (['x', 'y'], 3, 5, ['doc'], ['x', 'y'], ['doc', 'missing'])

Expected Output: {'placements': {'doc': {'shard': 1, 'replicas': ['y', 'x']}}, 'routes': {'doc': None, 'missing': None}}

Explanation: Replication factor exceeds node count, so only two distinct replicas are used. Both are down.

Input: (['solo'], 10, 3, ['aa', 'bb'], [], ['aa', 'bb'])

Expected Output: {'placements': {'aa': {'shard': 4, 'replicas': ['solo']}, 'bb': {'shard': 6, 'replicas': ['solo']}}, 'routes': {'aa': 'solo', 'bb': 'solo'}}

Explanation: With one node, every file maps to that node regardless of requested replication factor.

Input: (['nA', 'nB', 'nC', 'nD'], 5, 3, ['d'], ['nA'], ['d'])

Expected Output: {'placements': {'d': {'shard': 0, 'replicas': ['nA', 'nB', 'nC']}}, 'routes': {'d': 'nB'}}

Explanation: The primary replica is down, so the read is routed to the next healthy replica.

Hints

First compute the logical shard for a file, then map that shard to a primary node and walk forward circularly for replicas.
Cap the replica count at the number of nodes so you never place duplicate replicas on the same node.

Quick Overview

Quick Overview