How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a Medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at DoorDash.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at DoorDash during technical interviews.

Implement string matching with follow-up | DoorDash Coding Question

Quick Overview

A DoorDash software-engineer coding screen on string matching: implement firstIndex(T, P) in O(|T| + |P|) time and O(|P|) space using KMP, then handle follow-ups on naive vs KMP vs Rabin–Karp, Unicode and very long inputs, edge cases, wildcard ('?'/'*') matching, and simultaneous multi-pattern search via Aho–Corasick. Includes corrected, complete solutions and complexity trade-offs.

Implement string matching with follow-up

Company: DoorDash

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

##### Question Given a text `T` and a pattern `P`, implement a function `firstIndex(T, P)` that returns the starting index of the first occurrence of `P` in `T`, or `-1` if `P` does not occur. Aim for `O(|T| + |P|)` time and `O(|P|)` extra space. Be ready to extend the solution through the following follow-ups: 1. **Algorithm comparison.** Compare the naive (brute-force) approach to KMP and Rabin–Karp. Explain the time and space complexity of each and when you would choose one over another. 2. **Robustness.** How would you handle Unicode text (multi-byte / combining characters) and very long inputs? Discuss correctness and memory considerations. 3. **Edge cases.** Walk through the important edge cases: empty pattern, empty text, pattern longer than text, and highly repetitive patterns (e.g. `P = "aaaa"` in `T = "aaaaaa"`). 4. **Wildcards.** Extend the matcher to support two wildcard characters in the pattern: `?` matches any single character, and `*` matches any (possibly empty) sequence of characters. Describe the matching semantics and complexity. 5. **Multi-pattern search.** Extend the solution to search for `k` patterns `P1..Pk` in `T` simultaneously and return all starting indices for each pattern. Outline the data structure you would use and the runtime / space trade-offs versus running a single-pattern matcher `k` times.

Quick Answer: A DoorDash software-engineer coding screen on string matching: implement firstIndex(T, P) in O(|T| + |P|) time and O(|P|) space using KMP, then handle follow-ups on naive vs KMP vs Rabin–Karp, Unicode and very long inputs, edge cases, wildcard ('?'/'*') matching, and simultaneous multi-pattern search via Aho–Corasick. Includes corrected, complete solutions and complexity trade-offs.

First Occurrence of a Pattern (KMP)

Given a text `t` and a pattern `p`, implement `solution(t, p)` that returns the starting index of the first occurrence of `p` in `t`, or `-1` if `p` does not occur. Aim for `O(|t| + |p|)` time and `O(|p|)` extra space. Conventions: - An empty pattern matches at index `0`. - If `p` is longer than `t`, return `-1`. - Overlapping is irrelevant for the *first* occurrence; just return the earliest start index. Use the Knuth–Morris–Pratt failure (LPS) table so the text pointer never backtracks — this keeps the search linear even on highly repetitive inputs like `p = "aaaa"` inside `t = "aaaaaa"`, where a naive scan degrades to `O(|t|·|p|)`.

Constraints

0 <= |t|, |p|
Characters are compared by code point (decode to code points for Unicode-correct indexing).
An empty pattern returns 0.
If |p| > |t|, return -1.

Examples

Input: ("abxabcabcaby", "abcaby")

Expected Output: 6

Explanation: The pattern 'abcaby' first appears starting at index 6.

Input: ("hello world", "world")

Expected Output: 6

Explanation: 'world' begins at index 6.

Input: ("aaaaaa", "aaaa")

Expected Output: 0

Explanation: Highly repetitive case; KMP stays linear and reports the earliest match at index 0.

Input: ("abcabc", "abcd")

Expected Output: -1

Explanation: No occurrence of 'abcd'.

Input: ("", "a")

Expected Output: -1

Explanation: Empty text cannot contain a non-empty pattern.

Input: ("abc", "")

Expected Output: 0

Explanation: Empty pattern matches at index 0 by convention.

Input: ("a", "abc")

Expected Output: -1

Explanation: Pattern longer than text -> -1.

Input: ("mississippi", "issip")

Expected Output: 4

Explanation: 'issip' starts at index 4.

Input: ("aaab", "aaab")

Expected Output: 0

Explanation: Pattern equals the whole text; match at index 0.

Hints

A naive scan is O(|t|·|p|) worst case and blows up on repetitive inputs like 'aaaa' in 'aaaaaa'. You need the text pointer to never move backward.
Precompute the LPS (longest proper prefix that is also a suffix) array of the pattern in O(|p|). On a mismatch at pattern position k, fall back to lps[k-1] instead of restarting.
Handle the conventions first: empty pattern -> 0, and |p| > |t| -> -1, before building the table.

Wildcard Pattern Matching (? and *)

Follow-up 4. Extend the matcher to support two wildcard characters in the pattern `p`: - `?` matches any single character. - `*` matches any (possibly empty) sequence of characters. Implement `solution(t, p)` returning a boolean: does the pattern match the **entire** text `t`? This is glob-style (anchored) matching, not substring search. A single linear scan is insufficient because `*` can match an arbitrary span — use dynamic programming. `dp[i][j]` = does `p[0..j)` match `t[0..i)`.

Constraints

0 <= |t|, |p|
p may contain '?' and '*' wildcards; all other characters are literal.
Matching is anchored: the whole pattern must match the whole text.
A run of leading '*' can match the empty prefix.

Examples

Input: ("abcde", "a?c*")

Expected Output: True

Explanation: 'a' literal, '?'=b, 'c' literal, '*'=de.

Input: ("abcde", "a*e")

Expected Output: True

Explanation: '*' absorbs 'bcd' between the anchors 'a' and 'e'.

Input: ("abcde", "a*f")

Expected Output: False

Explanation: Text does not end with 'f'.

Input: ("", "*")

Expected Output: True

Explanation: '*' matches the empty string.

Input: ("", "")

Expected Output: True

Explanation: Empty matches empty.

Input: ("a", "")

Expected Output: False

Explanation: Empty pattern cannot cover a non-empty text under anchored matching.

Input: ("abc", "???")

Expected Output: True

Explanation: Three '?' match three characters exactly.

Input: ("abc", "??")

Expected Output: False

Explanation: Two '?' cannot cover three characters.

Input: ("aaaa", "*a*")

Expected Output: True

Explanation: Stars on both sides plus a required 'a' in between.

Input: ("xyz", "*")

Expected Output: True

Explanation: A single '*' matches everything.

Input: ("mississippi", "m*ss*p?")

Expected Output: True

Explanation: 'm', '*'->'i', 'ss', '*'->'issi', 'p', '?'->'i'... the stars absorb the spans so the whole string matches.

Hints

'*' makes a single linear scan insufficient — model it with DP: dp[i][j] = does p[0..j) match t[0..i).
Base row: dp[0][j] is True only while every pattern char so far is '*' (each '*' can match the empty string).
For '*': dp[i][j] = dp[i-1][j] (star consumes t[i-1]) OR dp[i][j-1] (star matches empty). For '?' or a literal match: dp[i][j] = dp[i-1][j-1].

Multi-Pattern Search (Aho–Corasick)

Follow-up 5. Search for `k` patterns simultaneously in a single pass over the text. Implement `solution(text, patterns)` where `patterns` is a list of strings. Return a list of lists `result` such that `result[i]` is the list of all start indices (in ascending order, including overlapping occurrences) at which `patterns[i]` occurs in `text`. Rules: - The output order must align with the input pattern order. Duplicate patterns each get their own (identical) result list. - An empty pattern contributes an empty list. - Running a single-pattern matcher k times costs O(k·|text| + Σ|Pi|). Build an Aho–Corasick automaton (a trie with KMP-style failure links + output links) to scan the text once: O(Σ|Pi| + |text| + z) where z is the number of reported matches, independent of k for the scan.

Constraints

0 <= number of patterns k
Patterns may repeat; each repeated pattern gets its own result entry.
Overlapping occurrences are all reported.
An empty pattern contributes an empty list.

Examples

Input: ("ahishers", ["he", "she", "his", "hers"])

Expected Output: [[4], [3], [1], [4]]

Explanation: Classic Aho–Corasick example: 'he'@4, 'she'@3, 'his'@1, 'hers'@4 (overlapping 'he'/'hers' share the same scan).

Input: ("aaaa", ["a", "aa"])

Expected Output: [[0, 1, 2, 3], [0, 1, 2]]

Explanation: Overlapping matches: 'a' at every index, 'aa' at 0,1,2.

Input: ("abcabc", ["abc", "bc", "d"])

Expected Output: [[0, 3], [1, 4], []]

Explanation: 'abc'@0,3; 'bc'@1,4; 'd' never occurs.

Input: ("xyz", ["a", "b"])

Expected Output: [[], []]

Explanation: No pattern occurs.

Input: ("", ["a"])

Expected Output: [[]]

Explanation: Empty text yields no matches.

Input: ("aaa", ["a", "a"])

Expected Output: [[0, 1, 2], [0, 1, 2]]

Explanation: Duplicate patterns each get their own identical result list.

Input: ("mississippi", ["issi", "ssip", "ppi"])

Expected Output: [[1, 4], [5], [8]]

Explanation: 'issi'@1,4 (overlapping); 'ssip'@5; 'ppi'@8.

Hints

Running KMP k separate times is O(k·|text| + Σ|Pi|). To scan the text once, generalize KMP's failure function to a trie over all patterns.
Build a trie of all patterns, then add failure links via BFS (the longest proper suffix of the current node that is also a prefix in the trie), plus output links so one node can report multiple patterns ending there.
Merge each node's output list with its failure node's output list during the BFS so a single position emits every pattern that ends there, including shorter suffixes.

Quick Overview

Implement string matching with follow-up

Company: DoorDash

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

First Occurrence of a Pattern (KMP)

Constraints

0 <= |t|, |p|
Characters are compared by code point (decode to code points for Unicode-correct indexing).
An empty pattern returns 0.
If |p| > |t|, return -1.

Examples

Input: ("abxabcabcaby", "abcaby")

Expected Output: 6

Explanation: The pattern 'abcaby' first appears starting at index 6.

Input: ("hello world", "world")

Expected Output: 6

Explanation: 'world' begins at index 6.

Input: ("aaaaaa", "aaaa")

Expected Output: 0

Explanation: Highly repetitive case; KMP stays linear and reports the earliest match at index 0.

Input: ("abcabc", "abcd")

Expected Output: -1

Explanation: No occurrence of 'abcd'.

Input: ("", "a")

Expected Output: -1

Explanation: Empty text cannot contain a non-empty pattern.

Input: ("abc", "")

Expected Output: 0

Explanation: Empty pattern matches at index 0 by convention.

Input: ("a", "abc")

Expected Output: -1

Explanation: Pattern longer than text -> -1.

Input: ("mississippi", "issip")

Expected Output: 4

Explanation: 'issip' starts at index 4.

Input: ("aaab", "aaab")

Expected Output: 0

Explanation: Pattern equals the whole text; match at index 0.

Hints

A naive scan is O(|t|·|p|) worst case and blows up on repetitive inputs like 'aaaa' in 'aaaaaa'. You need the text pointer to never move backward.
Precompute the LPS (longest proper prefix that is also a suffix) array of the pattern in O(|p|). On a mismatch at pattern position k, fall back to lps[k-1] instead of restarting.
Handle the conventions first: empty pattern -> 0, and |p| > |t| -> -1, before building the table.

Wildcard Pattern Matching (? and *)

Constraints

0 <= |t|, |p|
p may contain '?' and '*' wildcards; all other characters are literal.
Matching is anchored: the whole pattern must match the whole text.
A run of leading '*' can match the empty prefix.

Examples

Input: ("abcde", "a?c*")

Expected Output: True

Explanation: 'a' literal, '?'=b, 'c' literal, '*'=de.

Input: ("abcde", "a*e")

Expected Output: True

Explanation: '*' absorbs 'bcd' between the anchors 'a' and 'e'.

Input: ("abcde", "a*f")

Expected Output: False

Explanation: Text does not end with 'f'.

Input: ("", "*")

Expected Output: True

Explanation: '*' matches the empty string.

Input: ("", "")

Expected Output: True

Explanation: Empty matches empty.

Input: ("a", "")

Expected Output: False

Explanation: Empty pattern cannot cover a non-empty text under anchored matching.

Input: ("abc", "???")

Expected Output: True

Explanation: Three '?' match three characters exactly.

Input: ("abc", "??")

Expected Output: False

Explanation: Two '?' cannot cover three characters.

Input: ("aaaa", "*a*")

Expected Output: True

Explanation: Stars on both sides plus a required 'a' in between.

Input: ("xyz", "*")

Expected Output: True

Explanation: A single '*' matches everything.

Input: ("mississippi", "m*ss*p?")

Expected Output: True

Explanation: 'm', '*'->'i', 'ss', '*'->'issi', 'p', '?'->'i'... the stars absorb the spans so the whole string matches.

Hints

'*' makes a single linear scan insufficient — model it with DP: dp[i][j] = does p[0..j) match t[0..i).
Base row: dp[0][j] is True only while every pattern char so far is '*' (each '*' can match the empty string).
For '*': dp[i][j] = dp[i-1][j] (star consumes t[i-1]) OR dp[i][j-1] (star matches empty). For '?' or a literal match: dp[i][j] = dp[i-1][j-1].

Multi-Pattern Search (Aho–Corasick)

Constraints

0 <= number of patterns k
Patterns may repeat; each repeated pattern gets its own result entry.
Overlapping occurrences are all reported.
An empty pattern contributes an empty list.

Examples

Input: ("ahishers", ["he", "she", "his", "hers"])

Expected Output: [[4], [3], [1], [4]]

Explanation: Classic Aho–Corasick example: 'he'@4, 'she'@3, 'his'@1, 'hers'@4 (overlapping 'he'/'hers' share the same scan).

Input: ("aaaa", ["a", "aa"])

Expected Output: [[0, 1, 2, 3], [0, 1, 2]]

Explanation: Overlapping matches: 'a' at every index, 'aa' at 0,1,2.

Input: ("abcabc", ["abc", "bc", "d"])

Expected Output: [[0, 3], [1, 4], []]

Explanation: 'abc'@0,3; 'bc'@1,4; 'd' never occurs.

Input: ("xyz", ["a", "b"])

Expected Output: [[], []]

Explanation: No pattern occurs.

Input: ("", ["a"])

Expected Output: [[]]

Explanation: Empty text yields no matches.

Input: ("aaa", ["a", "a"])

Expected Output: [[0, 1, 2], [0, 1, 2]]

Explanation: Duplicate patterns each get their own identical result list.

Input: ("mississippi", ["issi", "ssip", "ppi"])

Expected Output: [[1, 4], [5], [8]]

Explanation: 'issi'@1,4 (overlapping); 'ssip'@5; 'ppi'@8.

Hints

Running KMP k separate times is O(k·|text| + Σ|Pi|). To scan the text once, generalize KMP's failure function to a trie over all patterns.
Build a trie of all patterns, then add failure links via BFS (the longest proper suffix of the current node that is also a prefix in the trie), plus output links so one node can report multiple patterns ending there.
Merge each node's output list with its failure node's output list during the BFS so a single position emits every pattern that ends there, including shorter suffixes.