How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Harvey.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Harvey during technical interviews.

Highlight Exact Source Matches | Harvey Coding Question

Highlight Exact Source Matches

Company: Harvey

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

You are given a string `text` and a list of strings `sources`. For every exact occurrence of any string in `sources` within `text`, wrap the matched characters with `<yellow>` and `</yellow>`. Rules: - Matching must be exact substring matching. - Matching is case-sensitive. - If multiple matched ranges overlap or are adjacent, merge them into one highlighted region and use only one pair of tags for the merged region. - Return the final tagged string. Example: - `text = "abcxyz123"` - `sources = ["abc", "xyz", "z12"]` - Matched ranges are `[0,3)`, `[3,6)`, and `[5,8)` - After merging, the highlighted range is `[0,8)` - Output: `"<yellow>abcxyz12</yellow>3"` Follow-up: 1. Count how many exact occurrences each source string has in the original `text`. 2. For each merged highlighted region, return the list of source strings that contributed at least one match inside that region.

Quick Answer: This question evaluates string-processing and algorithmic competencies, specifically exact substring matching, interval merging for overlapping or adjacent matches, and robust text transformation to produce correctly tagged output.

Part 1: Wrap Exact Source Matches in <yellow> Tags

Given a string text and a list of strings sources, highlight every exact occurrence of any source inside text by surrounding the matched characters with <yellow> and </yellow>. Matching is case-sensitive. If two matched ranges overlap or are directly adjacent, they must be merged into a single highlighted region and use only one pair of tags. Return the final tagged string. If an empty source string appears, ignore it.

Constraints

0 <= len(text) <= 2000
0 <= len(sources) <= 100
0 <= len(source) <= 50 for each source in sources
Matching is exact substring matching and is case-sensitive
Overlapping or adjacent matches must be merged into one highlighted region

Examples

Input: ("abcxyz123", ["abc", "xyz", "z12"])

Expected Output: "<yellow>abcxyz12</yellow>3"

Explanation: Matches are [0,3), [3,6), and [5,8). They merge into [0,8).

Input: ("foobarbaz", ["foo", "baz"])

Expected Output: "<yellow>foo</yellow>bar<yellow>baz</yellow>"

Explanation: There are two separate highlighted regions.

Input: ("aaab", ["aa", "aab"])

Expected Output: "<yellow>aaab</yellow>"

Explanation: Matches overlap and extend to cover the whole string.

Input: ("", ["abc"])

Expected Output: ""

Explanation: Empty text produces an empty result.

Input: ("hello", [])

Expected Output: "hello"

Explanation: With no sources, nothing is highlighted.

Hints

Think about marking which character positions belong to at least one match.
While scanning left to right, keep track of the farthest match end seen so far. This naturally merges overlaps and adjacency.

Part 2: Count Exact Occurrences of Each Source

Given a string text and a list of strings sources, return how many times each source appears in text. The result must be aligned with the original sources list, so result[i] is the number of exact occurrences of sources[i] in text. Matching is case-sensitive, and overlapping occurrences count separately. If a source string is empty, its count is 0.

Constraints

0 <= len(text) <= 2000
0 <= len(sources) <= 100
0 <= len(source) <= 50 for each source in sources
Matching is exact substring matching and is case-sensitive
Overlapping matches count as separate occurrences

Examples

Input: ("abcxyz123", ["abc", "xyz", "z12", "no"])

Expected Output: [1, 1, 1, 0]

Explanation: The first three sources appear once each, and no does not appear.

Input: ("aaaaa", ["aa", "aaa", "b"])

Expected Output: [4, 3, 0]

Explanation: Overlapping matches count: aa appears at 0,1,2,3 and aaa appears at 0,1,2.

Input: ("aAaA", ["aA", "Aa", "aa"])

Expected Output: [2, 1, 0]

Explanation: Matching is case-sensitive.

Input: ("", ["a", "aa"])

Expected Output: [0, 0]

Explanation: An empty text contains no non-empty substrings.

Input: ("hello", [])

Expected Output: []

Explanation: No sources means no counts to return.

Hints

For each source, check every possible starting index in text.
Do not skip forward by the full source length after a match if overlaps should count.

Part 3: Contributors for Each Merged Highlight Region

Given a string text and a list of strings sources, find every exact occurrence of any source in text. Treat each match as a half-open interval [start, end). Merge intervals that overlap or are adjacent. For each merged interval, return a tuple (start, end, contributors), where contributors is the list of distinct source strings that had at least one match inside that merged interval. Contributors must appear in the same order as their first appearance in sources. Matching is case-sensitive. If a source string is empty, ignore it.

Constraints

0 <= len(text) <= 2000
0 <= len(sources) <= 100
0 <= len(source) <= 50 for each source in sources
Matching is exact substring matching and is case-sensitive
Overlapping or adjacent intervals must be merged
The contributors list for a region should contain each distinct source string at most once

Examples

Input: ("abcxyz123", ["abc", "xyz", "z12"])

Expected Output: [(0, 8, ["abc", "xyz", "z12"])]

Explanation: All three matches connect through overlap or adjacency, so they form one merged region.

Input: ("aaXXbbYYaa", ["aa", "bb", "YY"])

Expected Output: [(0, 2, ["aa"]), (4, 10, ["aa", "bb", "YY"])]

Explanation: The first aa is separate. The later bb, YY, and aa are adjacent and merge into one region. Contributors follow the input source order.

Input: ("mississippi", ["iss", "ssi", "ppi"])

Expected Output: [(1, 11, ["iss", "ssi", "ppi"])]

Explanation: A chain of overlaps and adjacency merges everything from index 1 through 10.

Input: ("abcabc", ["abc", "abc", "bc"])

Expected Output: [(0, 6, ["abc", "bc"])]

Explanation: Duplicate source strings count as one contributor label in the final region.

Input: ("hello", ["x", "y"])

Expected Output: []

Explanation: No matches means there are no merged highlight regions.

Hints

Store the source string together with every interval you find.
After sorting intervals by start index, merge them while maintaining a set of which sources contributed to the current region.

Quick Overview

This question evaluates string-processing and algorithmic competencies, specifically exact substring matching, interval merging for overlapping or adjacent matches, and robust text transformation to produce correctly tagged output.