PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates string-processing and algorithmic competencies, specifically exact substring matching, interval merging for overlapping or adjacent matches, and robust text transformation to produce correctly tagged output.

  • medium
  • Harvey
  • Coding & Algorithms
  • Software Engineer

Highlight Exact Source Matches

Company: Harvey

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

You are given a string `text` and a list of strings `sources`. For every exact occurrence of any string in `sources` within `text`, wrap the matched characters with `<yellow>` and `</yellow>`. Rules: - Matching must be exact substring matching. - Matching is case-sensitive. - If multiple matched ranges overlap or are adjacent, merge them into one highlighted region and use only one pair of tags for the merged region. - Return the final tagged string. Example: - `text = "abcxyz123"` - `sources = ["abc", "xyz", "z12"]` - Matched ranges are `[0,3)`, `[3,6)`, and `[5,8)` - After merging, the highlighted range is `[0,8)` - Output: `"<yellow>abcxyz12</yellow>3"` Follow-up: 1. Count how many exact occurrences each source string has in the original `text`. 2. For each merged highlighted region, return the list of source strings that contributed at least one match inside that region.

Quick Answer: This question evaluates string-processing and algorithmic competencies, specifically exact substring matching, interval merging for overlapping or adjacent matches, and robust text transformation to produce correctly tagged output.

Part 1: Wrap Exact Source Matches in <yellow> Tags

Given a string text and a list of strings sources, highlight every exact occurrence of any source inside text by surrounding the matched characters with <yellow> and </yellow>. Matching is case-sensitive. If two matched ranges overlap or are directly adjacent, they must be merged into a single highlighted region and use only one pair of tags. Return the final tagged string. If an empty source string appears, ignore it.

Constraints

  • 0 <= len(text) <= 2000
  • 0 <= len(sources) <= 100
  • 0 <= len(source) <= 50 for each source in sources
  • Matching is exact substring matching and is case-sensitive
  • Overlapping or adjacent matches must be merged into one highlighted region

Examples

Input: ("abcxyz123", ["abc", "xyz", "z12"])

Expected Output: "<yellow>abcxyz12</yellow>3"

Explanation: Matches are [0,3), [3,6), and [5,8). They merge into [0,8).

Input: ("foobarbaz", ["foo", "baz"])

Expected Output: "<yellow>foo</yellow>bar<yellow>baz</yellow>"

Explanation: There are two separate highlighted regions.

Input: ("aaab", ["aa", "aab"])

Expected Output: "<yellow>aaab</yellow>"

Explanation: Matches overlap and extend to cover the whole string.

Input: ("", ["abc"])

Expected Output: ""

Explanation: Empty text produces an empty result.

Input: ("hello", [])

Expected Output: "hello"

Explanation: With no sources, nothing is highlighted.

Hints

  1. Think about marking which character positions belong to at least one match.
  2. While scanning left to right, keep track of the farthest match end seen so far. This naturally merges overlaps and adjacency.

Part 2: Count Exact Occurrences of Each Source

Given a string text and a list of strings sources, return how many times each source appears in text. The result must be aligned with the original sources list, so result[i] is the number of exact occurrences of sources[i] in text. Matching is case-sensitive, and overlapping occurrences count separately. If a source string is empty, its count is 0.

Constraints

  • 0 <= len(text) <= 2000
  • 0 <= len(sources) <= 100
  • 0 <= len(source) <= 50 for each source in sources
  • Matching is exact substring matching and is case-sensitive
  • Overlapping matches count as separate occurrences

Examples

Input: ("abcxyz123", ["abc", "xyz", "z12", "no"])

Expected Output: [1, 1, 1, 0]

Explanation: The first three sources appear once each, and no does not appear.

Input: ("aaaaa", ["aa", "aaa", "b"])

Expected Output: [4, 3, 0]

Explanation: Overlapping matches count: aa appears at 0,1,2,3 and aaa appears at 0,1,2.

Input: ("aAaA", ["aA", "Aa", "aa"])

Expected Output: [2, 1, 0]

Explanation: Matching is case-sensitive.

Input: ("", ["a", "aa"])

Expected Output: [0, 0]

Explanation: An empty text contains no non-empty substrings.

Input: ("hello", [])

Expected Output: []

Explanation: No sources means no counts to return.

Hints

  1. For each source, check every possible starting index in text.
  2. Do not skip forward by the full source length after a match if overlaps should count.

Part 3: Contributors for Each Merged Highlight Region

Given a string text and a list of strings sources, find every exact occurrence of any source in text. Treat each match as a half-open interval [start, end). Merge intervals that overlap or are adjacent. For each merged interval, return a tuple (start, end, contributors), where contributors is the list of distinct source strings that had at least one match inside that merged interval. Contributors must appear in the same order as their first appearance in sources. Matching is case-sensitive. If a source string is empty, ignore it.

Constraints

  • 0 <= len(text) <= 2000
  • 0 <= len(sources) <= 100
  • 0 <= len(source) <= 50 for each source in sources
  • Matching is exact substring matching and is case-sensitive
  • Overlapping or adjacent intervals must be merged
  • The contributors list for a region should contain each distinct source string at most once

Examples

Input: ("abcxyz123", ["abc", "xyz", "z12"])

Expected Output: [(0, 8, ["abc", "xyz", "z12"])]

Explanation: All three matches connect through overlap or adjacency, so they form one merged region.

Input: ("aaXXbbYYaa", ["aa", "bb", "YY"])

Expected Output: [(0, 2, ["aa"]), (4, 10, ["aa", "bb", "YY"])]

Explanation: The first aa is separate. The later bb, YY, and aa are adjacent and merge into one region. Contributors follow the input source order.

Input: ("mississippi", ["iss", "ssi", "ppi"])

Expected Output: [(1, 11, ["iss", "ssi", "ppi"])]

Explanation: A chain of overlaps and adjacency merges everything from index 1 through 10.

Input: ("abcabc", ["abc", "abc", "bc"])

Expected Output: [(0, 6, ["abc", "bc"])]

Explanation: Duplicate source strings count as one contributor label in the final region.

Input: ("hello", ["x", "y"])

Expected Output: []

Explanation: No matches means there are no merged highlight regions.

Hints

  1. Store the source string together with every interval you find.
  2. After sorting intervals by start index, merge them while maintaining a set of which sources contributed to the current region.
Last updated: Apr 22, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Implement Spreadsheet Cell Updates - Harvey (hard)
  • Evaluate Symbol Expressions - Harvey (medium)
  • Implement a Cursor-Based Text Editor - Harvey (medium)
  • Implement a Hierarchical File System - Harvey (medium)
  • Implement a Formula Spreadsheet - Harvey (medium)