PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates proficiency in string processing, character-frequency analysis, and efficient substring search techniques, measuring competency in algorithmic reasoning and implementation for pattern matching.

  • medium
  • Databricks
  • Coding & Algorithms
  • Backend Engineer

Find First Anagram Occurrence

Company: Databricks

Role: Backend Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

Given two strings `text` and `pattern`, return the starting index of the first substring in `text` that is an anagram of `pattern`. If no such substring exists, return `-1`. A substring is an anagram of `pattern` if it has exactly the same character frequencies as `pattern`, possibly in a different order. Assume the input strings contain lowercase English letters unless otherwise specified. Example: ```text text = "cbaebabacd" pattern = "abc" ``` The substrings of length `3` are checked from left to right. The first substring that is an anagram of `"abc"` is `"cba"`, so the answer is `0`. Follow-up: Explain how your approach would scale if `text` were extremely large, streamed in chunks, or distributed across multiple machines.

Quick Answer: This question evaluates proficiency in string processing, character-frequency analysis, and efficient substring search techniques, measuring competency in algorithmic reasoning and implementation for pattern matching.

Given two strings `text` and `pattern`, return the starting index of the first substring in `text` that is an anagram of `pattern`. If no such substring exists, return `-1`. A substring is an anagram of `pattern` if it has exactly the same character frequencies as `pattern`, possibly in a different order. Assume the input strings contain lowercase English letters unless otherwise specified. Example: ``` text = "cbaebabacd" pattern = "abc" ``` The substrings of length `3` are checked from left to right. The first substring that is an anagram of `"abc"` is `"cba"`, so the answer is `0`. Use a fixed-size sliding window of length `len(pattern)`: maintain a character-frequency count for the current window and compare it against the frequency count of `pattern`. Slide one character at a time, adding the incoming character and removing the outgoing one in O(1). Follow-up: If `text` were extremely large, streamed in chunks, or distributed across machines, the sliding window only needs O(26) state per position, so you can process the stream incrementally; for a distributed split, each shard scans its slice plus an overlap region of `len(pattern) - 1` characters from the next shard so anagrams straddling a boundary are not missed.

Constraints

  • 0 <= len(text), len(pattern)
  • text and pattern consist of lowercase English letters ('a'-'z')
  • If pattern is empty, the answer is 0 (the empty string is an anagram of the empty pattern at index 0)
  • If len(pattern) > len(text), the answer is -1

Examples

Input: ("cbaebabacd", "abc")

Expected Output: 0

Explanation: The length-3 window "cba" at index 0 is an anagram of "abc", so 0 is returned immediately.

Input: ("abab", "ab")

Expected Output: 0

Explanation: The first window "ab" is already an anagram of "ab".

Input: ("af", "be")

Expected Output: -1

Explanation: No window has the same character frequencies as "be".

Input: ("", "a")

Expected Output: -1

Explanation: text is empty and shorter than pattern, so no anagram exists.

Input: ("a", "")

Expected Output: 0

Explanation: An empty pattern matches at index 0 by convention.

Input: ("hello", "oll")

Expected Output: 2

Explanation: Windows "hel", "ell" do not match; "llo" at index 2 is an anagram of "oll".

Input: ("baa", "aa")

Expected Output: 1

Explanation: Window "ba" at index 0 is not an anagram; "aa" at index 1 is.

Input: ("abc", "abcd")

Expected Output: -1

Explanation: pattern is longer than text, so no window of the required length exists.

Input: ("xyzabc", "cba")

Expected Output: 3

Explanation: The first matching window "abc" begins at index 3; earlier windows do not match "cba"'s frequencies.

Hints

  1. Two strings are anagrams iff they have identical character-frequency counts. Comparing 26-element count arrays is O(26).
  2. Only substrings of length exactly len(pattern) can be anagrams, so slide a fixed-size window across text.
  3. When the window moves right by one, add the entering character's count and subtract the leaving character's count instead of recomputing the whole window — that keeps each step O(1).
  4. Return as soon as the window's frequency array equals pattern's; if you reach the end without a match, return -1.
Last updated: Jun 26, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Choose the Best Travel Mode - Databricks (medium)
  • Implement an Alternating Tic-Tac-Toe Game - Databricks (hard)
  • Implement a Snapshot Set Iterator - Databricks (medium)
  • Find the Best Commute Mode - Databricks (medium)
  • Partition a Target String by Source Substrings - Databricks (medium)