PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates proficiency in fundamental data structure manipulation and string-processing algorithms, specifically preserving insertion order during list deduplication and performing iterative overlap-based word merging.

  • Medium
  • Yahoo
  • Coding & Algorithms
  • Data Scientist

Solve Data Structure Challenges with Python Algorithms

Company: Yahoo

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

##### Scenario You are given coding challenges in Python to manipulate simple data structures. ##### Question Remove duplicates from a list of integers while preserving the original order. 2. Given a list of lowercase words, iteratively merge them into one long word by overlapping the last character of the current word with the first character of the next word when they are the same. Example: ['abc','rgn','ctr'] ➜ 'abctrgn'. Handle edge cases such as no overlap or multiple possible overlaps. ##### Hints For Q1, collections.OrderedDict keeps insertion order; for Q2, simulate building the string and carefully update indices.

Quick Answer: This question evaluates proficiency in fundamental data structure manipulation and string-processing algorithms, specifically preserving insertion order during list deduplication and performing iterative overlap-based word merging.

You are given a list of non-empty lowercase words. Build a single string by consuming all words exactly once using this process: (1) Start with the first word in the list. (2) At each step, among the unused words, if there is any whose first character equals the current result's last character, choose the one that appears earliest in the original list and append it without duplicating the overlapping character (i.e., append word[1:]). (3) If no such word exists, append the earliest unused word in the original list as-is. (4) Continue until all words are used. Return the final string. If the list is empty, return an empty string.

Constraints

  • 0 <= n <= 200000 where n is the number of words
  • 1 <= len(words[i]) and words[i] consists only of lowercase letters 'a'..'z'
  • Sum of lengths of all words <= 200000
  • Only a 1-character overlap (last of current with first of next) is used
  • Tie-breaker: always pick the earliest valid unused word by original index

Examples

Input:

Expected Output: abctrgn

Input:

Expected Output: abcdef

Input:

Expected Output: axaxb

Input:

Expected Output: a

Input:

Expected Output: hello

Solution

from collections import deque

def merge_words(words):
    n = len(words)
    if n == 0:
        return ""

    # Map from starting character to queue of indices in original order
    char_to_indices = {chr(c): deque() for c in range(ord('a'), ord('z') + 1)}
    for i, w in enumerate(words):
        # words[i] are non-empty by constraints
        char_to_indices[w[0]].append(i)

    used = [False] * n
    res = words[0]
    used[0] = True
    used_count = 1

    # Pointer to earliest unused index for fallback
    p = 0
    while p < n and used[p]:
        p += 1

    last_char = res[-1]

    while used_count < n:
        dq = char_to_indices.get(last_char)
        chosen = None
        if dq is not None:
            # Discard used indices lazily
            while dq and used[dq[0]]:
                dq.popleft()
            if dq:
                chosen = dq.popleft()
                used[chosen] = True
                used_count += 1
                # Overlap by 1 character
                res += words[chosen][1:]
                last_char = words[chosen][-1]
                continue

        # No match: take earliest unused by index
        while p < n and used[p]:
            p += 1
        if p >= n:
            break
        chosen = p
        used[chosen] = True
        used_count += 1
        res += words[chosen]
        last_char = words[chosen][-1]

    return res
Explanation
We simulate the construction greedily while ensuring determinism via the earliest-index tie-breaker. For O(1) amortized selection of the next candidate, we maintain for each letter a queue (deque) of word indices that start with that letter in original order. At each step, we first try to pick the earliest unused index from the queue for the current last character; if none exists, we fall back to the earliest unused index overall (tracked with a moving pointer). We append only word[1:] when overlapping to avoid duplicating the shared character. Lazy removal of used indices from the deques keeps the total work linear.

Time complexity: O(n + total_length). Space complexity: O(n).

Hints

  1. Maintain, for each starting character, a queue of indices of words that start with that character.
  2. Track a pointer to the earliest unused index for the fallback case.
  3. When overlapping, append only word[1:] to avoid duplicating the shared character.
  4. Use a used[] array and lazily discard used indices from character queues.
Last updated: Mar 29, 2026

Related Coding Questions

  • Merge words by head-tail chaining - Yahoo (Medium)

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.