PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates a candidate's competency in organizing and querying collections, mapping relationships between entities, and reasoning about algorithmic time and space complexity within data structures and grouping operations.

  • Medium
  • Palantir
  • Coding & Algorithms
  • Software Engineer

Group employees by shared interests

Company: Palantir

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

Given N employees, each with an integer id and a set of interests (strings), write an algorithm to find all employees who share at least one common interest. Return either (a) a mapping from each interest to the sorted list of employee ids who have it, and/or (b) the list of all unique employee-id pairs (u, v) that share one or more interests. Analyze time and space complexity and provide code in your preferred language.

Quick Answer: This question evaluates a candidate's competency in organizing and querying collections, mapping relationships between entities, and reasoning about algorithmic time and space complexity within data structures and grouping operations.

Part 1: Map each interest to the sorted employee IDs

You are given a list of employees. Each employee is represented by an integer ID and a list of interest strings. Build an inverted index that maps every interest to the sorted list of employee IDs who have that interest. Treat each employee's interest list as a set, so repeated interests for the same employee should only count once.

Constraints

  • 0 <= N <= 100000, where N is the number of employees
  • Employee IDs are unique integers
  • The total number of interest strings across all employees is at most 200000
  • Each interest string has length between 1 and 50
  • Duplicate interests within the same employee should be treated as one interest

Examples

Input: [(101, ['music', 'reading']), (102, ['sports', 'music']), (103, ['reading']), (104, [])]

Expected Output: {'music': [101, 102], 'reading': [101, 103], 'sports': [102]}

Explanation: Employees 101 and 102 share 'music', employees 101 and 103 have 'reading', and only 102 has 'sports'. Employee 104 contributes nothing.

Input: []

Expected Output: {}

Explanation: No employees means no interests in the result.

Input: [(5, ['gaming', 'gaming', 'chess'])]

Expected Output: {'chess': [5], 'gaming': [5]}

Explanation: Repeated 'gaming' for the same employee counts once.

Input: [(7, ['art']), (3, ['art', 'music']), (5, ['music'])]

Expected Output: {'art': [3, 7], 'music': [3, 5]}

Explanation: Employee IDs must be sorted within each interest list.

Input: [(1, []), (2, [])]

Expected Output: {}

Explanation: Employees with no interests do not appear in any mapping.

Solution

def solution(employees):
    interest_to_ids = {}

    for emp_id, interests in employees:
        for interest in set(interests):
            if interest not in interest_to_ids:
                interest_to_ids[interest] = []
            interest_to_ids[interest].append(emp_id)

    result = {}
    for interest in sorted(interest_to_ids):
        result[interest] = sorted(interest_to_ids[interest])

    return result

Time complexity: O(T + sum(m_i log m_i) + U log U), where T is the total number of input interest strings, m_i is the number of employees for interest i, and U is the number of unique interests.. Space complexity: O(T), excluding the input; this stores the inverted index and output..

Hints

  1. Think about reversing the relationship: instead of employee -> interests, build interest -> employees.
  2. If an employee's list may contain duplicates, convert it to a set before adding the employee ID to each interest bucket.

Part 2: Return all unique employee-ID pairs that share an interest

You are given a list of employees. Each employee has an integer ID and a list of interest strings. Return all unique employee-ID pairs (u, v) such that u and v share at least one common interest. Each pair must appear only once even if the two employees share multiple interests. Return each pair with u < v, and sort the final list of pairs in ascending lexicographic order.

Constraints

  • 0 <= N <= 5000, where N is the number of employees
  • Employee IDs are unique integers
  • The total number of interest strings across all employees is at most 20000
  • Duplicate interests within the same employee should be treated as one interest
  • The total number of generated candidate pairs across all interests is at most 200000

Examples

Input: [(1, ['music', 'sports']), (2, ['sports', 'cooking']), (3, ['music']), (4, ['travel'])]

Expected Output: [(1, 2), (1, 3)]

Explanation: Employees 1 and 2 share 'sports', and employees 1 and 3 share 'music'. Employee 4 shares nothing with others.

Input: []

Expected Output: []

Explanation: No employees means no shared-interest pairs.

Input: [(42, ['reading'])]

Expected Output: []

Explanation: A single employee cannot form a pair.

Input: [(10, ['a', 'b', 'b']), (7, ['b', 'c']), (9, ['a', 'c']), (12, [])]

Expected Output: [(7, 9), (7, 10), (9, 10)]

Explanation: Pair (9, 10) shares 'a', pair (7, 10) shares 'b', and pair (7, 9) shares 'c'. Duplicate 'b' for employee 10 counts once.

Input: [(1, ['a', 'b']), (2, ['a', 'b']), (3, ['c'])]

Expected Output: [(1, 2)]

Explanation: Employees 1 and 2 share two interests, but their pair appears only once.

Solution

def solution(employees):
    interest_to_ids = {}

    for emp_id, interests in employees:
        for interest in set(interests):
            if interest not in interest_to_ids:
                interest_to_ids[interest] = set()
            interest_to_ids[interest].add(emp_id)

    pairs = set()

    for ids in interest_to_ids.values():
        sorted_ids = sorted(ids)
        for i in range(len(sorted_ids)):
            for j in range(i + 1, len(sorted_ids)):
                pairs.add((sorted_ids[i], sorted_ids[j]))

    return sorted(pairs)

Time complexity: O(T + sum(k_i log k_i + C(k_i, 2)) + P log P), where T is the total number of input interest strings, k_i is the number of employees with interest i, and P is the number of unique output pairs.. Space complexity: O(T + P), excluding the input; this stores the interest groups and the unique output pairs..

Hints

  1. First group employees by interest, then generate employee pairs inside each interest group.
  2. Use a set of pairs so that if two employees share multiple interests, the pair is still returned only once.
Last updated: Jun 4, 2026

Related Coding Questions

  • Find Shortest Paths in Road Network - Palantir (easy)
  • Solve two algorithm problems from a menu - Palantir (Medium)

Loading coding console...

PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.