How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Take-home Project rounds at Netflix.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Netflix during technical interviews.

Group users by same last K watched movies

Company: Netflix

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Take-home Project

You are given a list of users' movie-watching histories. Each user’s history is an ordered list of movie IDs in the order they were watched (earliest to latest). Assume: - `histories` is a dictionary/map from `user_id` (string or integer) to a list of `movie_id` (string or integer). - Each list has at least `K` movies. **Task** Write an algorithm to group users such that, within each group, all users share the exact same set of the last `K` movies they watched (ignoring order within those last `K` movies). Two users belong to the same group if the following holds: - Let `S(u)` be the set of the last `K` movies in user `u`'s history. - Users `u1` and `u2` are in the same group if `S(u1) = S(u2)`. **Input** - An integer `K` (1 ≤ K ≤ 10^3). - A map/dictionary `histories: user_id → [movie_id1, movie_id2, ..., movie_idN]` where N ≥ K for each user. **Output** - Return a list of groups, where each group is a list of `user_id`s. Only groups of size ≥ 2 (i.e., at least two users with the same last-K set) need to be returned. **Example** Suppose K = 3 and: - user A: [m1, m2, m3, m4, m5] - user B: [m7, m3, m4, m5] - user C: [m2, m3, m4, m6] Then: - Last 3 for A: {m3, m4, m5} - Last 3 for B: {m3, m4, m5} - Last 3 for C: {m3, m4, m6} The output should group A and B together, and C can be omitted or appear alone depending on your choice, but only multi-user groups are required in the output. Describe the algorithm and its time and space complexity.

Quick Answer: This question evaluates competency in set operations, grouping via maps/dictionaries, and algorithmic complexity analysis when comparing suffixes of ordered lists.

You are given a dictionary `histories` mapping each `user_id` to an ordered list of watched `movie_id`s from earliest to latest. For each user, look only at their last `K` watched movies and convert those `K` movies into a set, so order does not matter and duplicates are ignored. Group users together if these last-`K` sets are exactly equal. Return only groups with at least 2 users. If no such groups exist, return an empty list. For deterministic output, sort each group by the string representation of its user IDs, and sort the final list of groups lexicographically by those sorted groups.

Constraints

0 <= number of users <= 10^5
1 <= K <= 10^3
For every user, len(history) >= K
user_id and movie_id are hashable values such as strings or integers
Movie IDs may repeat inside a user's history

Examples

Input: (3, {"A": ["m1", "m2", "m3", "m4", "m5"], "B": ["m7", "m3", "m4", "m5"], "C": ["m2", "m3", "m4", "m6"]})

Expected Output: [['A', 'B']]

Explanation: A and B both have last-3 set {m3, m4, m5}. C has {m3, m4, m6}, so it is not grouped.

Input: (2, {1: [5, 2, 3], 2: [8, 3, 2], 3: [1, 6, 7], 4: [7, 6], 5: [9, 8]})

Expected Output: [[1, 2], [3, 4]]

Explanation: Users 1 and 2 share the set {2, 3}. Users 3 and 4 share the set {6, 7}. User 5 is alone.

Input: (3, {101: [1, 2, 2], 102: [2, 1, 1], 103: [1, 3, 3], 104: [9, 8, 7]})

Expected Output: [[101, 102]]

Explanation: The last 3 movies for 101 and 102 become the same set {1, 2} after ignoring order and duplicates.

Input: (2, {})

Expected Output: []

Explanation: With no users, there are no groups to return.

Input: (1, {"u1": ["x", "y"], "u2": ["z", "y"], "u3": ["y", "x"], "u4": ["x"]})

Expected Output: [['u1', 'u2'], ['u3', 'u4']]

Explanation: For K = 1, users are grouped by their single most recent movie: u1 and u2 watched y last, while u3 and u4 watched x last.

Solution

def solution(K, histories):
    groups = {}

    for user_id, watched in histories.items():
        key = frozenset(watched[-K:])
        if key not in groups:
            groups[key] = []
        groups[key].append(user_id)

    result = []
    for users in groups.values():
        if len(users) >= 2:
            users.sort(key=lambda x: str(x))
            result.append(users)

    result.sort(key=lambda group: tuple(str(x) for x in group))
    return result

Time complexity: O(U * K + U log U), where U is the number of users. Building each last-K set takes O(K), and the deterministic output sorting costs up to O(U log U) in total.. Space complexity: O(U * K) in the worst case for storing grouping keys and user buckets..

Hints

Since order within the last K movies does not matter, think about converting those movies into an order-independent hashable key.
Use a dictionary to collect all users that share the same last-K movie set, then keep only buckets with at least two users.

Quick Overview

This question evaluates competency in set operations, grouping via maps/dictionaries, and algorithmic complexity analysis when comparing suffixes of ordered lists.