PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question tests practical application of hash-based data structures and stream processing to track and rank user activity with deduplication logic. It evaluates competency in handling edge cases, maintaining sorted order under constraints, and extending solutions to support top-K queries and live updates — core skills assessed in software engineering coding interviews.

  • medium
  • Google
  • Coding & Algorithms
  • Software Engineer

Most Active Users in a Live Communication Stream

Company: Google

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

# Most Active Users in a Live Communication Stream You are processing a continuous stream of pairings between users. Each event is an unordered pair `(u, v)` meaning users `u` and `v` have just established an **active communication** with each other. Two users form **at most one** active communication: re-receiving a pair that already exists (in either order, e.g. `(A, B)` after `(B, A)`) must be ignored and must not be double-counted. A user's **communication count** is the number of distinct other users they currently have an active communication with. After processing the stream, return the users ordered by communication count in **descending** order. Break ties by user id in ascending (lexicographic) order. ## Input - `events`: a list of pairs `[u, v]`, where each `u` and `v` is a non-empty string user id. Pairs arrive in the given order. ## Output - A list of user ids, sorted by communication count descending, then by user id ascending. Only include users that appear in at least one valid communication. ## Rules - A pair `(u, v)` and its reverse `(v, u)` denote the **same** communication; count it once. - A duplicate of an already-seen pair contributes nothing. - Self-pairs where `u == v` are **invalid input** and must be skipped (they create no communication). ## Example ``` Input: events = [["A","B"], ["B","C"], ["B","A"], ["C","D"], ["A","A"]] Output: ["B", "C", "A", "D"] ``` Explanation: After ignoring the duplicate `["B","A"]` and the self-pair `["A","A"]`, the distinct communications are A-B, B-C, C-D. Counts: B has 2 (A, C); A has 1 (B); C has 2 (B, D); D has 1 (C). Sorted by count descending then id ascending: B (2), C (2), A (1), D (1). ## Constraints - `1 <= len(events) <= 2 * 10^5` - Each user id has length between 1 and 50. - Total distinct users fits in memory. ## Follow-ups to consider in your design - **Top-K only:** Instead of returning all users, return only the `K` users with the highest communication counts (same tie-break rule). - **Live queries:** Support an interleaved query that asks, at any point in the stream, for the current most-active user — your structure should answer this without re-sorting from scratch each time.

Quick Answer: This question tests practical application of hash-based data structures and stream processing to track and rank user activity with deduplication logic. It evaluates competency in handling edge cases, maintaining sorted order under constraints, and extending solutions to support top-K queries and live updates — core skills assessed in software engineering coding interviews.

You are processing a continuous stream of pairings between users. Each event is an unordered pair `(u, v)` meaning users `u` and `v` have just established an **active communication** with each other. Two users form **at most one** active communication: re-receiving a pair that already exists (in either order, e.g. `(A, B)` after `(B, A)`) must be ignored and must not be double-counted. A user's **communication count** is the number of distinct other users they currently have an active communication with. After processing the stream, return the users ordered by communication count in **descending** order. Break ties by user id in ascending (lexicographic) order. ## Input - `events`: a list of pairs `[u, v]`, where each `u` and `v` is a non-empty string user id. Pairs arrive in the given order. ## Output - A list of user ids, sorted by communication count descending, then by user id ascending. Only include users that appear in at least one valid communication. ## Rules - A pair `(u, v)` and its reverse `(v, u)` denote the **same** communication; count it once. - A duplicate of an already-seen pair contributes nothing. - Self-pairs where `u == v` are **invalid input** and must be skipped (they create no communication). ## Example ``` Input: events = [["A","B"], ["B","C"], ["B","A"], ["C","D"], ["A","A"]] Output: ["B", "C", "A", "D"] ``` Explanation: After ignoring the duplicate `["B","A"]` and the self-pair `["A","A"]`, the distinct communications are A-B, B-C, C-D. Counts: B has 2 (A, C); A has 1 (B); C has 2 (B, D); D has 1 (C). Sorted by count descending then id ascending: B (2), C (2), A (1), D (1). ## Constraints - `1 <= len(events) <= 2 * 10^5` - Each user id has length between 1 and 50. - Total distinct users fits in memory.

Constraints

  • 1 <= len(events) <= 2 * 10^5
  • Each user id has length between 1 and 50.
  • Total distinct users fits in memory.
  • A pair (u, v) and its reverse (v, u) are the same communication; count it once.
  • Self-pairs where u == v are invalid and must be skipped.

Examples

Input: ([["A","B"], ["B","C"], ["B","A"], ["C","D"], ["A","A"]],)

Expected Output: ["B", "C", "A", "D"]

Explanation: Worked example: duplicate [B,A] and self-pair [A,A] are ignored. Distinct communications A-B, B-C, C-D give counts B=2, C=2, A=1, D=1. Sorted count desc then id asc.

Input: ([["X","Y"]],)

Expected Output: ["X", "Y"]

Explanation: A single communication: both X and Y have count 1, so they sort by id ascending.

Input: ([["A","A"], ["B","B"]],)

Expected Output: []

Explanation: All events are self-pairs, which are invalid and create no communication, so no user qualifies — empty output.

Input: ([["u1","u2"], ["u2","u1"], ["u1","u2"]],)

Expected Output: ["u1", "u2"]

Explanation: The same pair arrives three times (in both orders); it is counted once. Each user ends with count 1.

Input: ([["alice","bob"], ["bob","carol"], ["carol","alice"], ["dave","alice"]],)

Expected Output: ["alice", "bob", "carol", "dave"]

Explanation: Triangle alice-bob-carol plus dave-alice. Counts: alice=3 (bob,carol,dave), bob=2, carol=2, dave=1. alice leads; bob and carol tie at 2 and sort by id ascending; dave last.

Input: ([["z","a"], ["z","b"], ["y","a"]],)

Expected Output: ["a", "z", "b", "y"]

Explanation: Counts: a=2 (z,y), z=2 (a,b), b=1, y=1. The top two tie at 2 — a before z by id; the bottom two tie at 1 — b before y by id.

Hints

  1. Canonicalize each pair (e.g. sort the two ids) so (A,B) and (B,A) map to the same key, then store those keys in a set to dedupe.
  2. Maintain an adjacency map from user id to a set of distinct neighbors; the communication count is just the size of that set.
  3. Sort the users with a composite key: communication count descending, then user id ascending. In most languages this means negating the count (or a custom comparator) so the primary sort is descending while the tie-break stays ascending.
Last updated: Jun 24, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Infection Spread on a Grid (Cellular Automaton) - Google (hard)
  • Boolean Expression Tree with Leaf Flips - Google (medium)
  • Streaming Points: Remove Any Pair Within a Distance - Google (medium)
  • Solve Rooms and Top-K Streams - Google (medium)
  • Find Containing Range - Google (medium)