PracHub

Quick Overview

This question evaluates proficiency in stream processing, stateful sessionization, time-based aggregation, and per-user metric computation, focusing on competencies such as tracking user activity across unbounded ordered event streams and computing top-channel aggregates.

  • hard
  • Discord
  • Coding & Algorithms
  • Software Engineer

Implement User Sessionization From Event Stream

Company: Discord

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

You are building a backend data-processing component that consumes a stream of newline-delimited JSON events. Each event represents a user sending a message to a channel and has the following shape:

```json
{"event_name":"send_message","timestamp":"2016-11-08T14:09:57Z","user_id":"1","channel_id":"1"}
```

The input events are guaranteed to arrive in chronological order. There are no missing, duplicated, or corrupt events. However, you must not assume the stream is finite, so your solution cannot depend on logic that runs only after reaching the end of input.

Define a **user session** as a sequence of events for the same `user_id` where each event occurs within **30 minutes, inclusive**, of that user's previous event. If the gap between two consecutive events for the same user is greater than 30 minutes, the previous session ends and a new session begins.

Implement a program or component that parses the event stream and emits a session record whenever a user session can be finalized. Each emitted session must contain:

- `user_id`
- `session_start_ts`: timestamp of the first event in the session
- `session_end_ts`: timestamp of the last event in the session
- `messages_sent`: total number of messages in the session
- `top_channel_id`: the channel that received the most messages during the session
- `top_channel_messages_sent`: number of messages sent to `top_channel_id`

The exact output format is flexible as long as all required fields are present. If multiple channels tie for the most messages, document and consistently apply a tie-breaking rule.

Example input lines:

```json
{"event_name":"send_message","timestamp":"2016-11-08T14:09:57Z","user_id":"1","channel_id":"1"}
{"event_name":"send_message","timestamp":"2016-11-08T14:10:01Z","user_id":"1","channel_id":"1"}
{"event_name":"send_message","timestamp":"2016-11-08T14:10:07Z","user_id":"2","channel_id":"1"}
```

Example output record:

```json
{
  "user_id": "1",
  "session_start_ts": "2016-11-08T14:09:57Z",
  "session_end_ts": "2016-11-08T14:39:57Z",
  "messages_sent": 15,
  "top_channel_id": "3",
  "top_channel_messages_sent": 12
}
```
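As a baseline, the requirement to emit sessions without waiting for end of input can be sketched as a Python generator that yields a record the moment a later event proves a session closed. This is a minimal sketch rather than a reference solution: the names `sessionize` and `to_record` are hypothetical, the tie-break (lexicographically smallest `channel_id`) is one possible rule chosen for illustration, and sessions flushed by the same event are yielded in sorted `user_id` order as an assumption. It scans every active user on each event, which is simple but not efficient.

```python
import json
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable, Iterator

GAP = timedelta(minutes=30)
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_record(user_id: str, state: dict) -> dict:
    """Build the output record for one finished session."""
    # Assumed tie-break: highest count first, then smallest channel_id string.
    top_channel, top_count = min(
        state["channels"].items(), key=lambda kv: (-kv[1], kv[0])
    )
    return {
        "user_id": user_id,
        "session_start_ts": state["start"].strftime(TS_FORMAT),
        "session_end_ts": state["last"].strftime(TS_FORMAT),
        "messages_sent": sum(state["channels"].values()),
        "top_channel_id": top_channel,
        "top_channel_messages_sent": top_count,
    }


def sessionize(lines: Iterable[str]) -> Iterator[dict]:
    """Yield each session as soon as a later event proves it has ended."""
    active: Dict[str, dict] = {}  # user_id -> state of the open session
    for line in lines:
        event = json.loads(line)
        now = datetime.strptime(event["timestamp"], TS_FORMAT)
        # Naive flush: scan all open sessions, close those idle for > 30 min.
        expired = sorted(u for u, s in active.items() if s["last"] + GAP < now)
        for user_id in expired:
            yield to_record(user_id, active.pop(user_id))
        state = active.get(event["user_id"])
        if state is None:
            state = {"start": now, "last": now, "channels": Counter()}
            active[event["user_id"]] = state
        state["last"] = now
        state["channels"][event["channel_id"]] += 1
    # Deliberately no flush here: sessions still open are never emitted.
```

Because `sessionize` accepts any iterable of lines, it could be fed a file object or a socket reader as well as a list. Nothing is emitted for the three example input lines on their own, since no later event proves either session closed.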


You are given `event_lines`, a list of newline-delimited JSON strings representing a chronological prefix of a potentially infinite event stream. Each event has the shape `{"event_name":"send_message","timestamp":"...","user_id":"...","channel_id":"..."}`.

Build a streaming sessionizer. A user session is a maximal sequence of events for the same `user_id` such that the gap between consecutive events for that user is at most 30 minutes, inclusive. If the gap is greater than 30 minutes, the old session ends and a new one begins.

Because the stream may continue forever, you must not emit sessions just because the provided list ends. A session can be emitted only when it is provably closed while processing the stream. Concretely, before processing an event at time `T`, any active session whose last event time `L` satisfies `L + 30 minutes < T` must be emitted.

Return the emitted session records in the exact order they would be produced by the streaming processor. If multiple sessions become emit-ready at the same moment, emit them in lexicographically increasing `user_id` order.

Each emitted record must contain:

- `user_id`
- `session_start_ts` (timestamp of the first event in the session)
- `session_end_ts` (timestamp of the last event in the session)
- `messages_sent`
- `top_channel_id`
- `top_channel_messages_sent`

If multiple channels tie for most messages inside a session, choose the lexicographically smallest `channel_id`.
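The closure condition `L + 30 minutes < T` is easy to get wrong at the boundary, so it may help to isolate it in a small predicate. The sketch below assumes timestamps in the `YYYY-MM-DDTHH:MM:SSZ` form shown in the event shape; `parse_ts` and `is_provably_closed` are hypothetical helper names, not part of the problem statement.

```python
from datetime import datetime, timedelta, timezone

SESSION_GAP = timedelta(minutes=30)


def parse_ts(ts: str) -> datetime:
    """Parse a UTC timestamp such as 2016-11-08T14:09:57Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def is_provably_closed(last_event_ts: str, current_ts: str) -> bool:
    """True when an event at current_ts proves the session has ended."""
    # Strict inequality: a gap of exactly 30 minutes keeps the session open.
    return parse_ts(last_event_ts) + SESSION_GAP < parse_ts(current_ts)


print(is_provably_closed("2016-11-08T10:30:00Z", "2016-11-08T11:00:00Z"))  # False
print(is_provably_closed("2016-11-08T10:30:00Z", "2016-11-08T11:00:01Z"))  # True
```

An event from any user can close another user's session, because the stream is ordered: once time `T` has been seen, no earlier event can arrive.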

Constraints

  • 0 <= len(event_lines) <= 200000
  • Each event line is valid JSON and `event_name` is always `send_message`
  • Timestamps use UTC ISO-8601 format: `YYYY-MM-DDTHH:MM:SSZ`
  • Input events are sorted by non-decreasing timestamp
  • Do not emit sessions that are still active after the last provided event

Examples

Input: []

Expected Output: []

Explanation: No events means no emitted sessions.

Input: ['{"event_name":"send_message","timestamp":"2016-11-08T10:00:00Z","user_id":"1","channel_id":"7"}']

Expected Output: []

Explanation: A single event starts a session, but there is no later timestamp proving that session has ended.

Input: ['{"event_name":"send_message","timestamp":"2016-11-08T10:00:00Z","user_id":"1","channel_id":"2"}', '{"event_name":"send_message","timestamp":"2016-11-08T10:30:00Z","user_id":"1","channel_id":"1"}', '{"event_name":"send_message","timestamp":"2016-11-08T11:01:00Z","user_id":"2","channel_id":"9"}']

Expected Output: [{'user_id': '1', 'session_start_ts': '2016-11-08T10:00:00Z', 'session_end_ts': '2016-11-08T10:30:00Z', 'messages_sent': 2, 'top_channel_id': '1', 'top_channel_messages_sent': 1}]

Explanation: The two user 1 events are exactly 30 minutes apart, so they stay in the same session. When the 11:01 event arrives, user 1's last event is more than 30 minutes old, so that session is emitted. Channels 1 and 2 tie with one message each, so channel '1' wins by lexicographic tie-break.
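The tie-break in this example can be expressed as a single `min` over `(-count, channel_id)` pairs. The following is an illustrative sketch; `top_channel` is a hypothetical helper name, and it assumes the session contains at least one message.

```python
from collections import Counter
from typing import Iterable, Tuple


def top_channel(channel_ids: Iterable[str]) -> Tuple[str, int]:
    """Return (top_channel_id, messages_sent) for one session's channel ids."""
    counts = Counter(channel_ids)
    # Highest count wins; among equal counts the smallest string wins.
    return min(counts.items(), key=lambda kv: (-kv[1], kv[0]))


print(top_channel(["2", "1"]))       # ('1', 1)
print(top_channel(["1", "2", "2"]))  # ('2', 2)
print(top_channel(["9", "10"]))      # ('10', 1)
```

The last call shows that the comparison is on strings, not numbers: `"10"` sorts before `"9"`.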

Input: ['{"event_name":"send_message","timestamp":"2016-11-08T10:00:00Z","user_id":"1","channel_id":"2"}', '{"event_name":"send_message","timestamp":"2016-11-08T10:05:00Z","user_id":"2","channel_id":"1"}', '{"event_name":"send_message","timestamp":"2016-11-08T10:10:00Z","user_id":"1","channel_id":"3"}', '{"event_name":"send_message","timestamp":"2016-11-08T10:40:00Z","user_id":"2","channel_id":"1"}', '{"event_name":"send_message","timestamp":"2016-11-08T10:41:00Z","user_id":"3","channel_id":"9"}', '{"event_name":"send_message","timestamp":"2016-11-08T11:11:00Z","user_id":"3","channel_id":"9"}']

Expected Output: [{'user_id': '2', 'session_start_ts': '2016-11-08T10:05:00Z', 'session_end_ts': '2016-11-08T10:05:00Z', 'messages_sent': 1, 'top_channel_id': '1', 'top_channel_messages_sent': 1}, {'user_id': '1', 'session_start_ts': '2016-11-08T10:00:00Z', 'session_end_ts': '2016-11-08T10:10:00Z', 'messages_sent': 2, 'top_channel_id': '2', 'top_channel_messages_sent': 1}, {'user_id': '2', 'session_start_ts': '2016-11-08T10:40:00Z', 'session_end_ts': '2016-11-08T10:40:00Z', 'messages_sent': 1, 'top_channel_id': '1', 'top_channel_messages_sent': 1}]

Explanation: User 2's 10:05 session is emitted when 10:40 arrives. User 1's session is emitted when 10:41 arrives. User 2's 10:40 singleton session is emitted when 11:11 arrives. User 3 is still active at the end and must not be emitted.

Input: ['{"event_name":"send_message","timestamp":"2016-11-08T10:00:00Z","user_id":"1","channel_id":"5"}', '{"event_name":"send_message","timestamp":"2016-11-08T10:00:00Z","user_id":"2","channel_id":"4"}', '{"event_name":"send_message","timestamp":"2016-11-08T10:31:00Z","user_id":"3","channel_id":"1"}']

Expected Output: [{'user_id': '1', 'session_start_ts': '2016-11-08T10:00:00Z', 'session_end_ts': '2016-11-08T10:00:00Z', 'messages_sent': 1, 'top_channel_id': '5', 'top_channel_messages_sent': 1}, {'user_id': '2', 'session_start_ts': '2016-11-08T10:00:00Z', 'session_end_ts': '2016-11-08T10:00:00Z', 'messages_sent': 1, 'top_channel_id': '4', 'top_channel_messages_sent': 1}]

Explanation: Both user 1 and user 2 become finalizable when the 10:31 event arrives. Since they expire at the same time, emit them in lexicographically increasing `user_id` order.
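The ordering rule compares `user_id` values as strings, which differs from numeric order once ids have different lengths. The sketch below illustrates this with a hypothetical `emit_order` helper that takes last-event times as plain seconds. It assumes that every session flushed by the same incoming event counts as becoming emit-ready "at the same moment", which is one reading of the statement.

```python
from typing import Dict, List


def emit_order(last_seen: Dict[str, int], now: int, gap: int = 30 * 60) -> List[str]:
    """Return the user_ids whose sessions close at `now`, in emission order."""
    return sorted(u for u, last in last_seen.items() if last + gap < now)


# Users "1", "2", "10" were last seen at t=0; user "3" at t=600 (seconds).
last_seen = {"2": 0, "10": 0, "1": 0, "3": 600}
print(emit_order(last_seen, now=1860))  # ['1', '10', '2']
```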

Input: ['{"event_name":"send_message","timestamp":"2016-11-08T14:00:00Z","user_id":"1","channel_id":"1"}', '{"event_name":"send_message","timestamp":"2016-11-08T14:05:00Z","user_id":"1","channel_id":"2"}', '{"event_name":"send_message","timestamp":"2016-11-08T14:20:00Z","user_id":"1","channel_id":"2"}', '{"event_name":"send_message","timestamp":"2016-11-08T14:51:00Z","user_id":"2","channel_id":"9"}']

Expected Output: [{'user_id': '1', 'session_start_ts': '2016-11-08T14:00:00Z', 'session_end_ts': '2016-11-08T14:20:00Z', 'messages_sent': 3, 'top_channel_id': '2', 'top_channel_messages_sent': 2}]

Explanation: User 1's session becomes provably closed when the 14:51 event arrives. Channel '2' received two of the three messages, so it is the top channel.

Hints

  1. A session is only guaranteed to be closed when you see a later timestamp strictly greater than `last_event_time + 30 minutes`. Equality is not enough, because an event exactly 30 minutes later still belongs to the same session.
  2. Use a hash map to store each user's active session, and a min-heap of candidate expiry times so you can emit old sessions without scanning every active user on each event.
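The two hints can be combined as follows. This is a sketch under stated assumptions, not the official solution: `sessionize_with_heap`, `to_epoch`, and `to_iso` are hypothetical names, the heap uses lazy deletion (one entry per event, stale entries skipped when popped), and all sessions closed by the same incoming event are emitted in sorted `user_id` order.

```python
import heapq
import json
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, List, Tuple

GAP_SECONDS = 30 * 60
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_epoch(ts: str) -> int:
    parsed = datetime.strptime(ts, TS_FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def to_iso(epoch: int) -> str:
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(TS_FORMAT)


def sessionize_with_heap(event_lines: List[str]) -> List[dict]:
    active: Dict[str, dict] = {}          # user_id -> open session state
    expiries: List[Tuple[int, str]] = []  # min-heap of (last + gap, user_id)
    emitted: List[dict] = []
    for line in event_lines:
        event = json.loads(line)
        now = to_epoch(event["timestamp"])
        closed = []
        # Pop only entries whose expiry is strictly before the current time.
        while expiries and expiries[0][0] < now:
            expiry, user_id = heapq.heappop(expiries)
            state = active.get(user_id)
            # An entry is stale if the user sent a later message afterwards.
            if state is not None and state["last"] + GAP_SECONDS == expiry:
                closed.append((user_id, active.pop(user_id)))
        for user_id, state in sorted(closed, key=lambda pair: pair[0]):
            channel_id, sent = min(
                state["channels"].items(), key=lambda kv: (-kv[1], kv[0])
            )
            emitted.append({
                "user_id": user_id,
                "session_start_ts": to_iso(state["start"]),
                "session_end_ts": to_iso(state["last"]),
                "messages_sent": sum(state["channels"].values()),
                "top_channel_id": channel_id,
                "top_channel_messages_sent": sent,
            })
        user_id = event["user_id"]
        state = active.get(user_id)
        if state is None:
            state = {"start": now, "last": now, "channels": Counter()}
            active[user_id] = state
        state["last"] = now
        state["channels"][event["channel_id"]] += 1
        heapq.heappush(expiries, (now + GAP_SECONDS, user_id))
    return emitted  # sessions still open are intentionally left out
```

Each event pushes one heap entry and each entry is popped at most once, so the total work is O(n log n) for n events, compared with scanning every active user on every event. The trade-off is that the heap can temporarily hold stale entries for users who are still active.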
Last updated: Jun 6, 2026
