How do I practice coding and algorithm questions?

Use PracHub's coding console to write, test, and debug your solutions in Python or JavaScript. View hints, test against sample inputs, and compare with official solutions.

What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at Google.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Google during technical interviews.

Design a logger that returns top-K messages

Q: Design a logger that returns top-K messages

This question evaluates skills in designing in-memory data structures and algorithms for frequency counting and interval querying, including extracting top-K elements with a heap, within the Coding & Algorithms domain.

Company: Google

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

## Problem Design an in-memory logging component that supports recording log events and querying the **top K most frequent messages**. Each log event has: - `timestamp` (integer seconds) - `message` (string) Implement the following operations: 1. `log(timestamp, message)`: record a new log event. 2. `topK(startTime, endTime, k) -> List[str]`: return up to `k` messages with the highest frequency among events with `startTime <= timestamp <= endTime`. - If there is a tie in frequency, break ties by lexicographically smaller `message` first (or state and apply a consistent tie-break rule). - If there are fewer than `k` distinct messages in the interval, return all of them. ### Requirements - You may assume timestamps are non-decreasing across calls to `log`. - Use a heap in your approach for extracting the top K. ### Constraints (reasonable interview assumptions) - Up to `2 * 10^5` total `log()` calls - Up to `2 * 10^5` total queries - `k` can be up to the number of distinct messages in the queried interval ### Example Logs: - `log(1, "A")`, `log(2, "B")`, `log(3, "A")`, `log(10, "C")` Query: - `topK(1, 3, 2)` → `["A", "B"]` (A appears 2 times, B appears 1 time)

Quick Answer: This question evaluates skills in designing in-memory data structures and algorithms for frequency counting and interval querying, including extracting top-K elements with a heap, within the Coding & Algorithms domain.

You are given a batched version of an in-memory logger. The array `logs` represents calls to `log(timestamp, message)` in the order they happened, where timestamps are non-decreasing. For each query `(startTime, endTime, k)`, return up to `k` messages with the highest frequency among log events whose timestamps satisfy `startTime <= timestamp <= endTime`. If two messages have the same frequency, the lexicographically smaller message must come first. If fewer than `k` distinct messages appear in the interval, return all of them. If no log falls in the interval, return an empty list for that query. Implement `solution(logs, queries)` and return the answers for all queries in order.

Constraints

0 <= len(logs) <= 20000
0 <= len(queries) <= 10000
Logs are given in non-decreasing order of timestamp
There are at most 2000 distinct messages across all logs
For each query, 1 <= k, and if the interval contains fewer than k distinct messages, return all of them

Examples

Input: ([(1, 'A'), (2, 'B'), (3, 'A'), (10, 'C')], [(1, 3, 2), (1, 10, 3)])

Expected Output: [['A', 'B'], ['A', 'B', 'C']]

Explanation: In [1, 3], A appears 2 times and B appears 1 time. In [1, 10], A appears 2 times, while B and C appear once each, so B comes before C by lexicographic order.

Input: ([(1, 'dog'), (2, 'cat'), (2, 'dog'), (4, 'cat'), (5, 'ant')], [(1, 4, 2)])

Expected Output: [['cat', 'dog']]

Explanation: Between timestamps 1 and 4 inclusive, cat appears 2 times and dog appears 2 times. The tie is broken lexicographically, so cat comes first.

Input: ([(1, 'x'), (5, 'y')], [(2, 4, 3), (1, 5, 5)])

Expected Output: [[], ['x', 'y']]

Explanation: The first query range contains no logs. In the second query, x and y both appear once, and k is larger than the number of distinct messages.

Input: ([], [(1, 10, 2)])

Expected Output: [[]]

Explanation: There are no log events, so every query returns an empty list.

Input: ([(1, 'm'), (1, 'm'), (2, 'n'), (3, 'm'), (3, 'n')], [(1, 1, 2), (3, 3, 1), (1, 3, 2)])

Expected Output: [['m'], ['m'], ['m', 'n']]

Explanation: This checks duplicate timestamps and inclusive boundaries. At time 1, only m appears. At time 3, m and n both appear once, so m wins the tie. Across [1, 3], m appears 3 times and n appears 2 times.

Input: ([(1, 'b'), (2, 'a'), (3, 'b'), (4, 'a'), (5, 'c'), (6, 'a')], [(1, 6, 2), (2, 5, 3)])

Expected Output: [['a', 'b'], ['a', 'b', 'c']]

Explanation: Across all logs, a appears 3 times, b 2 times, and c 1 time. In [2, 5], a appears 2 times while b and c appear once each, so b comes before c lexicographically.

Hints

Store all timestamps for each message in a hash map. Then use binary search to count how many times that message appears inside a query range.
After computing the counts for a query, put `(-count, message)` into a heap so the most frequent message comes out first, and lexicographically smaller messages win ties.

Quick Overview

This question evaluates skills in designing in-memory data structures and algorithms for frequency counting and interval querying, including extracting top-K elements with a heap, within the Coding & Algorithms domain.