PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates skills in designing in-memory data structures and algorithms for frequency counting and interval querying, including extracting top-K elements with a heap, within the Coding & Algorithms domain.

  • medium
  • Google
  • Coding & Algorithms
  • Software Engineer

Design a logger that returns top-K messages

Company: Google

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

## Problem Design an in-memory logging component that supports recording log events and querying the **top K most frequent messages**. Each log event has: - `timestamp` (integer seconds) - `message` (string) Implement the following operations: 1. `log(timestamp, message)`: record a new log event. 2. `topK(startTime, endTime, k) -> List[str]`: return up to `k` messages with the highest frequency among events with `startTime <= timestamp <= endTime`. - If there is a tie in frequency, break ties by lexicographically smaller `message` first (or state and apply a consistent tie-break rule). - If there are fewer than `k` distinct messages in the interval, return all of them. ### Requirements - You may assume timestamps are non-decreasing across calls to `log`. - Use a heap in your approach for extracting the top K. ### Constraints (reasonable interview assumptions) - Up to `2 * 10^5` total `log()` calls - Up to `2 * 10^5` total queries - `k` can be up to the number of distinct messages in the queried interval ### Example Logs: - `log(1, "A")`, `log(2, "B")`, `log(3, "A")`, `log(10, "C")` Query: - `topK(1, 3, 2)` → `["A", "B"]` (A appears 2 times, B appears 1 time)

Quick Answer: This question evaluates skills in designing in-memory data structures and algorithms for frequency counting and interval querying, including extracting top-K elements with a heap, within the Coding & Algorithms domain.

You are given a batched version of an in-memory logger. The array `logs` represents calls to `log(timestamp, message)` in the order they happened, where timestamps are non-decreasing. For each query `(startTime, endTime, k)`, return up to `k` messages with the highest frequency among log events whose timestamps satisfy `startTime <= timestamp <= endTime`. If two messages have the same frequency, the lexicographically smaller message must come first. If fewer than `k` distinct messages appear in the interval, return all of them. If no log falls in the interval, return an empty list for that query. Implement `solution(logs, queries)` and return the answers for all queries in order.

Constraints

  • 0 <= len(logs) <= 20000
  • 0 <= len(queries) <= 10000
  • Logs are given in non-decreasing order of timestamp
  • There are at most 2000 distinct messages across all logs
  • For each query, 1 <= k, and if the interval contains fewer than k distinct messages, return all of them

Examples

Input: ([(1, 'A'), (2, 'B'), (3, 'A'), (10, 'C')], [(1, 3, 2), (1, 10, 3)])

Expected Output: [['A', 'B'], ['A', 'B', 'C']]

Explanation: In [1, 3], A appears 2 times and B appears 1 time. In [1, 10], A appears 2 times, while B and C appear once each, so B comes before C by lexicographic order.

Input: ([(1, 'dog'), (2, 'cat'), (2, 'dog'), (4, 'cat'), (5, 'ant')], [(1, 4, 2)])

Expected Output: [['cat', 'dog']]

Explanation: Between timestamps 1 and 4 inclusive, cat appears 2 times and dog appears 2 times. The tie is broken lexicographically, so cat comes first.

Input: ([(1, 'x'), (5, 'y')], [(2, 4, 3), (1, 5, 5)])

Expected Output: [[], ['x', 'y']]

Explanation: The first query range contains no logs. In the second query, x and y both appear once, and k is larger than the number of distinct messages.

Input: ([], [(1, 10, 2)])

Expected Output: [[]]

Explanation: There are no log events, so every query returns an empty list.

Input: ([(1, 'm'), (1, 'm'), (2, 'n'), (3, 'm'), (3, 'n')], [(1, 1, 2), (3, 3, 1), (1, 3, 2)])

Expected Output: [['m'], ['m'], ['m', 'n']]

Explanation: This checks duplicate timestamps and inclusive boundaries. At time 1, only m appears. At time 3, m and n both appear once, so m wins the tie. Across [1, 3], m appears 3 times and n appears 2 times.

Input: ([(1, 'b'), (2, 'a'), (3, 'b'), (4, 'a'), (5, 'c'), (6, 'a')], [(1, 6, 2), (2, 5, 3)])

Expected Output: [['a', 'b'], ['a', 'b', 'c']]

Explanation: Across all logs, a appears 3 times, b 2 times, and c 1 time. In [2, 5], a appears 2 times while b and c appear once each, so b comes before c lexicographically.

Hints

  1. Store all timestamps for each message in a hash map. Then use binary search to count how many times that message appears inside a query range.
  2. After computing the counts for a query, put `(-count, message)` into a heap so the most frequent message comes out first, and lexicographically smaller messages win ties.
Last updated: May 23, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Solve Rooms and Top-K Streams - Google (medium)
  • Find Containing Range - Google (medium)
  • Rearrange Tasks With Cooldown - Google (medium)
  • Implement Employee Management and Expression Evaluation - Google (medium)
  • Solve Three Array and Matrix Path Problems - Google (medium)