PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates understanding of time-series data structures and streaming counters, testing algorithm design, space/time complexity analysis, correctness across edge cases, and handling of large-scale event tracking.

  • Medium
  • Databricks
  • Coding & Algorithms
  • Software Engineer

Design a rolling event tracker with ranges

Company: Databricks

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

Design a rolling event tracker that supports time-based queries. Implement a data structure with: ( 1) record(timestamp): record one event at integer timestamp (seconds since epoch). Assume timestamps for record are non-decreasing. ( 2) count(windowSeconds, currentTimestamp): return the number of events in the inclusive interval [currentTimestamp - windowSeconds + 1, currentTimestamp]. ( 3) countRange(startTimestamp, endTimestamp): return the number of events in the inclusive interval [startTimestamp, endTimestamp]. Follow-ups: (a) Prevent counter overflow under very high traffic; discuss data types, rollover/saturation strategies, and how to keep correctness across boundaries. (b) Propose a bucketing/chunking scheme (e.g., per-second or per-minute circular buffer) to bound memory; analyze time/space complexity and, if you compress buckets, quantify any accuracy trade-offs. (c) Support efficient queries for long ranges (hours to days); compare designs (queues, circular buffers, prefix sums, Fenwick/segment trees) and explain update/query costs. (d) Explain eviction/expiry of old buckets and how sliding windows are maintained when the requested window exceeds the in-memory buffer. (e) Provide complexity analysis and key test cases covering edge conditions (empty ranges, single-point ranges, large windows, and near-integer-limit counts).

Quick Answer: This question evaluates understanding of time-series data structures and streaming counters, testing algorithm design, space/time complexity analysis, correctness across edge cases, and handling of large-scale event tracking.

Implement the coding portion of a rolling event tracker. You are given a sequence of operations that must be processed in order. Support: 1) ("record", timestamp): record one event at the given integer timestamp. 2) ("count", windowSeconds, currentTimestamp): return the number of recorded events in the inclusive interval [currentTimestamp - windowSeconds + 1, currentTimestamp]. 3) ("countRange", startTimestamp, endTimestamp): return the number of recorded events in the inclusive interval [startTimestamp, endTimestamp]. Timestamps used in record operations are guaranteed to be non-decreasing. Query timestamps may be large, so building a dense array over time is not allowed. If startTimestamp > endTimestamp, treat the range as empty and return 0. For the coding part, return the answers for all query operations in order. In a real interview follow-up, you should also be ready to discuss overflow handling, bounded-memory bucketing, and trade-offs for very long-range queries.

Constraints

  • 1 <= len(operations) <= 200000
  • 0 <= timestamp, currentTimestamp, startTimestamp, endTimestamp <= 10^18
  • Timestamps in record operations are non-decreasing
  • 1 <= windowSeconds <= 10^18 for count operations
  • Multiple events may occur at the same timestamp

Examples

Input: ([("record", 10), ("record", 10), ("record", 12), ("count", 1, 10), ("count", 3, 12), ("countRange", 11, 12)],)

Expected Output: [2, 3, 1]

Explanation: There are two events at 10 and one at 12. [10,10] has 2 events, [10,12] has 3 events, and [11,12] has only the event at 12.

Input: ([("count", 5, 100), ("countRange", 7, 7), ("record", 8), ("countRange", 9, 8), ("count", 1, 8)],)

Expected Output: [0, 0, 0, 1]

Explanation: Queries before any record return 0. The range [9,8] is empty because start > end. After recording one event at 8, the 1-second window ending at 8 contains 1 event.

Input: ([("record", 1), ("record", 2), ("record", 2), ("record", 5), ("count", 4, 5), ("count", 10, 10), ("countRange", 2, 2), ("countRange", 3, 4)],)

Expected Output: [3, 4, 2, 0]

Explanation: The window [2,5] contains timestamps 2, 2, and 5. The large window [1,10] includes all 4 recorded events. The single-point range [2,2] contains 2 events, while [3,4] contains none.

Input: ([("record", 999999999999), ("record", 1000000000000), ("record", 1000000000000), ("count", 2, 1000000000000), ("countRange", 1000000000000, 1000000000000), ("countRange", 999999999998, 999999999998)],)

Expected Output: [3, 2, 0]

Explanation: This checks very large timestamps. The 2-second window ending at 1000000000000 covers 999999999999 and 1000000000000, so it includes all 3 recorded events.

Input: ([("record", 3), ("count", 5, 3), ("record", 7), ("countRange", 1, 6), ("record", 7), ("count", 1, 7), ("count", 3, 8)],)

Expected Output: [1, 1, 2, 2]

Explanation: The first query sees only the event at 3. The range [1,6] excludes later events at 7. After two events are recorded at 7, both the single-second query at 7 and the 3-second window ending at 8 return 2.

Input: ([("record", 1), ("record", 1), ("record", 2)],)

Expected Output: []

Explanation: There are no query operations, so the returned list is empty.

Solution

def solution(operations):
    from bisect import bisect_left, bisect_right

    times = []
    prefix = []
    results = []

    def range_count(start, end):
        if end < start or not times:
            return 0

        left = bisect_left(times, start)
        right = bisect_right(times, end) - 1

        if left >= len(times) or right < left:
            return 0

        return prefix[right] - (prefix[left - 1] if left > 0 else 0)

    for op in operations:
        kind = op[0]

        if kind == "record":
            t = op[1]
            if times and times[-1] == t:
                prefix[-1] += 1
            else:
                times.append(t)
                prefix.append((prefix[-1] if prefix else 0) + 1)

        elif kind == "count":
            window_seconds, current_timestamp = op[1], op[2]
            if window_seconds <= 0:
                results.append(0)
            else:
                start = current_timestamp - window_seconds + 1
                results.append(range_count(start, current_timestamp))

        elif kind == "countRange":
            start_timestamp, end_timestamp = op[1], op[2]
            results.append(range_count(start_timestamp, end_timestamp))

        else:
            raise ValueError("Unknown operation type")

    return results

Time complexity: record: O(1) amortized; count/countRange: O(log m), where m is the number of distinct recorded timestamps. Space complexity: O(m), where m is the number of distinct recorded timestamps.

Hints

  1. Because record timestamps never decrease, you can keep timestamps in sorted order without inserting in the middle. If the same timestamp repeats, merge it into the last bucket.
  2. To answer an inclusive range query quickly, find the first recorded timestamp >= start and the last recorded timestamp <= end, then use cumulative totals.
Last updated: May 3, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Find the Best Commute Mode - Databricks (medium)
  • Partition a Target String by Source Substrings - Databricks (medium)
  • Design In-Memory QPS Counter - Databricks (medium)
  • Implement Random Connectivity and Grid Routing - Databricks
  • Implement a Snapshot Set Iterator - Databricks (hard)