PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates the ability to design and implement an in-memory time-series counter and sliding-window aggregation, emphasizing efficient data structures, handling high-frequency events, and memory-time trade-offs.

  • medium
  • Databricks
  • Coding & Algorithms
  • Machine Learning Engineer

Design In-Memory QPS Counter

Company: Databricks

Role: Machine Learning Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

Implement an in-memory traffic counter. You need to support two operations for each logical key, such as an endpoint, customer, or service name: - `record(key, timestamp)`: a request for `key` happened at `timestamp`. - `get_qps(key, now)`: return the average queries per second for `key` over the last 5 minutes, using the time window `[now - 299, now]`. Requirements: - All data must be stored in memory. - Timestamps are integer seconds. - Multiple requests may arrive with the same `key` and the same `timestamp`. - Start with a working solution for a 5-minute sliding window. Follow-up discussion: 1. How would you reduce memory usage? 2. How would you extend the design to support a 24-hour window? 3. How would you efficiently handle many requests that share the same timestamp? Assume calls arrive in non-decreasing timestamp order.

Quick Answer: This question evaluates the ability to design and implement an in-memory time-series counter and sliding-window aggregation, emphasizing efficient data structures, handling high-frequency events, and memory-time trade-offs.

Part 1: Implement an In-Memory QPS Counter for a 5-Minute Sliding Window

Implement a traffic counter that supports two operations for each logical key: - ('record', key, timestamp): one request happened for key at timestamp - ('get_qps', key, now): return the average queries per second for key over the last 5 minutes Use the inclusive sliding window [now - 299, now], which contains exactly 300 seconds. Multiple requests may have the same key and the same timestamp. Calls arrive in non-decreasing timestamp order. Write a function that processes a list of operations and returns the answer for every get_qps operation. Return each QPS value rounded to 6 decimal places.

Constraints

  • 0 <= len(operations) <= 200000
  • 0 <= timestamp, now <= 10^9
  • Operation timestamps are globally non-decreasing
  • Keys are strings
  • The QPS for a query is count_in_window / 300.0

Examples

Input: [('record', 'api', 1), ('record', 'api', 2), ('record', 'api', 300), ('get_qps', 'api', 300), ('get_qps', 'api', 301)]

Expected Output: [0.01, 0.006667]

Explanation: At now=300, window [1,300] has 3 requests. At now=301, window [2,301] has 2 requests.

Input: [('record', 'a', 10), ('record', 'b', 10), ('record', 'a', 11), ('get_qps', 'a', 11), ('get_qps', 'b', 11)]

Expected Output: [0.006667, 0.003333]

Explanation: Different keys must be tracked independently.

Input: [('record', 'x', 5), ('record', 'x', 5), ('record', 'x', 5), ('get_qps', 'x', 5)]

Expected Output: [0.01]

Explanation: Multiple requests at the same timestamp all count.

Input: []

Expected Output: []

Explanation: Edge case: no operations.

Input: [('get_qps', 'missing', 100)]

Expected Output: [0.0]

Explanation: Edge case: querying a key with no recorded traffic returns 0.

Hints

  1. For each key, keep only timestamps that are still inside the last 300-second window.
  2. Because timestamps never go backward, expired timestamps can be removed from the front of a queue.

Part 2: Reduce Memory Usage with a Fixed-Size Circular Buffer

Implement the same 5-minute QPS counter, but design it so memory usage per key is bounded. Support these operations: - ('record', key, timestamp) - ('get_qps', key, now) The window is still [now - 299, now]. Timestamps are integer seconds, calls are in non-decreasing timestamp order, and multiple requests may share the same key and timestamp. This time, use the fact that only the last 300 seconds matter. Write a function that processes the operations and returns the answer for each get_qps query, rounded to 6 decimal places.

Constraints

  • 0 <= len(operations) <= 200000
  • 0 <= timestamp, now <= 10^9
  • Operation timestamps are globally non-decreasing
  • Keys are strings
  • Target memory usage should be O(number_of_keys * 300)

Examples

Input: [('record', 'api', 1), ('record', 'api', 2), ('record', 'api', 300), ('get_qps', 'api', 300), ('get_qps', 'api', 301)]

Expected Output: [0.01, 0.006667]

Explanation: Same behavior as the basic version, but with fixed-size storage per key.

Input: [('record', 'api', 1), ('record', 'api', 301), ('get_qps', 'api', 301)]

Expected Output: [0.003333]

Explanation: Timestamp 1 falls out of window [2,301], so only timestamp 301 counts.

Input: [('record', 'x', 100), ('record', 'x', 100), ('get_qps', 'x', 100)]

Expected Output: [0.006667]

Explanation: The same second can be recorded multiple times.

Input: [('get_qps', 'x', 50)]

Expected Output: [0.0]

Explanation: Edge case: querying a key with no data.

Input: [('record', 'a', 1), ('record', 'b', 2), ('record', 'a', 300), ('get_qps', 'a', 300), ('get_qps', 'b', 300)]

Expected Output: [0.006667, 0.003333]

Explanation: Each key has its own circular buffer.

Hints

  1. A 5-minute window has exactly 300 possible second offsets. Try storing one counter per second slot.
  2. Use timestamp % 300 to choose a slot, and store the real timestamp in that slot so you can tell whether the value is stale.

Part 3: Extend the Design to Support a 24-Hour Window

Extend the counter to support a 24-hour sliding window. To keep the problem exact while reducing the number of buckets, all timestamps in this version are guaranteed to be multiples of 60 seconds, and all queries also happen at minute boundaries. That means one bucket per minute is enough. Support these operations: - ('record', key, timestamp) - ('get_qps', key, now) For a query at time now, use the inclusive 24-hour window [now - 60 * 1439, now], which contains exactly 1440 minute buckets. Return the average QPS over that full 24-hour period, so the denominator is always 86400 seconds. Return answers rounded to 6 decimal places.

Constraints

  • 0 <= len(operations) <= 200000
  • 0 <= timestamp, now <= 10^9
  • All timestamp and now values are multiples of 60
  • Operation timestamps are globally non-decreasing
  • The QPS for a query is count_in_last_24_hours / 86400.0

Examples

Input: [('record', 'api', 0), ('record', 'api', 60), ('record', 'api', 120), ('get_qps', 'api', 120)]

Expected Output: [0.000035]

Explanation: There are 3 requests in the last 24 hours, so QPS is 3 / 86400.

Input: [('record', 'api', 0), ('record', 'api', 86400), ('get_qps', 'api', 86400)]

Expected Output: [0.000012]

Explanation: At now=86400, the minute at timestamp 0 is outside the 24-hour window, so only one request remains.

Input: [('record', 'x', 0), ('record', 'x', 0), ('record', 'x', 0), ('get_qps', 'x', 0)]

Expected Output: [0.000035]

Explanation: Multiple requests in the same minute are all counted.

Input: [('get_qps', 'none', 3600)]

Expected Output: [0.0]

Explanation: Edge case: querying a key with no traffic.

Input: [('record', 'k', 0), ('record', 'k', 86340), ('get_qps', 'k', 86340)]

Expected Output: [0.000023]

Explanation: Both the oldest minute and the current minute are included because the window is inclusive.

Hints

  1. Since times are minute-aligned, convert each timestamp to a minute number with timestamp // 60.
  2. A 24-hour window has 1440 minute buckets, so a circular buffer of size 1440 per key is enough.

Part 4: Efficiently Handle Many Requests That Share the Same Timestamp

In this version, many requests may share the same key and the same timestamp. To process them efficiently, the input uses batched record operations. Support these operations: - ('record_many', key, timestamp, count): record count requests for key at timestamp - ('get_qps', key, now): return the average queries per second for key over the last 5 minutes Use the inclusive sliding window [now - 299, now]. Calls arrive in non-decreasing timestamp order. Return the answer for each get_qps query, rounded to 6 decimal places. Your solution should avoid storing one entry per request when many requests happen at the same second.

Constraints

  • 0 <= len(operations) <= 200000
  • 0 <= timestamp, now <= 10^9
  • 0 <= count <= 10^9
  • Operation timestamps are globally non-decreasing
  • The QPS for a query is count_in_window / 300.0

Examples

Input: [('record_many', 'api', 10, 100), ('get_qps', 'api', 10)]

Expected Output: [0.333333]

Explanation: 100 requests in the 300-second window gives 100 / 300.

Input: [('record_many', 'a', 5, 2), ('record_many', 'b', 5, 1), ('record_many', 'a', 5, 3), ('get_qps', 'a', 5), ('get_qps', 'b', 5)]

Expected Output: [0.016667, 0.003333]

Explanation: Key 'a' has 5 requests at timestamp 5, and key 'b' has 1.

Input: [('record_many', 'x', 1, 50), ('record_many', 'x', 300, 50), ('get_qps', 'x', 300), ('get_qps', 'x', 301)]

Expected Output: [0.333333, 0.166667]

Explanation: At now=300, both timestamps 1 and 300 are included. At now=301, timestamp 1 has expired.

Input: [('record_many', 'x', 10, 0), ('get_qps', 'x', 10), ('get_qps', 'missing', 10)]

Expected Output: [0.0, 0.0]

Explanation: Edge cases: zero-count batch and querying a missing key.

Input: []

Expected Output: []

Explanation: Edge case: no operations.

Hints

  1. Store (timestamp, count) pairs instead of one timestamp per request.
  2. Keep a running total for each key so get_qps can be answered without rescanning all buckets.
Last updated: May 1, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Choose the Best Travel Mode - Databricks (medium)
  • Implement an Alternating Tic-Tac-Toe Game - Databricks (hard)
  • Implement a Snapshot Set Iterator - Databricks (medium)
  • Find the Best Commute Mode - Databricks (medium)
  • Partition a Target String by Source Substrings - Databricks (medium)