Implement frequency + distance top‑K queries
Company: Google
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This pair of problems evaluates algorithmic and data-structure skills including frequency counting, multi-criteria ordering and selection for top-K results, geometric distance computation for tie-breaking, and time-indexed storage with nearest-timestamp retrieval.
Part 1: Top-K Points by Frequency with Distance Tie-Break
Constraints
- 1 <= len(points) <= 2 * 10^5
- -10^4 <= x, y <= 10^4
- 1 <= k <= number of distinct points
Examples
Input: ([[1,1],[2,2],[2,2],[1,1],[2,2],[3,3]], 2)
Expected Output: [[2, 2], [1, 1]]
Explanation: Point [2,2] appears 3 times and [1,1] appears 2 times, so they are the top 2 distinct points.
Input: ([[1,2],[0,2],[1,2],[2,1],[0,2],[2,1],[3,0]], 3)
Expected Output: [[0, 2], [1, 2], [2, 1]]
Explanation: The points [0,2], [1,2], and [2,1] each appear twice. Their squared distances are 4, 5, and 5, so [0,2] comes first. Between [1,2] and [2,1], x is smaller for [1,2].
Input: ([[-1,0],[0,-1],[-1,0],[0,-1],[1,0]], 2)
Expected Output: [[-1, 0], [0, -1]]
Explanation: [-1,0] and [0,-1] both appear twice and both have squared distance 1. Since x is smaller for [-1,0], it comes first.
Input: ([[5,-5],[5,-5],[5,-5]], 1)
Expected Output: [[5, -5]]
Explanation: There is only one distinct point, so it must be returned.
Input: ([[1,1],[-1,1],[1,-1],[-1,-1]], 3)
Expected Output: [[-1, -1], [-1, 1], [1, -1]]
Explanation: All points appear once and all have the same squared distance 2, so ordering is determined by x first, then y.
Solution
def solution(points, k):
from collections import Counter
counts = Counter(tuple(point) for point in points)
ranked = sorted(
counts.keys(),
key=lambda p: (-counts[p], p[0] * p[0] + p[1] * p[1], p[0], p[1])
)
return [[x, y] for x, y in ranked[:k]]Time complexity: O(n + m log m), where n is the number of input points and m is the number of distinct points. Space complexity: O(m).
Hints
- First count how many times each point appears. After that, each distinct point only needs to be ranked once.
- Python can sort tuples directly, so think about building a sort key like `(-frequency, distance, x, y)`.
Part 2: Time-Based Key-Value Store with Closest Timestamp Queries
Constraints
- 0 <= number of operations <= 2 * 10^5
- Keys and values are strings
- Timestamps fit in a 32-bit signed integer range
- For the same key, `set` operations arrive in non-decreasing timestamp order
Examples
Input: [['set', 'foo', 'x', 5], ['set', 'foo', 'y', 10], ['set', 'foo', 'z', 15], ['getClosest', 'foo', 11], ['getClosest', 'foo', 14], ['getClosest', 'foo', 12]]
Expected Output: ['y', 'z', 'y']
Explanation: For 11, timestamp 10 is closest. For 14, timestamp 15 is closest. For 12, timestamp 10 is closer than 15.
Input: [['set', 'a', 'one', 1], ['set', 'a', 'two', 3], ['set', 'b', 'bee', 2], ['getClosest', 'a', 2], ['getClosest', 'a', 4], ['getClosest', 'b', 10], ['getClosest', 'c', 5]]
Expected Output: ['one', 'two', 'bee', '']
Explanation: For key 'a' at time 2, timestamps 1 and 3 are equally close, so choose the smaller timestamp 1. Key 'c' does not exist, so return an empty string.
Input: [['set', 'k', 'old', 7], ['set', 'k', 'new', 7], ['getClosest', 'k', 7], ['getClosest', 'k', 6], ['getClosest', 'k', 8]]
Expected Output: ['new', 'new', 'new']
Explanation: The second set at timestamp 7 replaces the earlier value. Since only timestamp 7 exists, every query returns 'new'.
Input: []
Expected Output: []
Explanation: No operations means there are no query results.
Solution
def solution(operations):
from bisect import bisect_left
store = {}
result = []
for op in operations:
if op[0] == 'set':
_, key, value, timestamp = op
if key not in store:
store[key] = [[], []]
times, values = store[key]
if times and times[-1] == timestamp:
values[-1] = value
else:
times.append(timestamp)
values.append(value)
else:
_, key, timestamp = op
if key not in store:
result.append("")
continue
times, values = store[key]
i = bisect_left(times, timestamp)
if i < len(times) and times[i] == timestamp:
result.append(values[i])
elif i == 0:
result.append(values[0])
elif i == len(times):
result.append(values[-1])
else:
left_diff = timestamp - times[i - 1]
right_diff = times[i] - timestamp
if left_diff <= right_diff:
result.append(values[i - 1])
else:
result.append(values[i])
return resultTime complexity: O(n log n). Space complexity: O(n).
Hints
- Because timestamps for each key are already ordered, you can store them in arrays and avoid re-sorting.
- For a query timestamp, binary search gives the insertion point. The answer must be one of the neighboring timestamps.