Rank queries by prefix, frequency, and time
Company: Roblox
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Technical Screen
You are given three arrays of the same length `n`:
- `queries[i]`: a non-empty string representing the text of the i-th query.
- `timestamps[i]`: an integer representing the time when `queries[i]` was issued.
- `0 <= i < n`.
You are also given an array `prefixes` of length `m`, where each `prefixes[j]` is a non-empty string.
For each prefix `p` in `prefixes`, you must find **all distinct query strings** that start with `p` (i.e., `query.startsWith(p)`), then sort those distinct queries by:
1. **Descending frequency**: the number of times that exact query string appears in `queries`.
2. If multiple queries have the same frequency, break ties by **ascending earliest timestamp**: for a given query string `q`, look at all indices `k` where `queries[k] == q` and take `min(timestamps[k])`; the query with the smaller earliest timestamp comes first.
Return, for each prefix `prefixes[j]`, the list of matching query strings sorted according to the above rules.
Assume:
- `1 <= n, m <= 2 * 10^5` (large enough that an \(O(nm)\) solution will time out).
- `1 <= len(queries[i]), len(prefixes[j]) <= 50`.
- `timestamps[i]` fit in a 32-bit signed integer.
You may return the result in any reasonable structure; for example, an array `result` of length `m`, where `result[j]` is a list of strings for prefix `prefixes[j]`.
**Example**
Input:
- `queries = ["apple", "app", "banana", "app", "application"]`
- `timestamps = [5, 3, 10, 7, 8]`
- `prefixes = ["ap", "app", "b"]`
Processing:
- For prefix `"ap"`:
- Matching queries: `"apple"`, `"app"`, `"app"`, `"application"`.
- Distinct query strings and frequencies:
- `"app"`: appears 2 times, earliest timestamp = `min(3, 7) = 3`.
- `"apple"`: appears 1 time, earliest timestamp = `5`.
- `"application"`: appears 1 time, earliest timestamp = `8`.
- Sorted: `"app"` (freq 2), then `"apple"` (freq 1, time 5), then `"application"` (freq 1, time 8).
- For prefix `"app"`:
- Matching queries: `"apple"`, `"app"`, `"app"`, `"application"` (same set as above since all start with "app").
- Sorted order is the same as for `"ap"`.
- For prefix `"b"`:
- Matching queries: `"banana"` only, with frequency 1 and earliest timestamp 10.
Output (conceptually):
- For `"ap"`: `["app", "apple", "application"]`
- For `"app"`: `["app", "apple", "application"]`
- For `"b"`: `["banana"]`
Quick Answer: This question evaluates understanding of string processing, frequency aggregation, earliest-timestamp tracking, and efficient prefix-based retrieval for large datasets.