Compute top-N items from log stream
Company: NVIDIA
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates the ability to aggregate frequencies and compute top-N results from logs, covering competencies in frequency counting, top-k algorithms, and streaming data handling within the Coding & Algorithms domain at a practical application level.
Constraints
- 0 <= len(logs) <= 10^6
- 0 <= N <= number of distinct extracted itemIds
- Each non-empty log line can be split by whitespace into tokens
- If a token starts with 'item=', use the text after '=' as the itemId; otherwise use the first token
- Tie-breaking must be deterministic: ascending lexicographic order of itemId
Examples
Input: (['ts=1 user=a item=apple action=view', 'ts=2 user=b item=banana action=buy', 'ts=3 user=a item=apple action=buy', 'ts=4 user=c item=banana action=view', 'ts=5 user=d item=banana action=view', 'ts=6 user=e item=carrot action=view'], 2)
Expected Output: [['banana', 3], ['apple', 2]]
Explanation: banana appears 3 times, apple appears 2 times, and carrot appears 1 time. The top 2 are banana and apple.
Input: (['alpha user=1', 'item=beta x=1', 'item=alpha x=2', 'beta user=2'], 2)
Expected Output: [['alpha', 2], ['beta', 2]]
Explanation: alpha and beta both appear twice. With equal counts, itemIds are ordered lexicographically, so alpha comes before beta.
Input: (['item=solo'], 1)
Expected Output: [['solo', 1]]
Explanation: There is only one extracted itemId, so it is the top 1 result.
Input: ([], 0)
Expected Output: []
Explanation: There are no logs and N is 0, so the result is an empty list.
Input: (['x=1 item=dog', 'cat x=2', 'item=dog y=1', 'item=ant z=5', 'cat z=6', 'item=ant t=7'], 2)
Expected Output: [['ant', 2], ['cat', 2]]
Explanation: ant, cat, and dog each appear twice. Ties are broken by ascending itemId, so the first two are ant and cat.
Hints
- Use a hash map (dictionary) to count how many times each extracted itemId appears.
- After counting, sort the distinct items with a custom key like (-count, itemId) to enforce both ranking rules.