What difficulty level is this coding question?

This is a medium difficulty Coding & Algorithms question, commonly asked during Technical Screen rounds at NVIDIA.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at NVIDIA during technical interviews.

Compute top-N items from log stream | NVIDIA Coding Question

Quick Overview

This question evaluates the ability to aggregate frequencies and compute top-N results from logs, covering competencies in frequency counting, top-k algorithms, and streaming data handling within the Coding & Algorithms domain at a practical application level.

Company: NVIDIA

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

## Problem

You are given application logs containing events with an `itemId`. Each log line may contain extra fields, but you can extract the `itemId` from each line. Parse the logs, count how many times each `itemId` appears, then output the **top N** most frequent items.

## Input

- `logs`: a list of strings, each representing one log line.
- `N`: an integer.
- Each log line contains an item identifier (assume it can be extracted deterministically; e.g., a token like `item=<id>` or the first whitespace-separated field).

## Output

Return a list of the top `N` itemIds with their counts, sorted by:

1. descending count
2. for ties, ascending itemId (or specify and implement a deterministic tie-break).

## Constraints (reasonable assumptions)

- `1 <= len(logs) <= 10^6`
- Item IDs are strings or integers.
- `1 <= N <= number of distinct itemIds`

## Follow-ups

- How would you handle streaming logs (unbounded input)?
- How would you handle very large distinct counts (memory constraints)?
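The source leaves the follow-ups open. For the memory-constrained case, one commonly cited option is the Misra-Gries frequent-items summary, which keeps at most `k - 1` counters no matter how many distinct itemIds appear. The sketch below is an illustration only: the function name `misra_gries_summary` and the parameter `k` are hypothetical, and the result is approximate. Any itemId occurring more than `n / k` times in a stream of `n` items is guaranteed to survive in the summary, but the stored counts are lower bounds, not exact frequencies.

```python
from typing import Dict, Iterable


def misra_gries_summary(item_ids: Iterable[str], k: int) -> Dict[str, int]:
    """Approximate heavy hitters using at most k - 1 counters.

    Assumption: item_ids yields already-extracted itemIds, one per log line.
    It can be a generator, so the input never needs to fit in memory.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[str, int] = {}
    for item_id in item_ids:
        if item_id in counters:
            counters[item_id] += 1
        elif len(counters) < k - 1:
            counters[item_id] = 1
        else:
            # No free counter: decrement every tracked item and drop
            # the ones that reach zero. The new item is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

Because the counts are underestimates (by at most `n / k`), an exact top N would still need a second pass over the data or a different structure. Whether an approximate answer is acceptable is something to confirm with the interviewer.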

You are given application logs, where each log line contains an item identifier somewhere in the text. Parse the logs, count how many times each itemId appears, and return the top N most frequent items.

Extraction rule:

- Split each log line by whitespace.
- If any token starts with 'item=', then the itemId is the substring after 'item=' from the first such token.
- Otherwise, the itemId is the first whitespace-separated token.
- Ignore completely empty log lines.

Return the top N itemIds with their counts, sorted by:

1. descending count
2. ascending itemId (lexicographic string order) for ties

Item IDs are always treated as strings, even if they look numeric.
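The extraction rule above can be sketched as a small helper. The name `extract_item_id` is hypothetical, and two behaviors are assumptions the statement does not spell out: whitespace-only lines are treated like empty lines, and a bare `item=` token yields an empty-string itemId.

```python
from typing import Optional

ITEM_PREFIX = "item="


def extract_item_id(line: str) -> Optional[str]:
    """Return the itemId for one log line, or None if the line is empty."""
    tokens = line.split()  # splits on any run of whitespace
    if not tokens:
        # Assumption: whitespace-only lines are ignored like empty ones.
        return None
    for token in tokens:
        if token.startswith(ITEM_PREFIX):
            # The first token starting with 'item=' wins.
            return token[len(ITEM_PREFIX):]
    # No 'item=' token: fall back to the first token.
    return tokens[0]
```

Returning `None` for skipped lines keeps the "ignore" decision visible to the caller, which can then filter before counting.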

Constraints

0 <= len(logs) <= 10^6
0 <= N <= number of distinct extracted itemIds
Each non-empty log line can be split by whitespace into tokens
If a token starts with 'item=', use the text after 'item=' in the first such token as the itemId; otherwise use the first token
Tie-breaking must be deterministic: ascending lexicographic order of itemId

Examples

Input: (['ts=1 user=a item=apple action=view', 'ts=2 user=b item=banana action=buy', 'ts=3 user=a item=apple action=buy', 'ts=4 user=c item=banana action=view', 'ts=5 user=d item=banana action=view', 'ts=6 user=e item=carrot action=view'], 2)

Expected Output: [['banana', 3], ['apple', 2]]

Explanation: banana appears 3 times, apple appears 2 times, and carrot appears 1 time. The top 2 are banana and apple.

Input: (['alpha user=1', 'item=beta x=1', 'item=alpha x=2', 'beta user=2'], 2)

Expected Output: [['alpha', 2], ['beta', 2]]

Explanation: alpha and beta both appear twice. With equal counts, itemIds are ordered lexicographically, so alpha comes before beta.

Input: (['item=solo'], 1)

Expected Output: [['solo', 1]]

Explanation: There is only one extracted itemId, so it is the top 1 result.

Input: ([], 0)

Expected Output: []

Explanation: There are no logs and N is 0, so the result is an empty list.

Input: (['x=1 item=dog', 'cat x=2', 'item=dog y=1', 'item=ant z=5', 'cat z=6', 'item=ant t=7'], 2)

Expected Output: [['ant', 2], ['cat', 2]]

Explanation: ant, cat, and dog each appear twice. Ties are broken by ascending itemId, so the first two are ant and cat.

Hints

Use a hash map (dictionary) to count how many times each extracted itemId appears.
After counting, sort the distinct items with a custom key like (-count, itemId) to enforce both ranking rules.
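Putting the two hints together, a minimal sketch might look like the following. The function name `top_n_items` is hypothetical, and this is one straightforward reading of the statement, not the official solution. It counts with `collections.Counter` and sorts the distinct items by the key `(-count, itemId)`.

```python
from collections import Counter
from typing import List


def top_n_items(logs: List[str], n: int) -> List[list]:
    """Return the top n [itemId, count] pairs, ties broken by itemId."""
    counts: Counter = Counter()
    for line in logs:
        tokens = line.split()
        if not tokens:
            continue  # ignore empty lines
        # First token starting with 'item=' if any, else the first token.
        item_id = next(
            (t[len("item="):] for t in tokens if t.startswith("item=")),
            tokens[0],
        )
        counts[item_id] += 1
    # Negating the count sorts it descending; itemId then sorts ascending.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [[item_id, count] for item_id, count in ranked[:n]]
```

Sorting all distinct items costs O(D log D) for D distinct itemIds. When N is much smaller than D, `heapq.nsmallest(n, counts.items(), key=...)` with the same key is a possible alternative that avoids the full sort.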
