Find most frequent log user
Company: Amazon
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
##### Question
You are given a list of log entries in string format, where each entry contains a username and an access timestamp separated by a delimiter (e.g., "alice 2025-07-27T12:30:00Z"). Design an algorithm that processes the logs and returns the username that appears most frequently (i.e., the user with the highest number of accesses). If multiple users tie for the highest frequency, you may return any one of them. Discuss the time- and space-complexity of your solution.
Quick Answer: This question evaluates frequency-counting and log-parsing skills along with algorithmic reasoning about time and space complexity for identifying the most frequent element in a dataset.
You are given a list of log entry strings. Each entry is formatted as 'username timestamp' where username is the substring before the first whitespace and timestamp follows after at least one space (e.g., 'alice 2025-07-27T12:30:00Z'). Return the username that appears most frequently across all entries. If multiple users tie for the highest frequency, return any one of them. Usernames are case-sensitive. The input list is non-empty and every entry contains at least two whitespace-separated tokens.
Constraints
- 1 <= len(logs) <= 200000
- Each entry contains at least two tokens separated by whitespace
- Username has no whitespace
- Timestamps may be any strings without whitespace; their format is irrelevant
- Usernames are case-sensitive
- Return any one username if multiple have the same maximum frequency
Examples
Input:
Expected Output: alice
Input:
Expected Output: charlie
Solution
from typing import List, Dict
from collections import defaultdict
def most_frequent_user(logs: List[str]) -> str:
counts: Dict[str, int] = defaultdict(int)
for entry in logs:
user = entry.split()[0]
counts[user] += 1
return max(counts.items(), key=lambda kv: kv[1])[0]
Explanation
Count occurrences of each username using a dictionary where keys are usernames and values are counts. Extract the username as the first whitespace-separated token from each log entry. Finally, return the username with the maximum count; in case of ties, any one of the tied usernames may be returned.
Time complexity: O(n). Space complexity: O(k).
Hints
- Split each log entry by whitespace and use the first token as the username.
- Use a hash map (dictionary) to count the frequency of each username.
- After counting, select the username with the maximum count.