Compute Averages of Unique Numbers in Dictionary Lists
Company: TikTok
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
##### Scenario
Python tech screen: given a dictionary mapping keys to numeric lists, e.g., {'a':[1,2,1],'b':[1,2,3]}, compute the average of each list after removing duplicates.
##### Question
Write Python code that takes any such dictionary and returns a new dictionary whose values are the average of the unique numbers in each original list.
##### Hints
Deduplicate each list (with set(values) or list(dict.fromkeys(values))), then take the mean.
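The two deduplication options in this hint can be sketched side by side. This is a minimal illustration, and the variable names are assumptions made for the example. Both approaches give the same mean; the only difference shown here is that dict.fromkeys keeps first-occurrence order, while a set makes no ordering guarantee.

```python
values = [1, 2, 1, 3, 2]

# set() removes duplicates but does not guarantee the original order.
unique_unordered = set(values)

# dict.fromkeys() removes duplicates and keeps first-occurrence order.
unique_ordered = list(dict.fromkeys(values))

# The mean does not depend on order, so both approaches agree here.
mean_from_set = sum(unique_unordered) / len(unique_unordered)
mean_from_ordered = sum(unique_ordered) / len(unique_ordered)

print(unique_ordered)     # [1, 2, 3]
print(mean_from_set)      # 2.0
print(mean_from_ordered)  # 2.0
```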
Quick Answer: This question evaluates data manipulation and aggregation skills, focusing on handling duplicate values within collections and computing numerical summaries of list data.
Given a dictionary mapping strings to lists of numbers (integers or floats), return a new dictionary mapping each key to the arithmetic mean of the unique numbers in its list. Duplicates within a list are ignored. If a list is empty, return 0.0 for that key.
##### Constraints
- 0 <= number of keys <= 10^4
- 0 <= length of each list <= 10^5
- Sum of lengths across all lists <= 2 * 10^5
- Values are integers or floats in the range [-1e9, 1e9]
- Integers and floats equal by value (e.g., 1 and 1.0) are considered duplicates
- If a list is empty, the average for that key is 0.0
##### Solution
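from typing import Dict, List


def average_of_unique(data: Dict[str, List[float]]) -> Dict[str, float]:
    """Map each key to the mean of the unique values in its list."""
    result: Dict[str, float] = {}
    for key, values in data.items():
        # A set drops duplicates; 1 and 1.0 compare equal, so they merge.
        unique_values = set(values)
        if unique_values:
            avg = sum(unique_values) / len(unique_values)
        else:
            # Empty list: the problem defines the average as 0.0.
            avg = 0.0
        result[key] = float(avg)
    return result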
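As a quick check, the same logic can be run on the example from the scenario, plus an empty list. The sketch below is self-contained, so it restates the approach as a dict comprehension; the name average_of_unique_compact and the extra key 'c' are assumptions made for illustration, not part of the original solution.

```python
from typing import Dict, List


def average_of_unique_compact(data: Dict[str, List[float]]) -> Dict[str, float]:
    # Hypothetical compact variant: deduplicate with a set, then average.
    # Empty lists map to 0.0, as the problem statement requires.
    return {
        key: (sum(unique) / len(unique) if unique else 0.0)
        for key, unique in ((k, set(v)) for k, v in data.items())
    }


sample = {'a': [1, 2, 1], 'b': [1, 2, 3], 'c': []}
result = average_of_unique_compact(sample)
print(result)  # {'a': 1.5, 'b': 2.0, 'c': 0.0}
```

For key 'a' the unique values are 1 and 2, so the mean is 1.5 rather than the 4/3 a plain average of [1, 2, 1] would give.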
##### Explanation
For each key, convert its list to a set to remove duplicates, then compute the arithmetic mean of the unique values. If the list is empty, define the average as 0.0. Building a set takes O(n) expected time per list. Because Python treats 1 and 1.0 as equal and gives them the same hash, equal integers and floats collapse into a single element.
Time complexity: O(N) where N is the total number of elements across all lists. Space complexity: O(U + K) where U is the maximum number of unique values in any one list and K is the number of keys.
##### Hints
- Use a set to deduplicate each list before averaging.
- Compute the mean as sum(unique)/len(unique); if unique is empty, use 0.0.
- Note that 1 and 1.0 are treated as the same value when deduplicating.
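The last two hints can be checked directly. This is a small sketch with made-up input values: it shows that a set merges 1 with 1.0 and 3 with 3.0, and that an empty set is falsy, which is what lets the 0.0 fallback work.

```python
mixed = [1, 1.0, 3, 3.0]

# 1 == 1.0 and hash(1) == hash(1.0), so a set keeps only one of each pair.
unique = set(mixed)
print(len(unique))  # 2

average = sum(unique) / len(unique)
print(average)  # 2.0

# An empty set is falsy, so the conditional falls back to 0.0.
empty_unique = set([])
empty_average = sum(empty_unique) / len(empty_unique) if empty_unique else 0.0
print(empty_average)  # 0.0
```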