Find top video category by average time
Company: Pinterest
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
You are given a pandas DataFrame 'pins' with columns [pin_id: int, category_id: int, time_spent_sec: float, pin_format: string] and a dict 'category_map' mapping category_id -> category_name. Write a Python function that returns a tuple (category_name, avg_time) for the category with the highest average time_spent_sec among rows where pin_format == 'video'. Requirements: exclude rows whose category_id is null/NaN or whose time_spent_sec is nonpositive (zero or negative); break ties by the lexicographically smallest category_name; run in O(n) time with O(k) extra space, where k is the number of distinct categories; round avg_time to two decimal places.
Sample input:
'pins'
pin_id | category_id | time_spent_sec | pin_format
1 | 10 | 12 | video
2 | 10 | 20 | static
3 | 11 | 30 | video
4 | null | 25 | video
5 | 11 | 0 | video
category_map = {10: "Food", 11: "Travel"}
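Expected output on sample: ("Travel", 30.00). Pin 2 is dropped because it is not a video, pin 4 because its category_id is null, and pin 5 because its time is 0. That leaves Food with an average of 12.00 (pin 1) and Travel with an average of 30.00 (pin 3).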
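One way to meet the O(n) time and O(k) space requirements is a single pass that keeps a running sum and count per category, followed by a scan over the k categories. The sketch below is a minimal illustration, not a reference solution. The function name top_video_category is hypothetical, and three behaviors are assumptions the problem statement does not settle: category IDs missing from category_map are skipped, None is returned when no row qualifies, and ties are compared on the unrounded averages.

```python
import pandas as pd


def top_video_category(pins, category_map):
    """Return (category_name, avg_time) for the top video category, or None."""
    # category_id -> [running sum of time, row count]; O(k) extra space.
    totals = {}
    rows = zip(pins["category_id"], pins["time_spent_sec"], pins["pin_format"])
    for cat_id, secs, fmt in rows:
        # Keep only video rows with a known category and strictly positive time.
        if fmt != "video" or pd.isna(cat_id) or pd.isna(secs) or secs <= 0:
            continue
        # A column containing NaN is stored as float, so normalize the key to int.
        acc = totals.setdefault(int(cat_id), [0.0, 0])
        acc[0] += secs
        acc[1] += 1

    best = None  # (name, unrounded average)
    for cat_id, (total, count) in totals.items():
        name = category_map.get(cat_id)
        if name is None:
            # Assumption: IDs without a name in category_map are ignored.
            continue
        avg = total / count
        # Higher average wins; on an exact tie the smaller name wins.
        if best is None or avg > best[1] or (avg == best[1] and name < best[0]):
            best = (name, avg)

    # Assumption: return None when no row survives the filters.
    return None if best is None else (best[0], round(best[1], 2))
```

Breaking ties inside the final scan avoids sorting the categories, so the whole function stays linear in the number of rows plus the number of categories.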
Quick Answer: This question evaluates a candidate's ability to perform data manipulation and aggregation in Python (pandas), covering skills such as filtering, handling nulls and nonpositive values, mapping categorical IDs to names, rounding numerical results, and applying tie-breaking rules.
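The same skills (filtering, null handling, ID-to-name mapping, tie-breaking, rounding) can also be expressed with vectorized pandas operations. The following is a sketch under the same assumptions as any hand-rolled version would need: the function name top_video_category_pandas is hypothetical, unmapped category IDs are skipped, and None is returned when nothing qualifies. It relies on groupby being hash-based, which is expected to keep the aggregation roughly linear in the number of rows, and it avoids sorting by picking the winner with a single min over the k group means.

```python
import pandas as pd


def top_video_category_pandas(pins, category_map):
    """Vectorized variant: boolean mask, groupby mean, then an O(k) selection."""
    # NaN compares as False in both `== "video"` and `> 0`, so nulls drop out.
    mask = (
        (pins["pin_format"] == "video")
        & pins["category_id"].notna()
        & (pins["time_spent_sec"] > 0)
    )
    video = pins.loc[mask, ["category_id", "time_spent_sec"]]
    means = video.groupby("category_id")["time_spent_sec"].mean()

    # Map IDs to names in plain Python; int() normalizes float keys like 10.0.
    # Assumption: IDs that are absent from category_map are skipped.
    candidates = [
        (category_map[int(cat_id)], float(avg))
        for cat_id, avg in means.items()
        if int(cat_id) in category_map
    ]
    if not candidates:
        return None  # assumption: no qualifying rows yields None

    # Smallest (-avg, name) means highest average, then smallest name.
    name, avg = min(candidates, key=lambda item: (-item[1], item[0]))
    return name, round(avg, 2)
```

The key (-avg, name) encodes both ordering rules in one comparison, which keeps the tie-break explicit without a separate sort step.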