Return Top K Relevant Apps
Company: Microsoft
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Onsite
Quick Answer: This question evaluates algorithmic problem-solving around aggregation and ranking, testing skills in computing aggregated relevance scores, deterministic tie-breaking, and efficient selection of top-k items.
Constraints
- 0 <= number of applications <= 10^5
- 0 <= total number of keyword entries across all applications <= 2 * 10^5
- k may be 0 or larger than the number of applications
- Relevance scores are real numbers, and an application with no keywords has total relevance 0
Examples
Input: ({"github": {"diff": 0.3, "code review": 0.8}, "jira": {"ticket": 0.7, "workflow": 0.4}, "slack": {"chat": 0.5}}, 2)
Expected Output: ["github", "jira"]
Explanation: github and jira both have total relevance 1.1, which is higher than slack's 0.5. The tie is broken lexicographically, so github comes before jira.
Input: ({"zoom": {"meet": 0.5}, "asana": {"task": 0.5}, "notion": {"doc": 0.7}}, 2)
Expected Output: ["notion", "asana"]
Explanation: notion has the highest total relevance at 0.7. asana and zoom are tied at 0.5, and asana comes first lexicographically.
Input: ({"figma": {"design": 0.9}, "miro": {}}, 5)
Expected Output: ["figma", "miro"]
Explanation: figma has total relevance 0.9, while miro has no keywords and totals 0. Since k exceeds the number of applications, return all of them in ranked order.
Input: ({}, 3)
Expected Output: []
Explanation: There are no applications, so the result is an empty list.
Input: ({"alpha": {"x": 1.0}, "beta": {"y": 2.0}}, 0)
Expected Output: []
Explanation: When k is 0, no applications should be returned.
Hints
- First reduce each application to a single total score by summing its keyword relevance values.
- After computing totals, think about ordering pairs like (app_name, total_score) by total descending and name ascending; a size-k heap can avoid sorting everything.