Find common friends from directed edges

Q: Find common friends from directed edges

This is a Data Manipulation (SQL/Python) interview question from Microsoft for Data Scientist roles. View the full question and solution on PracHub.

Q: How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master data manipulation (sql/python) interviews.

Question

You have a directed edge list that records who followed whom. A mutual “friendship” exists only if both directions appear (A→B and B→A). Schema and sample data:

Schema: FriendEdges(id INT PRIMARY KEY, user_from VARCHAR(10), user_to VARCHAR(10))

Sample rows: id | user_from | user_to 1 | A | B 2 | B | A 3 | A | C 4 | C | A 5 | A | D 6 | D | A 7 | B | C 8 | C | B

Tasks:

Write a single SQL query that derives an undirected friendship table Friends(u1, u2) containing one row per friendship with u1 < u2, based on reciprocal edges in FriendEdges.
Using only FriendEdges (no temporary tables), write a single SQL query that returns, for every unordered pair of distinct users (x, y) with x < y, all of their common friends f (users who are friends with both x and y). Output columns: user1, user2, common_friend. Exclude the pair themselves from being counted as their own friend.
Extend (2) to also return, per pair (x, y), the count of distinct common friends. Ensure no duplicates even if multiple reciprocal edges are present.
Explain the indexes you would add on FriendEdges to make (2) performant on 100M rows, and why.

Find common friends from directed edges

Comments (0)