Find linked merchants by shared fields
Company: Stripe
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: medium
Interview Round: Technical Screen
Quick Answer: This question evaluates the ability to model and detect entity linkages via exact field matching and weighted scoring as well as to reason about reachability in a graph formed by those links, testing competencies in data modeling, graph algorithms, and handling null/empty values.
Find Linked Merchants — Part 1: Direct Matches by Shared Field
Constraints
- 1 <= number of merchants <= 10^5
- id values are unique integers
- name, email, phone are strings and may be empty ("") or null (None)
- an empty or null field never participates in a match
Examples
Input: ([{'id': 1, 'name': 'Acme', 'email': 'a@x.com', 'phone': '111'}, {'id': 2, 'name': 'Acme', 'email': 'b@x.com', 'phone': '222'}, {'id': 3, 'name': 'Beta', 'email': 'a@x.com', 'phone': '333'}, {'id': 4, 'name': 'Gamma', 'email': 'g@x.com', 'phone': '444'}], 1)
Expected Output: [2, 3]
Explanation: Merchant 2 shares name 'Acme' with target 1; merchant 3 shares email 'a@x.com'. Merchant 4 shares nothing.
Input: ([{'id': 1, 'name': '', 'email': '', 'phone': None}, {'id': 2, 'name': '', 'email': '', 'phone': None}], 1)
Expected Output: []
Explanation: All fields of both merchants are empty/null, so no comparison ever counts as a match.
Input: ([{'id': 1, 'name': 'Solo', 'email': 's@x.com', 'phone': '999'}], 1)
Expected Output: []
Explanation: Only the target exists; there are no other merchants to match.
Input: ([{'id': 1, 'name': 'A', 'email': None, 'phone': '111'}, {'id': 2, 'name': 'B', 'email': None, 'phone': '111'}, {'id': 3, 'name': 'A', 'email': None, 'phone': None}], 1)
Expected Output: [2, 3]
Explanation: Merchant 2 shares phone '111'; merchant 3 shares name 'A'. The null emails are ignored.
Input: ([{'id': 1, 'name': 'X', 'email': 'x@y.com', 'phone': '1'}, {'id': 2, 'name': 'Z', 'email': 'z@y.com', 'phone': '2'}], 99)
Expected Output: []
Explanation: Target id 99 does not exist, so the result is empty.
Hints
- First locate the target merchant by scanning for its id; if absent, return an empty list.
- For each other merchant, compare each of name/email/phone and skip any comparison where either side is empty or null.
- A single shared non-empty field is enough — you can stop checking that merchant as soon as one field matches.
Find Linked Merchants — Part 2: Weighted Scoring with Threshold
Constraints
- 1 <= number of merchants <= 10^5
- name_score, email_score, phone_score, threshold are non-negative integers
- an empty or null field contributes 0 (it can never match)
- a candidate is linked iff its total matched-field score >= threshold
Examples
Input: ([{'id': 1, 'name': 'Acme', 'email': 'a@x.com', 'phone': '111'}, {'id': 2, 'name': 'Acme', 'email': 'b@x.com', 'phone': '222'}, {'id': 3, 'name': 'Acme', 'email': 'a@x.com', 'phone': '333'}, {'id': 4, 'name': 'Beta', 'email': 'g@x.com', 'phone': '111'}], 1, 1, 3, 2, 4)
Expected Output: [3]
Explanation: Scores name=1,email=3,phone=2,threshold=4. Merchant 3 matches name(1)+email(3)=4>=4. Merchant 2 only name=1<4. Merchant 4 only phone=2<4.
Input: ([{'id': 1, 'name': 'Acme', 'email': 'a@x.com', 'phone': '111'}, {'id': 2, 'name': 'Acme', 'email': 'b@x.com', 'phone': '222'}], 1, 1, 3, 2, 1)
Expected Output: [2]
Explanation: Merchant 2 matches name only (score 1), which meets threshold 1.
Input: ([{'id': 1, 'name': 'A', 'email': 'a@x.com', 'phone': '1'}, {'id': 2, 'name': 'A', 'email': 'b@x.com', 'phone': '2'}], 1, 1, 3, 2, 2)
Expected Output: []
Explanation: Merchant 2 matches name only (score 1) which is below threshold 2, so nobody is linked.
Input: ([{'id': 1, 'name': '', 'email': None, 'phone': ''}, {'id': 2, 'name': '', 'email': None, 'phone': ''}], 1, 1, 3, 2, 1)
Expected Output: []
Explanation: All fields are empty/null so total_score is 0 < threshold 1.
Input: ([{'id': 1, 'name': 'A', 'email': 'a@x.com', 'phone': '1'}, {'id': 2, 'name': 'A', 'email': 'a@x.com', 'phone': '1'}, {'id': 3, 'name': 'B', 'email': 'b@x.com', 'phone': '2'}], 1, 1, 3, 2, 6)
Expected Output: [2]
Explanation: Merchant 2 matches all three fields: 1+3+2=6>=6. Merchant 3 matches nothing.
Input: ([{'id': 1, 'name': 'Solo', 'email': 's@x.com', 'phone': '9'}], 1, 1, 3, 2, 1)
Expected Output: []
Explanation: Only the target exists, so there are no candidates.
Hints
- Reuse the field-matching logic from Part 1, but instead of a boolean, accumulate the matching field's weight.
- Compare the accumulated total_score against threshold using >= (the boundary value counts as linked).
- Fields that are empty/null on either side add nothing to the score.
Find Linked Merchants — Part 3: Reachable Within 2 Hops
Constraints
- 1 <= number of merchants <= 10^4 (pairwise edge build is O(n^2))
- An edge exists iff the pair's matched-field score >= threshold (Part 2 rule)
- Distances are capped at 2 hops; farther nodes are excluded
- Cycles must not cause infinite loops; each reachable merchant appears once
Examples
Input: ([{'id': 1, 'name': 'A', 'email': 'a', 'phone': '111'}, {'id': 2, 'name': 'B', 'email': 'shared', 'phone': '111'}, {'id': 3, 'name': 'C', 'email': 'shared', 'phone': '333'}, {'id': 4, 'name': 'D', 'email': 'd', 'phone': '333'}], 1, 1, 3, 2, 2)
Expected Output: [2, 3]
Explanation: Edges (threshold 2): 1-2 via phone 111 (score 2), 2-3 via email 'shared' (score 3), 3-4 via phone 333 (score 2). From 1: hop1=2, hop2=3. Merchant 4 is 3 hops away, so it is excluded.
Input: ([{'id': 1, 'name': 'Solo', 'email': 's', 'phone': '9'}, {'id': 2, 'name': 'B', 'email': 'b', 'phone': '8'}, {'id': 3, 'name': 'B', 'email': 'b', 'phone': '8'}], 1, 1, 3, 2, 1)
Expected Output: []
Explanation: The target (1) shares no field with anyone, so it has no edges and nothing is reachable. Merchants 2 and 3 form a separate component.
Input: ([{'id': 1, 'name': 'Hub', 'email': 'h', 'phone': '1'}, {'id': 2, 'name': 'Hub', 'email': 'x', 'phone': '2'}, {'id': 3, 'name': 'Hub', 'email': 'y', 'phone': '3'}, {'id': 4, 'name': 'Other', 'email': 'z', 'phone': '4'}], 1, 1, 3, 2, 1)
Expected Output: [2, 3]
Explanation: Name match scores 1 >= threshold 1, so 1-2 and 1-3 are edges (shared name 'Hub'). Merchant 4 shares nothing with the target.
Input: ([{'id': 1, 'name': 'a', 'email': 'e1', 'phone': 'P'}, {'id': 2, 'name': 'b', 'email': 'e2', 'phone': 'P'}, {'id': 3, 'name': 'c', 'email': 'e3', 'phone': 'P'}], 1, 1, 3, 2, 2)
Expected Output: [2, 3]
Explanation: All three share phone 'P' (score 2 each), forming a triangle. From 1, both 2 and 3 are 1 hop away; the cycle does not cause an infinite loop.
Input: ([{'id': 1, 'name': 'A', 'email': 'a', 'phone': '111'}, {'id': 2, 'name': 'B', 'email': 'shared', 'phone': '111'}, {'id': 3, 'name': 'C', 'email': 'shared', 'phone': '333'}, {'id': 4, 'name': 'D', 'email': 'd', 'phone': '333'}], 99, 1, 3, 2, 2)
Expected Output: []
Explanation: Target id 99 does not exist, so the result is empty.
Input: ([{'id': 1, 'name': 'n1', 'email': 'shared1', 'phone': 'p1'}, {'id': 2, 'name': 'n2', 'email': 'shared1', 'phone': 'pX'}, {'id': 3, 'name': 'n3', 'email': 'e3', 'phone': 'pX'}], 1, 1, 3, 2, 2)
Expected Output: [2, 3]
Explanation: 1-2 via email 'shared1' (score 3), 2-3 via phone 'pX' (score 2). Merchant 3 is reached at exactly 2 hops through intermediate 2; there is no direct 1-3 edge.
Hints
- Build the undirected adjacency list by applying the Part 2 linking test to every pair of merchants.
- Run a BFS from the target, tracking each node's distance; stop expanding a node once its distance reaches 2.
- Mark nodes as visited (e.g. via the distance map) so a cycle never re-enqueues an already-seen merchant.
- Exclude the target (distance 0) from the result and return the remaining ids sorted.