Build referral chains and detect cycles

Q: Build referral chains and detect cycles

This is a Coding & Algorithms interview question from Block (Square) for Data Scientist roles. View the full question and solution on PracHub.

Q: How do I approach Coding & Algorithms interview questions?

Coding & Algorithms questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master coding & algorithms interviews.

Question

You are given a directed referral graph where each user may have at most one referrer. Input file 'referrals.csv' has columns: user_id (INT), referred_by (INT, nullable). Assumptions: up to 1e6 users; referred_by may reference a user_id not present in the file; rows may contain duplicates; self-referrals and multi-node cycles may exist.

Write Python code to accomplish all of the following:

Implement chain(u) that returns the referral chain for user u from the earliest ancestor (root) to u as a list of user_ids. If any cycle is encountered on the path, detect it and return both: (a) the simple cycle nodes in encounter order, and (b) the acyclic prefix leading into the cycle; avoid infinite loops.
In O(n) time and O(n) memory overall, compute for every user: (root_ancestor[u], chain_depth[u]) where chain_depth[u] is the number of unique referrers on the path to the root. Do not recompute paths from scratch per user; use memoization/union-find/DFS with coloring (your choice) and justify complexity.
Return the top-3 longest valid acyclic chains as lists. Break ties by smaller root_ancestor id, then lexicographically smaller full chain list.
Robustness: describe (and implement) preprocessing to (a) deduplicate rows, keeping the earliest seen parent for a user if duplicates conflict, (b) normalize null/empty referred_by, (c) tolerate referred_by not found in user_id (treat as external root), and (d) flag self-referrals.
Provide minimal unit tests that cover: acyclic chain, self-cycle, and a 2-node cycle.

Tiny example (use exactly these rows for tests): referrals.csv user_id,referred_by 1, 2,1 3,1 4,2 5,4 6,5 7,7 8,9 9,8

Expected behaviors to assert (do not print answers here):

chain(6) traverses 1→2→4→5→6; depth(6)=4; root=1
chain(7) reports a self-cycle at 7
chain(8) reports the 2-node cycle {8,9}.

Build referral chains and detect cycles

Comments (0)