Implement DFS with cycle detection and topo order
Company: Snowflake
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: hard
Interview Round: Technical Screen
You are given a directed graph G with N nodes labeled 1..N and M edges (u,v). In Python, implement two versions of depth-first search (DFS): (1) a recursive version and (2) an iterative version using your own stack. Both versions must:
- Return a topological ordering if G is a DAG; otherwise, detect and return any one directed cycle as an ordered list of nodes.
- Also compute and return discovery and finish timestamps for every node (timestamps start at 1 and increment per edge exploration in standard DFS order).
- Work correctly for disconnected graphs and graphs with self-loops and parallel edges.
- Run in O(N+M) time and O(N+M) space. For N up to 100,000, your iterative version must not rely on increasing Python's recursion limit.
Input format to your function: N (int), edges (list of pairs), adjacency list you build internally. Output format: a dict with keys {"is_dag": bool, "topo_order": list|None, "cycle": list|None, "discover": dict, "finish": dict}.
Edge cases to handle precisely:
- Parallel edges between the same nodes.
- A self-loop (v,v) should be reported as a cycle of length 1.
- Multiple valid topological orders are acceptable; multiple possible cycles are acceptable.
Example:
N=4, edges=[(1,2),(2,3),(3,4)] => is_dag=True, one valid topo_order=[1,2,3,4], cycle=None.
N=4, edges=[(1,2),(2,3),(3,2),(3,4)] => is_dag=False, cycle could be [2,3,2] or [3,2,3].
Explain your design choices (visited color scheme, parent tracking for cycle reconstruction) and provide unit tests for both dense and sparse graphs.
Quick Answer: This question evaluates mastery of graph algorithms—specifically depth-first search (both recursive and iterative), directed cycle detection, topological ordering, and discovery/finish timestamping—and falls under the Coding & Algorithms domain.