PracHub
QuestionsCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates mastery of graph algorithms—specifically depth-first search (both recursive and iterative), directed cycle detection, topological ordering, and discovery/finish timestamping—and falls under the Coding & Algorithms domain.

  • hard
  • Snowflake
  • Coding & Algorithms
  • Data Scientist

Implement DFS with cycle detection and topo order

Company: Snowflake

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: hard

Interview Round: Technical Screen

You are given a directed graph G with N nodes labeled 1..N and M edges (u,v). In Python, implement two versions of depth-first search (DFS): (1) a recursive version and (2) an iterative version using your own stack. Both versions must: - Return a topological ordering if G is a DAG; otherwise, detect and return any one directed cycle as an ordered list of nodes. - Also compute and return discovery and finish timestamps for every node (timestamps start at 1 and increment per edge exploration in standard DFS order). - Work correctly for disconnected graphs and graphs with self-loops and parallel edges. - Run in O(N+M) time and O(N+M) space. For N up to 100,000, your iterative version must not rely on increasing Python's recursion limit. Input format to your function: N (int), edges (list of pairs), adjacency list you build internally. Output format: a dict with keys {"is_dag": bool, "topo_order": list|None, "cycle": list|None, "discover": dict, "finish": dict}. Edge cases to handle precisely: - Parallel edges between the same nodes. - A self-loop (v,v) should be reported as a cycle of length 1. - Multiple valid topological orders are acceptable; multiple possible cycles are acceptable. Example: N=4, edges=[(1,2),(2,3),(3,4)] => is_dag=True, one valid topo_order=[1,2,3,4], cycle=None. N=4, edges=[(1,2),(2,3),(3,2),(3,4)] => is_dag=False, cycle could be [2,3,2] or [3,2,3]. Explain your design choices (visited color scheme, parent tracking for cycle reconstruction) and provide unit tests for both dense and sparse graphs.

Quick Answer: This question evaluates mastery of graph algorithms—specifically depth-first search (both recursive and iterative), directed cycle detection, topological ordering, and discovery/finish timestamping—and falls under the Coding & Algorithms domain.

You are given a directed graph G with N nodes labeled 1..N and a list of M directed edges (u, v). Implement `solution(N, edges)` that builds an adjacency list internally and runs a depth-first search to produce: - `is_dag` (bool): True if G is acyclic. - `topo_order` (list | None): a valid topological ordering if G is a DAG, else None. - `cycle` (list | None): if G has a cycle, any one directed cycle as an ordered list of nodes that starts and ends at the same node (e.g. [2, 3, 2]); else None. A self-loop (v, v) must be reported as [v, v]. - `discover` (dict): map node -> discovery timestamp. - `finish` (dict): map node -> finish timestamp. Timestamps follow standard CLRS DFS timing: a single global clock starts at 1 and increments by 1 on each vertex discovery and each vertex finish. The graph may be disconnected, may contain self-loops, and may contain parallel edges. The implementation must run in O(N+M) time and O(N+M) space and must not rely on increasing the recursion limit (use an explicit stack so it works for N up to 100,000). To make the output deterministic for grading, process root nodes in ascending label order and scan each node's neighbors in ascending order. Examples: - N=4, edges=[(1,2),(2,3),(3,4)] -> is_dag=True, topo_order=[1,2,3,4], cycle=None. - N=4, edges=[(1,2),(2,3),(3,2),(3,4)] -> is_dag=False, cycle=[2,3,2].

Constraints

  • 1 <= N <= 100000
  • 0 <= M <= 2*10^5
  • 1 <= u, v <= N for every edge (u, v)
  • Self-loops (u == v) and parallel edges are allowed
  • The graph may be disconnected
  • The iterative DFS must not rely on raising Python's recursion limit

Examples

Input: (4, [(1, 2), (2, 3), (3, 4)])

Expected Output: {'is_dag': True, 'topo_order': [1, 2, 3, 4], 'cycle': None, 'discover': {1: 1, 2: 2, 3: 3, 4: 4}, 'finish': {1: 8, 2: 7, 3: 6, 4: 5}}

Explanation: Simple DAG chain 1->2->3->4. Valid topo order [1,2,3,4]; no cycle. CLRS timestamps increment on each discover and finish.

Input: (4, [(1, 2), (2, 3), (3, 2), (3, 4)])

Expected Output: {'is_dag': False, 'topo_order': None, 'cycle': [2, 3, 2], 'discover': {1: 1, 2: 2, 3: 3, 4: 4}, 'finish': {1: 8, 2: 7, 3: 6, 4: 5}}

Explanation: Edge 3->2 is a back edge (2 is GRAY ancestor of 3), so a directed cycle [2,3,2] is reported and topo_order is None.

Input: (3, [(2, 2)])

Expected Output: {'is_dag': False, 'topo_order': None, 'cycle': [2, 2], 'discover': {1: 1, 2: 3, 3: 5}, 'finish': {1: 2, 2: 4, 3: 6}}

Explanation: Self-loop on node 2: when scanning 2's neighbors, 2 is still GRAY, so the cycle is reported as [2,2].

Input: (3, [(1, 2), (1, 2), (2, 3)])

Expected Output: {'is_dag': True, 'topo_order': [1, 2, 3], 'cycle': None, 'discover': {1: 1, 2: 2, 3: 3}, 'finish': {1: 6, 2: 5, 3: 4}}

Explanation: Parallel edges (1->2 twice) are processed but the second is ignored once 2 is no longer WHITE; the graph is still a DAG.

Input: (5, [(1, 2), (3, 4)])

Expected Output: {'is_dag': True, 'topo_order': [5, 3, 4, 1, 2], 'cycle': None, 'discover': {1: 1, 2: 2, 3: 5, 4: 6, 5: 9}, 'finish': {1: 4, 2: 3, 3: 8, 4: 7, 5: 10}}

Explanation: Disconnected graph: components {1,2}, {3,4}, and isolated node 5. DFS starts from each WHITE root in ascending order; timestamps continue across components.

Input: (1, [])

Expected Output: {'is_dag': True, 'topo_order': [1], 'cycle': None, 'discover': {1: 1}, 'finish': {1: 2}}

Explanation: Single node, no edges. Trivial DAG: discover=1, finish=2, topo_order=[1].

Input: (2, [(1, 2), (2, 1)])

Expected Output: {'is_dag': False, 'topo_order': None, 'cycle': [1, 2, 1], 'discover': {1: 1, 2: 2}, 'finish': {1: 4, 2: 3}}

Explanation: Two-node cycle 1->2->1. Edge 2->1 hits GRAY ancestor 1, yielding cycle [1,2,1].

Hints

  1. Use the white/gray/black coloring scheme: WHITE=undiscovered, GRAY=on the current DFS stack, BLACK=finished. An edge to a GRAY node is a back edge => a directed cycle.
  2. Run an explicit-stack iterative DFS where each stack frame stores (node, next_neighbor_index) so you can pause and resume the neighbor scan without recursion.
  3. Keep one global clock; increment it on discovery and again on finish. The topological order (for a DAG) is the reverse of the order in which nodes finish.
  4. Track a parent pointer when you first discover a node. On a back edge u->v, walk parents from u up to v and append v to reconstruct the cycle; a self-loop is the special case u == v giving [u, u].
Last updated: Jun 26, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Implement a JSON Parser - Snowflake (hard)
  • Solve Matrix and Array Distance Problems - Snowflake (medium)
  • Solve Array Distance and Wiki Navigation - Snowflake (medium)
  • Implement Document Predicate APIs - Snowflake (medium)
  • Find Shortest Wiki Click Path - Snowflake (medium)