PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates understanding of graph algorithms and data validation, specifically dependency analysis, cycle detection, and topological ordering.

  • Medium
  • Google
  • Coding & Algorithms
  • Software Engineer

Validate course catalog dependencies

Company: Google

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

Design a function to validate an e-learning course catalog. You are given: ( 1) a set of course IDs, and ( 2) a list of prerequisite pairs (u, v) meaning course v requires course u. Determine if the catalog is valid: no prerequisite references a non-existent course, and there are no cycles in the dependency graph. If valid, return true and a possible ordering of courses; otherwise return false and report the set of missing course references and at least one cycle you detect. Specify your algorithm, time and space complexity.

Quick Answer: This question evaluates understanding of graph algorithms and data validation, specifically dependency analysis, cycle detection, and topological ordering.

You are validating an e-learning course catalog. You are given: 1. `course_ids` — a list of distinct course identifiers (the catalog). 2. `prereqs` — a list of prerequisite pairs `(u, v)`, where each pair means "course `v` requires course `u`" (i.e. a directed edge `u -> v`, you must take `u` before `v`). Determine whether the catalog is valid. A catalog is **valid** when BOTH conditions hold: - **No missing references:** every course that appears in any prerequisite pair exists in `course_ids`. - **No cycles:** the dependency graph (over courses that exist in the catalog) is acyclic. Return a tuple `(valid, ordering, missing, cycle)`: - `valid` — `True` if and only if there are no missing references and no cycles. - `ordering` — a valid topological ordering of all catalog courses **only when `valid` is `True`**. When a cycle exists, `ordering` is `[]`. When the graph is acyclic but references are missing, `ordering` still holds a valid topological order of the existing courses (the graph itself is well-formed; only the references are dangling). - `missing` — the sorted list of distinct course IDs referenced by some prerequisite pair but absent from `course_ids` (empty list if none). - `cycle` — when a cycle exists, one concrete cycle as a list of course IDs where the first and last element are the same (e.g. `['A', 'B', 'C', 'A']`); otherwise `[]`. Resolve ties deterministically: when multiple courses are simultaneously available in the topological sort, emit them in ascending (lexicographic) order. State your algorithm and its time and space complexity.

Constraints

  • 0 <= number of courses <= 10^5
  • 0 <= number of prerequisite pairs <= 2 * 10^5
  • Course IDs are distinct strings within course_ids.
  • A prerequisite pair (u, v) may reference an ID not present in course_ids; such IDs must be reported in `missing`.
  • A pair (u, v) means u must be completed before v (directed edge u -> v).

Examples

Input: (['A', 'B', 'C'], [('A', 'B'), ('B', 'C')])

Expected Output: (True, ['A', 'B', 'C'], [], [])

Explanation: Linear chain A->B->C is acyclic with all references present, so it is valid and the unique topological order is [A, B, C].

Input: (['A', 'B', 'C'], [('A', 'B'), ('B', 'C'), ('C', 'A')])

Expected Output: (False, [], [], ['A', 'B', 'C', 'A'])

Explanation: A->B->C->A forms a cycle, so the catalog is invalid, no ordering is returned, and one detected cycle is A->B->C->A.

Input: (['A', 'B'], [('A', 'B'), ('A', 'X')])

Expected Output: (False, ['A', 'B'], ['X'], [])

Explanation: X is referenced as requiring A but is not in the catalog, so it is a missing reference. The existing graph (A->B) is acyclic, so an ordering of existing courses is still produced, but the catalog is invalid.

Input: ([], [])

Expected Output: (True, [], [], [])

Explanation: An empty catalog with no prerequisites is trivially valid with an empty ordering.

Input: (['A'], [])

Expected Output: (True, ['A'], [], [])

Explanation: A single course with no prerequisites is valid; its ordering is just [A].

Input: (['A', 'B', 'C', 'D'], [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D')])

Expected Output: (True, ['A', 'B', 'C', 'D'], [], [])

Explanation: A diamond DAG (A before B and C, both before D). With lexicographic tie-breaking the order is [A, B, C, D].

Input: (['A', 'B'], [('A', 'B'), ('B', 'A')])

Expected Output: (False, [], [], ['A', 'B', 'A'])

Explanation: A and B require each other, forming a 2-node cycle. The catalog is invalid and one detected cycle is A->B->A.

Hints

  1. Separate the two failure modes: a missing reference (an ID in some pair that is not in the catalog) and a cycle (a circular dependency). Scan all pairs once to collect missing IDs before touching the graph.
  2. Build the dependency graph only over courses that actually exist, with edge u -> v for each pair (u, v). Track in-degree per node.
  3. Use Kahn's algorithm (BFS on in-degree-zero nodes). If the produced ordering contains every catalog course, the graph is acyclic; if it is shorter, a cycle exists among the nodes that never reached in-degree zero.
  4. To report one concrete cycle, run a DFS restricted to the nodes that still have positive in-degree after Kahn's sort; the first gray (on-stack) node you revisit closes a cycle — slice it out of the current DFS path.
  5. For deterministic output, break ties in lexicographic order whenever multiple nodes are simultaneously ready.
Last updated: Jun 25, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Infection Spread on a Grid (Cellular Automaton) - Google (hard)
  • Boolean Expression Tree with Leaf Flips - Google (medium)
  • Streaming Points: Remove Any Pair Within a Distance - Google (medium)
  • Most Active Users in a Live Communication Stream - Google (medium)
  • Solve Rooms and Top-K Streams - Google (medium)