Design a Product Catalog Service with Hierarchical Categories
Context
Design a read-heavy product catalog for an e-commerce application. Categories are hierarchical (tree) and implemented using a self-referencing relational table. Products can belong to one or more categories. The service must expose read/write APIs, support efficient subtree queries, and scale for high-traffic browsing.
Assumptions (you may refine during discussion):
-
Primary database is relational (e.g., PostgreSQL/MySQL).
-
Category updates are infrequent compared to reads; product updates are moderate; product browsing is heavy.
-
Soft-deletes preferred; strong consistency for writes within the database; eventually consistent caches/search.
Requirements
-
Define core APIs (read/write):
-
createCategory, updateCategory, moveCategory, deleteCategory
-
listChildren, getBreadcrumbs, getSubtreeSummary
-
createProduct, updateAttributes
-
attachProductToCategory, detachProductFromCategory
-
search/filter/paginate products within a category (optionally include descendants)
-
Propose a relational schema:
-
Categories table with self-join: id, parent_id, name, slug, path/materialized identifier, sort_order
-
Products table: id, sku, name, attributes JSON, status
-
ProductCategory join table
-
Describe indexes for fast tree traversal and product listing
-
Explain subtree query strategies and trade-offs:
-
Adjacency list with recursive CTEs
-
Materialized paths
-
Closure tables (ancestor/descendant pairs)
-
Address scaling:
-
Caching hot category pages and product lists
-
Pagination approach
-
Denormalized search indexes
-
Consistency choices
-
Write patterns (bulk imports)
-
Multi-region read replicas
-
Cover data integrity and change management:
-
Prevent cycles; enforce referential integrity
-
API versioning
-
Rollout plans for schema changes