PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/System Design/Instacart

Design product catalog service

Last updated: Jun 15, 2026

Quick Overview

Instacart software-engineer onsite system design: design a product catalog service for a large e-commerce/grocery platform. Covers a self-referential category hierarchy, CRUD/browse/search APIs, the relational schema and indexes, subtree query strategies (adjacency CTE vs. materialized path vs. closure table), caching, consistency via outbox/CDC, and scaling to ~10M products at ~1k QPS.

  • hard
  • Instacart
  • System Design
  • Software Engineer

Design product catalog service

Company: Instacart

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

##### Question Design a product catalog service for a large e-commerce / grocery platform. The catalog must support a hierarchical category tree implemented with a self-referential (self-join) `categories` table, products that can belong to multiple categories, and product variants/SKUs with flexible attributes. Walk through the full design, covering each of the following: 1. **High-level APIs.** Define the read/write endpoints: product CRUD; category CRUD with hierarchy operations (`createCategory`, `moveCategory`, `listChildren`, `getBreadcrumbs`); variant/SKU CRUD; product-to-category assignment; browse products within a category (with the option to include descendants); attribute/facet search and filter; product detail fetch; and asynchronous bulk import with job tracking. Show example request/response payloads, pagination, and filtering. 2. **Relational schema (with the category self-join).** Specify tables, key fields, constraints, and indexes: a `categories` table that self-references via `parent_id` (plus `name`, `slug`/`path`, `sort_order`, `depth`); a `products` table; a `product_categories` many-to-many join; `product_variants` (size/color, price); and a flexible attribute model (JSONB or EAV) for filterable attributes. Discuss the indexes required for fast tree traversal and product listing. 3. **Subtree query strategies and trade-offs.** Explain how to efficiently query a full category subtree and breadcrumbs: adjacency list with recursive CTEs vs. materialized path (e.g., Postgres `ltree`) vs. closure tables (and why nested sets are usually avoided). Discuss the read/write trade-offs and when to pick each. 4. **Caching, consistency, and the search index.** Cover object/CDN/search caching with versioned keys and invalidation; the consistency model (transactional writes in the DB, eventual consistency for caches and the denormalized search index kept in sync via outbox/CDC); optimistic locking and idempotency. 5. **Scaling.** Show how the design scales to ~10M products at ~1k QPS reads (read-heavy): read replicas and multi-region reads, partitioning, denormalization of hot attributes, bulk-import write patterns, and pagination (keyset, not OFFSET). 6. **Data integrity, versioning, and rollout.** Cover cycle prevention, referential constraints, API versioning, and a zero-downtime (expand/contract) plan for schema changes.

Quick Answer: Instacart software-engineer onsite system design: design a product catalog service for a large e-commerce/grocery platform. Covers a self-referential category hierarchy, CRUD/browse/search APIs, the relational schema and indexes, subtree query strategies (adjacency CTE vs. materialized path vs. closure table), caching, consistency via outbox/CDC, and scaling to ~10M products at ~1k QPS.

Related Interview Questions

  • Design Asset Storage System - Instacart (medium)
  • Design an inventory management system - Instacart (medium)
  • Design cloud storage with quotas and compression - Instacart (hard)
  • Design scalable inventory system and avoid races - Instacart (hard)
  • Design an inventory system - Instacart (medium)
Instacart logo
Instacart
Aug 4, 2025, 10:55 AM
Software Engineer
Onsite
System Design
54
0
Question

Design a product catalog service for a large e-commerce / grocery platform. The catalog must support a hierarchical category tree implemented with a self-referential (self-join) categories table, products that can belong to multiple categories, and product variants/SKUs with flexible attributes. Walk through the full design, covering each of the following:

  1. High-level APIs. Define the read/write endpoints: product CRUD; category CRUD with hierarchy operations ( createCategory , moveCategory , listChildren , getBreadcrumbs ); variant/SKU CRUD; product-to-category assignment; browse products within a category (with the option to include descendants); attribute/facet search and filter; product detail fetch; and asynchronous bulk import with job tracking. Show example request/response payloads, pagination, and filtering.
  2. Relational schema (with the category self-join). Specify tables, key fields, constraints, and indexes: a categories table that self-references via parent_id (plus name , slug / path , sort_order , depth ); a products table; a product_categories many-to-many join; product_variants (size/color, price); and a flexible attribute model (JSONB or EAV) for filterable attributes. Discuss the indexes required for fast tree traversal and product listing.
  3. Subtree query strategies and trade-offs. Explain how to efficiently query a full category subtree and breadcrumbs: adjacency list with recursive CTEs vs. materialized path (e.g., Postgres ltree ) vs. closure tables (and why nested sets are usually avoided). Discuss the read/write trade-offs and when to pick each.
  4. Caching, consistency, and the search index. Cover object/CDN/search caching with versioned keys and invalidation; the consistency model (transactional writes in the DB, eventual consistency for caches and the denormalized search index kept in sync via outbox/CDC); optimistic locking and idempotency.
  5. Scaling. Show how the design scales to ~10M products at ~1k QPS reads (read-heavy): read replicas and multi-region reads, partitioning, denormalization of hot attributes, bulk-import write patterns, and pagination (keyset, not OFFSET).
  6. Data integrity, versioning, and rollout. Cover cycle prevention, referential constraints, API versioning, and a zero-downtime (expand/contract) plan for schema changes.

Solution

Show

Submit Your Answer

Sign in to leave a comment

Loading comments...

Browse More Questions

More System Design•More Instacart•More Software Engineer•Instacart Software Engineer•Instacart System Design•Software Engineer System Design
PracHub

Master your tech interviews with 8,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.