PracHub

Design under vague distributed requirements

Last updated: Mar 29, 2026

Quick Overview

This question evaluates expertise in distributed system architecture, metadata and schema management, transactional DDL semantics, consistency and replication strategies, API and data-model design, multi-tenant isolation, observability, security, and capacity planning.

Company: Snowflake

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

You are given a vague distributed-systems prompt. Show how you would clarify requirements and then design the system end to end: define the core use cases and SLAs, propose external APIs, select a data model, choose a sharding and replication strategy, justify a consistency model (strong vs eventual), and explain read/write paths. Describe failure handling (timeouts, retries, idempotency), leader election, backpressure, hot-key mitigation, caching, schema evolution, observability (metrics, logs, tracing), security, and capacity planning. Present a high-level architecture and discuss the key trade-offs you made.

Sep 6, 2025, 12:00 AM

System Design Prompt: Distributed Metadata Catalog and Schema Registry

Context

Design a multi-tenant distributed Metadata Catalog and Schema Registry for an analytics platform. The service manages databases, schemas, tables, columns, views, roles, and grants. It must support high read throughput, transactional DDL updates, and change notifications to downstream systems (e.g., query planner, caches).

Begin by clarifying requirements (functional and non-functional), then design the system end to end.
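
One way to make the "transactional DDL updates" and "change notifications" requirements concrete is a versioned entity store with compare-and-set writes and an append-only change log. The following is a minimal in-memory sketch under stated assumptions, not a proposed implementation: the names `Catalog`, `put`, `changes_since`, `VersionConflict`, the slash-separated key format, and the idempotency-key parameter are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


class VersionConflict(Exception):
    """Raised when the caller's expected version does not match the stored one."""


@dataclass
class Catalog:
    # key -> (version, value); keys are hypothetical "tenant/db/schema/table" paths
    entities: dict = field(default_factory=dict)
    # append-only change log of (sequence number, key, new version)
    log: list = field(default_factory=list)
    # idempotency key -> version produced by the first successful call
    seen: dict = field(default_factory=dict)

    def put(self, key, value, expected_version, idempotency_key):
        # A retried request returns the original result instead of writing twice.
        if idempotency_key in self.seen:
            return self.seen[idempotency_key]
        current = self.entities.get(key, (0, None))[0]
        if current != expected_version:
            raise VersionConflict(
                f"{key}: expected version {expected_version}, found {current}"
            )
        new_version = current + 1
        self.entities[key] = (new_version, value)
        self.log.append((len(self.log) + 1, key, new_version))
        self.seen[idempotency_key] = new_version
        return new_version

    def changes_since(self, seq):
        # Subscribers poll with the last sequence number they processed.
        return [entry for entry in self.log if entry[0] > seq]


# Example: create a table definition, then alter it with a conditional update.
catalog = Catalog()
v1 = catalog.put("acme/sales/public/orders", {"columns": ["id"]}, 0, "req-1")
v2 = catalog.put("acme/sales/public/orders", {"columns": ["id", "total"]}, v1, "req-2")
```

In this sketch a version of 0 means "does not exist yet", so a create is just a conditional update against version 0. A real design would also need durable storage, multi-entity transactions, and log retention, all of which are left out here.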

Tasks

  1. Clarify requirements
    • Enumerate core use cases and access patterns.
    • Define SLAs/SLOs (latency, availability, durability, multi-region needs, data retention).
    • Specify consistency expectations and isolation levels.
    • Identify multi-tenancy constraints and per-tenant limits.
  2. External APIs
    • Propose REST/gRPC endpoints for CRUD on entities, conditional updates, transactions, and change-subscription.
    • Show request/response shapes at a high level, including idempotency and versioning.
  3. Data model
    • Define entities and relationships (normalized vs denormalized).
    • Describe versioning, soft deletes, and change-log design.
  4. Sharding and replication strategy
    • Explain partitioning keys and how to minimize cross-shard transactions.
    • Choose replication factor, placement (AZ/region), and read/write topology.
  5. Consistency model
    • Justify strong vs eventual consistency per operation type.
    • Describe transaction approach (single-shard vs multi-shard).
  6. Read/write paths
    • Describe end-to-end request flow for reads and writes, including cache interaction and index maintenance.
  7. Failure handling
    • Timeouts, retries, idempotency, partial-failure handling, dead-letter policies.
    • Leader election and membership changes.
  8. Load management
    • Backpressure/admission control, hot-key mitigation, rate limiting.
    • Caching strategy (L1/L2, invalidation/signaling).
  9. Schema evolution
    • Compatibility rules, online migrations, and rolling upgrades.
  10. Observability
    • Key metrics, logs, tracing, and alerting tied to SLOs/error budgets.
  11. Security and compliance
    • Authentication, authorization (RBAC), encryption, audit logging, tenant isolation.
  12. Capacity planning
    • Provide a simple sizing model with example numbers (QPS, storage, replication overhead, headroom).
  13. High-level architecture and trade-offs
    • Present the component diagram verbally and discuss major trade-offs (latency vs availability, complexity vs robustness, cost vs performance).
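
For task 12, a back-of-the-envelope sizing model can be written as a few lines of arithmetic. The sketch below is illustrative only: every input (tenant count, objects per tenant, bytes per object, retained versions, peak QPS, cache hit rate, per-node throughput, replication factor, headroom) is an assumed placeholder rather than a figure from the prompt, and the names `SizingInputs` and `size` are hypothetical. Percentages are kept as integers so the arithmetic stays exact.

```python
from dataclasses import dataclass


@dataclass
class SizingInputs:
    # All defaults are assumed placeholders; replace them with agreed numbers.
    tenants: int = 10_000
    objects_per_tenant: int = 50_000
    bytes_per_object: int = 2_000
    versions_retained: int = 5
    peak_read_qps: int = 200_000
    cache_hit_percent: int = 95
    node_read_qps: int = 5_000
    replication_factor: int = 3
    headroom_percent: int = 50  # spare capacity to keep at peak


def size(inp: SizingInputs) -> dict:
    logical_bytes = (
        inp.tenants
        * inp.objects_per_tenant
        * inp.bytes_per_object
        * inp.versions_retained
    )
    # Every replica stores a full copy of the logical data.
    replicated_bytes = logical_bytes * inp.replication_factor
    # Only cache misses reach the storage tier.
    storage_read_qps = inp.peak_read_qps * (100 - inp.cache_hit_percent) // 100
    # Each node is planned to run at (100 - headroom)% of its rated throughput.
    usable_node_qps = inp.node_read_qps * (100 - inp.headroom_percent) // 100
    # Ceiling division: round the node count up.
    read_nodes = -(-storage_read_qps // usable_node_qps)
    return {
        "logical_bytes": logical_bytes,
        "replicated_bytes": replicated_bytes,
        "storage_read_qps": storage_read_qps,
        "read_nodes": read_nodes,
    }


estimate = size(SizingInputs())
```

With the placeholder defaults this works out to 5 TB of logical metadata (decimal terabytes), 15 TB after 3x replication, 10,000 storage-tier reads per second at peak, and 4 read-serving nodes. The value of the model in an interview is less the specific numbers than showing which inputs dominate: here the cache hit rate and the headroom target move the node count far more than storage does.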
