
Design cache for DAG-based query views

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's understanding of caching, consistency, versioning, eviction, and update propagation in DAG-based materialized views, testing competencies in distributed systems, data engineering, and storage-hierarchy design.



Company: Snowflake

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

Your analytics system computes materialized query views where each view depends on other views, forming a DAG. Design a caching strategy to reduce query latency. Specify: what to cache (granularity: base tables, partial aggregates, whole views), where to cache (in-memory, local disk, distributed cache), cache keys and versioning, eviction policy, freshness SLAs, and consistency/invalidation when base data updates. Explain how updates propagate through the DAG (e.g., incremental vs. full recomputation), how to handle partial failures, backfills, and hot keys, and how you would monitor correctness and performance.


Sep 6, 2025, 12:00 AM

System Design: Caching Strategy for a DAG of Materialized Views

Context

You are designing an analytics system that computes materialized query views. Views depend on other views (and base tables), forming a directed acyclic graph (DAG). The goal is to reduce query latency while maintaining correctness and controllable freshness.

Assume:

  • Base data lives in durable storage and is updated via micro-batches or streaming change data capture (CDC).
  • Queries commonly read recent time windows and hot dimensions.
  • The system is multi-tenant and runs on a fleet of compute nodes with local memory and disk; a shared distributed cache is available.
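
To make the DAG in the context above concrete, here is a minimal sketch of view lineage and of finding which views are affected when a base table changes. The view names and the `DEPENDS_ON` mapping are hypothetical, chosen only for illustration; the ordering uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical lineage: each view maps to the inputs it reads from.
# "orders" and "refunds" stand in for base tables.
DEPENDS_ON = {
    "daily_sales": {"orders"},
    "daily_refunds": {"refunds"},
    "net_revenue": {"daily_sales", "daily_refunds"},
    "exec_dashboard": {"net_revenue"},
}


def affected_views(changed, depends_on):
    """Return the views downstream of `changed`, inputs before dependents."""
    # static_order() yields every node after all of its inputs.
    order = TopologicalSorter(depends_on).static_order()
    dirty = set(changed)
    result = []
    for node in order:
        if node in dirty:
            continue
        # A view is affected if any of its direct inputs is already dirty.
        if depends_on.get(node, set()) & dirty:
            dirty.add(node)
            result.append(node)
    return result


print(affected_views({"orders"}, DEPENDS_ON))
```

Because the walk follows topological order, each affected view appears only after all of its affected inputs, which is the order in which recomputation or invalidation would have to run.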

Task

Design a caching strategy for this system. Specify:

  1. What to cache
    • Choose cache granularity among: base tables (blocks/columns), partial aggregates (intermediate DAG nodes), and full view results. Explain trade-offs.
  2. Where to cache
    • In-memory (per-node), local SSD, and/or distributed cache. Propose a multi-tier policy.
  3. Cache keys and versioning
    • Define keys so cached entries are uniquely and correctly identified. Include versioning that ties entries to specific input data snapshots.
  4. Eviction policy
    • Specify the algorithm(s), admission control, quotas, and safeguards against thrash.
  5. Freshness SLAs
    • Define staleness guarantees (e.g., strong consistency vs. bounded staleness) and how queries select acceptable cached versions.
  6. Consistency and invalidation on updates
    • How caches are invalidated/updated when base data changes. Describe mechanisms to avoid stale or inconsistent joins across views.
  7. Update propagation through the DAG
    • Explain incremental vs. full recomputation, ordering, and how lineage is tracked.
  8. Failure handling
    • Partial failures, retries, circuit breakers, and fallback behavior.
  9. Backfills
    • Strategy for large historical recomputations without disrupting hot-path queries.
  10. Hot keys and load shedding
    • Handling hotspots, thundering herds, and skew.
  11. Monitoring and validation
    • Metrics and techniques to verify correctness and performance.
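
As an illustration of item 3 (cache keys and versioning), one possible approach is to derive the key from the view identity, a hash of its definition, and the snapshot versions of its inputs. This is a sketch under assumptions, not a prescribed answer: the function name `cache_key`, the field names, and the integer snapshot versions are all hypothetical.

```python
import hashlib
import json


def cache_key(view_id, definition_sql, input_versions, params=None):
    """Build a deterministic key tied to specific input snapshots.

    input_versions: hypothetical mapping of input name -> snapshot version.
    params: optional query parameters (e.g. a time window) that change the result.
    """
    payload = json.dumps(
        {
            "view": view_id,
            # Hash the definition so a changed query never reuses old entries.
            "definition": hashlib.sha256(definition_sql.encode()).hexdigest(),
            # Sort inputs so dict insertion order does not affect the key.
            "inputs": sorted(input_versions.items()),
            "params": params or {},
        },
        sort_keys=True,
    )
    digest = hashlib.sha256(payload.encode()).hexdigest()[:16]
    return f"{view_id}:{digest}"


sql = "SELECT day, SUM(amount) FROM orders GROUP BY day"
k1 = cache_key("daily_sales", sql, {"orders": 41})
k2 = cache_key("daily_sales", sql, {"orders": 42})
print(k1 != k2)  # a new input snapshot yields a new key
```

With keys of this shape, an update to a base table never overwrites an existing entry; it produces new keys, and entries for older snapshots can age out through the eviction policy.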

Provide a step-by-step, implementation-oriented design with examples where helpful.
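
As one example of the kind of illustration an answer might include, the sketch below shows how a query could select an acceptable cached version under a bounded-staleness SLA (item 5). The `CachedVersion` type, the `pick_version` function, and the use of a snapshot commit timestamp are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CachedVersion:
    key: str
    snapshot_ts: float  # commit time of the input snapshot, in seconds


def pick_version(
    candidates: Sequence[CachedVersion], now: float, max_staleness_s: float
) -> Optional[CachedVersion]:
    """Return the freshest cached version within the staleness bound.

    Returns None when no candidate qualifies, which signals the caller to
    recompute or to read from a fresher tier.
    """
    fresh = [c for c in candidates if now - c.snapshot_ts <= max_staleness_s]
    return max(fresh, key=lambda c: c.snapshot_ts, default=None)


versions = [CachedVersion("v41", 100.0), CachedVersion("v42", 160.0)]
print(pick_version(versions, now=200.0, max_staleness_s=60.0))
print(pick_version(versions, now=200.0, max_staleness_s=30.0))
```

The first call returns the `v42` entry because it is 40 seconds old, and the second returns `None` because neither entry is within 30 seconds. In this sketch, setting `max_staleness_s` to zero approximates a strong-freshness request: any version older than `now` is rejected.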

