Design a cache for DAG-based query views
Company: Snowflake
Role: Software Engineer
Category: System Design
Difficulty: Hard
Interview Round: Onsite
Your analytics system computes materialized query views where each view depends on other views, forming a DAG. Design a caching strategy to reduce query latency. Specify: what to cache (granularity: base tables, partial aggregates, whole views), where to cache (in-memory, local disk, distributed cache), cache keys and versioning, eviction policy, freshness SLAs, and consistency/invalidation when base data updates. Explain how updates propagate through the DAG (e.g., incremental vs. full recomputation), how to handle partial failures, backfills, and hot keys, and how you would monitor correctness and performance.
Quick Answer: This question tests a candidate's understanding of caching, consistency, versioning, eviction, and update propagation in DAG-based materialized views. It draws on distributed systems, data engineering, and storage-hierarchy design.
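One possible way to approach the "cache keys and versioning" and "invalidation when base data updates" parts of the prompt is to derive each view's cache key from the keys of its inputs, so that a new base-table version changes the key of every downstream view without an explicit invalidation message. The sketch below is illustrative only: the view names, the `DEPS` mapping, the integer base-table versions, and the helper functions `cache_keys` and `stale_nodes` are all assumptions made for the example, not part of the question or of any Snowflake API.

```python
import hashlib
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical DAG: each node maps to the nodes it reads from.
# Nodes with no inputs stand in for base tables.
DEPS = {
    "orders": [],
    "customers": [],
    "daily_revenue": ["orders"],
    "revenue_by_region": ["daily_revenue", "customers"],
}


def cache_keys(deps, base_versions):
    """Return {node: key}; each key hashes the node name plus its inputs' keys."""
    keys = {}
    # static_order() yields a node only after all of its dependencies.
    for node in TopologicalSorter(deps).static_order():
        if deps[node]:
            parts = [keys[d] for d in sorted(deps[node])]
        else:
            # Base table: the key depends on its (assumed) version counter.
            parts = [str(base_versions[node])]
        digest = hashlib.sha256("|".join([node, *parts]).encode()).hexdigest()
        keys[node] = f"{node}:{digest[:12]}"
    return keys


def stale_nodes(old_keys, new_keys):
    """Nodes whose key changed, i.e. whose cached result can no longer be reused."""
    return {n for n in new_keys if old_keys.get(n) != new_keys[n]}


before = cache_keys(DEPS, {"orders": 1, "customers": 1})
after = cache_keys(DEPS, {"orders": 2, "customers": 1})
print(sorted(stale_nodes(before, after)))
# ['daily_revenue', 'orders', 'revenue_by_region']
```

Under these assumptions, bumping the version of `orders` changes the keys of `orders`, `daily_revenue`, and `revenue_by_region`, while `customers` keeps its key and its cached entry stays reusable. Old entries are never overwritten, so readers pinned to the previous version still see a consistent snapshot until eviction removes it. The trade-off is that this scheme always treats downstream views as stale; an incremental-recomputation design would additionally need per-view delta logic, which this sketch does not attempt.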