Discuss productionizing an event-timeout detector
Company: Applied Intuition
Role: Software Engineer
Category: System Design
Difficulty: hard
Interview Round: Technical Screen
Define and evaluate a production-ready architecture for the event-timeout detector. Specify the external API and data model (schemas, idempotency keys), time semantics (processing vs event time), and handling of clock skew. Describe sharding strategy by event_id, state storage choice, consistency model for reads, and failure recovery. Discuss performance optimizations (e.g., timer wheels vs heaps/LRU, batching), backpressure, and memory limits. Enumerate production edge cases (late/dropped/duplicated/reordered messages, very large timeouts, long silences, restarts) and how you would test, monitor, and alert.
Quick Answer: This question evaluates system-design and distributed-systems competencies, specifically the ability to architect a production-ready event-timeout detector with attention to time semantics, partitioning and state management, consistency and recovery, performance, and operational observability.
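
To make the discussion concrete, here is a minimal single-shard sketch in Python of one way the core could look. It is an illustration under stated assumptions, not a reference solution. The prompt does not define the detector's contract, so the sketch assumes a "start" message that arms a timeout for an event_id and a "finish" message that cancels it. The names TimeoutDetector, start, finish, and allowed_lateness are hypothetical. The sketch uses a min-heap with lazy deletion (a timer wheel is the alternative the prompt mentions), drives expiry from an event-time watermark rather than the wall clock, and drops duplicates by idempotency key.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class TimeoutDetector:
    """Hypothetical single-shard detector; every name here is illustrative."""

    # Slack subtracted from the highest event time seen, to absorb
    # clock skew and mild reordering (an assumed policy).
    allowed_lateness: float = 0.0
    _deadlines: dict = field(default_factory=dict)  # event_id -> deadline
    _heap: list = field(default_factory=list)       # (deadline, event_id)
    _seen: set = field(default_factory=set)         # idempotency keys (unbounded here)
    _max_event_time: float = float("-inf")

    def start(self, key, event_id, event_time, timeout):
        """Arm a timeout for event_id; returns event_ids that expired."""
        if key in self._seen:  # duplicate delivery: ignore
            return []
        self._seen.add(key)
        deadline = event_time + timeout
        self._deadlines[event_id] = deadline
        heapq.heappush(self._heap, (deadline, event_id))
        return self._advance(event_time)

    def finish(self, key, event_id, event_time):
        """Cancel the timeout if the finish is on time; returns expirations."""
        if key in self._seen:
            return []
        self._seen.add(key)
        deadline = self._deadlines.get(event_id)
        if deadline is not None and event_time <= deadline:
            # Lazy deletion: the heap entry stays and is skipped later.
            del self._deadlines[event_id]
        return self._advance(event_time)

    def _advance(self, event_time):
        """Move the watermark forward and pop every deadline behind it."""
        self._max_event_time = max(self._max_event_time, event_time)
        watermark = self._max_event_time - self.allowed_lateness
        expired = []
        while self._heap and self._heap[0][0] < watermark:
            deadline, event_id = heapq.heappop(self._heap)
            # Skip stale entries that were cancelled or re-armed.
            if self._deadlines.get(event_id) == deadline:
                del self._deadlines[event_id]
                expired.append(event_id)
        return expired


detector = TimeoutDetector(allowed_lateness=1.0)
detector.start("k1", "a", event_time=0.0, timeout=5.0)
detector.start("k1", "a", event_time=0.0, timeout=5.0)  # duplicate, ignored
detector.start("k2", "b", event_time=1.0, timeout=5.0)
detector.finish("k3", "a", event_time=4.0)               # "a" finishes in time
timed_out = detector.start("k4", "c", event_time=10.0, timeout=100.0)
print(timed_out)
```

In this sketch the watermark only moves when a message arrives, so a long silence would never fire a timeout; a production version would presumably also advance it from a periodic tick. The idempotency-key set grows without bound and would need a TTL or size cap, and the in-memory state would need to be checkpointed or rebuilt from a log to survive restarts.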