Design Webhooks and Metric Aggregation
Company: Nuro
Role: Backend Engineer
Category: System Design
Difficulty: hard
Interview Round: Onsite
The interview report described two system design rounds in the same category:
1. **Design a webhook delivery platform**
Build a multi-tenant webhook system for external customers.
Internal product services emit business events such as `order.created`, `order.updated`, and `trip.completed`. Customers can register HTTPS endpoints and subscribe to selected event types. The system should:
- accept event publications from internal services
- fan out events to subscribed customer endpoints
- deliver webhooks with low latency
- support retries with exponential backoff on failures
- provide at-least-once delivery semantics
- include payload signing and secret rotation
- help consumers deduplicate repeated deliveries
- expose delivery status, logs, and metrics
- isolate tenants and protect the platform from slow or failing endpoints
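The signing, secret-rotation, and deduplication requirements above can be illustrated with a minimal sketch. It assumes HMAC-SHA256 over the timestamp, a delivery ID, and the raw body. The header names (`X-Webhook-Id`, `X-Webhook-Timestamp`, `X-Webhook-Signature`), the `v1=` prefix, and the 300-second tolerance are hypothetical choices for illustration, not part of the original question.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional, Sequence


def sign(secret: bytes, timestamp: int, delivery_id: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 over timestamp, delivery ID, and raw body."""
    message = f"{timestamp}.{delivery_id}.".encode() + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def build_headers(
    secrets: Sequence[bytes],
    delivery_id: str,
    body: bytes,
    now: Optional[int] = None,
) -> Dict[str, str]:
    """Sign with every active secret so receivers keep verifying during rotation.

    Header names are hypothetical. The delivery ID stays the same across
    retries, so a consumer can use it as a deduplication key.
    """
    ts = int(time.time()) if now is None else now
    signatures = ",".join(f"v1={sign(s, ts, delivery_id, body)}" for s in secrets)
    return {
        "X-Webhook-Id": delivery_id,
        "X-Webhook-Timestamp": str(ts),
        "X-Webhook-Signature": signatures,
    }


def verify(
    secret: bytes,
    headers: Dict[str, str],
    body: bytes,
    now: Optional[int] = None,
    tolerance_s: int = 300,
) -> bool:
    """Consumer-side check: reject stale timestamps, then compare signatures."""
    ts = int(headers["X-Webhook-Timestamp"])
    current = int(time.time()) if now is None else now
    if abs(current - ts) > tolerance_s:
        return False  # too old or too far in the future: possible replay
    expected = "v1=" + sign(secret, ts, headers["X-Webhook-Id"], body)
    candidates = headers["X-Webhook-Signature"].split(",")
    # compare_digest avoids leaking match position through timing
    return any(hmac.compare_digest(expected, c) for c in candidates)
```

During a rotation window the sender would pass both the new and the old secret to `build_headers`, so a consumer holding either one still verifies successfully. Once consumers have switched, the old secret is dropped from the list.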
Discuss APIs, storage, queueing, retry strategy, security, observability, and scaling bottlenecks.
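As one possible starting point for the retry-strategy discussion, the sketch below computes exponential backoff with full jitter and classifies responses as retryable or terminal. The base delay, cap, attempt limit, and the choice of which status codes to retry are assumptions for illustration. The function names are hypothetical.

```python
import random
from typing import Optional, Tuple


def backoff_delay(
    attempt: int,
    base_s: float = 1.0,
    cap_s: float = 3600.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Full-jitter backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    source = rng if rng is not None else random
    exponent = min(attempt, 32)  # bound the exponent so the product stays finite
    ceiling = min(cap_s, base_s * (2 ** exponent))
    return source.uniform(0.0, ceiling)


def should_retry(status_code: Optional[int]) -> bool:
    """Decide whether a failed attempt is worth retrying.

    None stands for a network error or timeout (assumed convention).
    """
    if status_code is None:
        return True
    return status_code in (408, 429) or status_code >= 500


def next_action(
    attempt: int,
    status_code: Optional[int],
    max_attempts: int = 8,
) -> Tuple[str, Optional[float]]:
    """Return ('delivered' | 'retry' | 'dead_letter', delay_seconds_or_None)."""
    if status_code is not None and 200 <= status_code < 300:
        return ("delivered", None)
    if not should_retry(status_code) or attempt + 1 >= max_attempts:
        return ("dead_letter", None)
    return ("retry", backoff_delay(attempt))
```

Jitter spreads retries out so that many deliveries failing at once do not hit a recovering endpoint at the same moment. A delivery that exhausts its attempts goes to a dead-letter state, where it stays visible in delivery logs and can be replayed manually.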
2. **Design a vehicle metrics aggregation system**
Build a telemetry platform for a fleet of connected vehicles. Each vehicle sends measurements such as speed, battery level, GPS location, temperature, and health signals every few seconds.
The system should:
- ingest high-volume telemetry from many vehicles
- store raw events for debugging and historical analysis
- compute rollups such as per-vehicle 1-minute aggregates and fleet-wide hourly aggregates
- support dashboards and queries over recent and historical data
- support alerting for anomalies or threshold violations
- handle late, duplicated, or out-of-order events
- support retention and downsampling policies
Discuss ingestion, stream processing, storage choices, aggregation strategy, query serving, and fault tolerance.
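To make the aggregation strategy concrete, here is a minimal in-memory sketch of per-vehicle 1-minute tumbling windows that tolerates duplicates and out-of-order events. It assumes each event carries a per-vehicle sequence number for deduplication, uses the maximum event time seen so far as a simple global watermark, and drops events that arrive more than a fixed allowed lateness after their window ends. All names and constants are hypothetical. A production design would normally use a stream processor with durable state.

```python
from collections import defaultdict
from dataclasses import dataclass

WINDOW_S = 60             # tumbling window size (1 minute)
ALLOWED_LATENESS_S = 120  # assumed grace period after a window ends


@dataclass
class Rollup:
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


class MinuteAggregator:
    def __init__(self) -> None:
        # (vehicle_id, metric, window_start) -> Rollup
        self.windows = defaultdict(Rollup)
        self.seen = set()      # (vehicle_id, seq, metric) keys already applied
        self.watermark = 0     # highest event time observed so far
        self.dropped_late = 0  # events that arrived after their window closed

    def ingest(self, vehicle_id: str, seq: int, metric: str,
               event_time_s: int, value: float) -> bool:
        """Apply one measurement; return True if it changed a rollup."""
        key = (vehicle_id, seq, metric)
        if key in self.seen:
            return False  # duplicate delivery: ignore
        self.watermark = max(self.watermark, event_time_s)
        window_start = event_time_s - event_time_s % WINDOW_S
        if window_start + WINDOW_S + ALLOWED_LATENESS_S <= self.watermark:
            self.dropped_late += 1  # too late for the streaming rollup
            return False
        self.seen.add(key)
        self.windows[(vehicle_id, metric, window_start)].add(value)
        return True
```

Events rejected as too late would still be written to the raw event store, so a batch job could recompute the affected windows later. Fleet-wide hourly aggregates can be derived by combining the `count`, `total`, `minimum`, and `maximum` fields of the minute rollups, without re-reading raw events.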
Quick Answer: These questions evaluate the ability to design scalable, reliable backend systems, covering multi-tenant event delivery (webhooks), high-throughput telemetry ingestion, stream processing, storage and aggregation, security, and observability.