Design an ads event reporting system
Company: Pinterest
Role: Software Engineer
Category: System Design
Difficulty: Medium
Interview Round: Onsite
Design an **ads event reporting system** that collects user-ad interaction events and serves aggregated metrics.
### Requirements
1. **Ingest events** from multiple platforms (web/mobile/server):
- `impression`, `click`, `conversion` (extendable to more event types)
2. **Event schema** includes at minimum:
- `event_id` (unique), `timestamp` (event time), `ingest_time`, `user_id` (or anonymized id), `ad_id`, `campaign_id`, `event_type`, plus optional attributes (geo, device, app version).
3. **Reporting queries**:
- Time-series aggregates (e.g., per minute/hour/day)
- Group-by dimensions: campaign/ad, geo, device
- **User cohort / segment partitioning** (e.g., users grouped by country, or an offline-defined cohort id)
4. **Correctness and reliability**:
- Handle duplicates (retries), out-of-order and late events
- Support backfills and reprocessing
5. **Latency goals** (assume):
- Near-real-time dashboards (e.g., p95 event-to-dashboard latency within 1–5 minutes)
- Accurate daily finalized reports
6. **Scale goals** (assume):
- 50k–500k events/sec peak
- Store raw events for 30–90 days; aggregates longer
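The scale goals above translate into a raw-storage budget with simple arithmetic. The sketch below is a back-of-envelope estimate only: the 200-byte average event size is an assumption (the problem does not give one), the function name `raw_storage_tb` is hypothetical, and the result ignores compression and replication. The upper figure also treats the peak rate as sustained, so it is a ceiling rather than a forecast.

```python
SECONDS_PER_DAY = 86_400


def raw_storage_tb(events_per_sec: int, bytes_per_event: int, retention_days: int) -> float:
    """Uncompressed, unreplicated raw-event storage in decimal terabytes."""
    total_bytes = events_per_sec * SECONDS_PER_DAY * retention_days * bytes_per_event
    return total_bytes / 1e12


# Assumption: ~200 bytes per serialized event (not stated in the problem).
BYTES_PER_EVENT = 200

low = raw_storage_tb(50_000, BYTES_PER_EVENT, 30)    # low rate, short retention
high = raw_storage_tb(500_000, BYTES_PER_EVENT, 90)  # peak rate sustained, long retention

print(f"low estimate:  {low:.1f} TB")   # 25.9 TB
print(f"high estimate: {high:.1f} TB")  # 777.6 TB
```

Under these assumptions the raw store lands somewhere between tens and hundreds of terabytes, which is one reason to partition raw events by time and keep only compact aggregates beyond the retention window.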
### Deliverables
Propose an end-to-end architecture, storage layout/partitioning, aggregation strategy, and query APIs, and explain how you ensure deduplication and correctness when events arrive late or out of order. Include key tradeoffs.
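One possible way to reason about deduplication and late events is a per-minute, event-time aggregator with an allowed-lateness bound. The following is a minimal in-memory sketch, not a reference solution: the class `MinuteAggregator`, the 300-second lateness default, and the "watermark = maximum event time seen" rule are all illustrative assumptions. A production design would typically keep this state in a stream processor or a TTL-bounded store rather than an unbounded Python set.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str     # unique per logical event; client retries reuse it
    timestamp: int    # event time, epoch seconds
    campaign_id: str
    event_type: str   # "impression", "click", "conversion", ...


class MinuteAggregator:
    """Counts events per (minute bucket, campaign, event type) in event time."""

    def __init__(self, allowed_lateness_s: int = 300):
        self.allowed_lateness_s = allowed_lateness_s
        self.seen_ids = set()            # dedup state (unbounded here; assumption)
        self.counts = defaultdict(int)   # (bucket_start, campaign_id, event_type) -> count
        self.watermark = 0               # highest event time observed so far
        self.too_late = []               # left for a batch/backfill path

    def process(self, e: Event) -> str:
        # Retries carry the same event_id, so they are dropped here.
        if e.event_id in self.seen_ids:
            return "duplicate"
        self.seen_ids.add(e.event_id)

        self.watermark = max(self.watermark, e.timestamp)
        bucket = e.timestamp - e.timestamp % 60

        # A bucket closes once the watermark passes its end plus the lateness bound.
        if bucket + 60 + self.allowed_lateness_s <= self.watermark:
            self.too_late.append(e)
            return "too_late"

        self.counts[(bucket, e.campaign_id, e.event_type)] += 1
        return "counted"


agg = MinuteAggregator(allowed_lateness_s=300)
results = [
    agg.process(Event("a", 30, "c1", "click")),    # first event in bucket 0
    agg.process(Event("a", 30, "c1", "click")),    # retry of the same event
    agg.process(Event("b", 600, "c1", "click")),   # advances the watermark to 600
    agg.process(Event("c", 550, "c1", "click")),   # out of order, bucket 540 still open
    agg.process(Event("d", 45, "c1", "click")),    # bucket 0 closed at watermark 360
]
print(results)
# ['counted', 'duplicate', 'counted', 'counted', 'too_late']
```

The tradeoff this sketch makes visible: a larger lateness bound keeps more buckets open and improves real-time accuracy at the cost of state size, while events that miss the bound still need a batch reprocessing path to make the daily finalized reports accurate.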
Quick Answer: This question evaluates a candidate's ability to design scalable, reliable event ingestion and analytics systems. It tests distributed data pipelines, stream and batch aggregation, storage partitioning, API/query design, and correctness concerns such as deduplication and handling of late or out-of-order events.