Design a DAU/MAU metrics system
Company: Vanta
Role: Software Engineer
Category: System Design
Difficulty: medium
Interview Round: Onsite
##### Question
Design a system that computes and serves product engagement metrics — **DAU** (Daily Active Users) and **MAU** (Monthly Active Users) — for a large consumer application.
An "active user" is a distinct user who performed at least one qualifying event (e.g., app open, page view, login, session start) within the time window.
Your design should address:
1. **Metric definitions.** DAU is the number of unique active users per calendar day. For MAU, choose and justify one interpretation — calendar-month uniques or a rolling 30-day window — and explain the trade-off with the other.
2. **Event ingestion and data modeling.** Accept activity events from both web and mobile clients. Specify the required event fields.
3. **De-duplication and identity.** Handle duplicate events; define the canonical user key across `user_id`, `device_id`/`anonymous_id`, and logged-out users.
4. **Accurate DAU/MAU computation.** Address time-zone/day boundaries, late-arriving events, out-of-order events, and backfills.
5. **Storage and compute choices at large scale.** Justify your streaming vs. batch and exact vs. approximate counting decisions.
6. **Serving queries and dashboards.** Support near-real-time dashboards for product and leadership teams plus historical queries by date range, platform, country, and app version. Discuss latency, caching, and correctness guarantees.
7. **Monitoring, data quality, failure handling, and privacy.** Cover anomaly detection, reconciliation, fault tolerance, and PII/retention controls.
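As a concrete reference point for items 1, 3, and 4, the following is a minimal in-memory sketch of the exact definitions, not a production design. It assumes a hypothetical event shape `(event_id, user_key, epoch_seconds)`, assumes day boundaries are UTC, and de-duplicates on `event_id`; the function names are illustrative only. Because events are bucketed by their own timestamp rather than arrival time, out-of-order and late events land in the correct day.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta, timezone


def daily_actives(events):
    """Map each UTC calendar day to the set of distinct user keys seen that day.

    `events` is an iterable of (event_id, user_key, epoch_seconds) tuples.
    This tuple shape is an assumption made for illustration.
    """
    seen_event_ids = set()
    by_day = defaultdict(set)
    for event_id, user_key, ts in events:
        if event_id in seen_event_ids:  # drop retries / duplicate deliveries
            continue
        seen_event_ids.add(event_id)
        # Bucket by event time (UTC), not arrival time.
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        by_day[day].add(user_key)
    return by_day


def dau(by_day, day):
    """Distinct active users on one calendar day."""
    return len(by_day.get(day, set()))


def rolling_mau(by_day, end_day, window_days=30):
    """Distinct users over the window_days days ending on end_day (inclusive)."""
    users = set()
    for offset in range(window_days):
        users |= by_day.get(end_day - timedelta(days=offset), set())
    return len(users)


def calendar_mau(by_day, year, month):
    """Distinct users over one calendar month."""
    users = set()
    for day, day_users in by_day.items():
        if day.year == year and day.month == month:
            users |= day_users
    return len(users)
```

Note that neither MAU variant is the sum of the daily counts: a user active on several days must be counted once, which is why the sketch keeps per-day sets and unions them.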
Quick Answer: This Vanta software-engineer system-design question asks you to design a scalable pipeline that computes and serves DAU and MAU engagement metrics for a large consumer app. It tests event ingestion, identity and de-duplication, exact vs. approximate distinct counting (HyperLogLog/bitmaps), handling of time zones and late/out-of-order events, streaming-vs-batch correctness, dashboard serving, and monitoring/privacy.
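To illustrate the bitmap option mentioned above, here is a small sketch of why per-day bitmaps are a convenient storage unit: they merge with a bitwise OR, so a monthly or rolling-window count can be derived from stored daily aggregates without rescanning raw events. It assumes each user has already been assigned a dense integer index (a hypothetical mapping not shown here) and uses a plain Python integer as the bitmap; a real system would more likely use a compressed bitmap or a HyperLogLog sketch.

```python
def make_bitmap(user_indexes):
    """Build a bitmap with one bit set per active user index."""
    bitmap = 0
    for idx in user_indexes:
        bitmap |= 1 << idx  # setting the same bit twice is a no-op
    return bitmap


def count_users(bitmap):
    """Exact distinct count: the number of set bits."""
    return bin(bitmap).count("1")


def merge(bitmaps):
    """Union several daily bitmaps into one window-level bitmap."""
    merged = 0
    for bitmap in bitmaps:
        merged |= bitmap
    return merged


# Two days of activity; user index 2 is active on both days.
day_1 = make_bitmap([0, 1, 2, 2])
day_2 = make_bitmap([2, 3])
window = merge([day_1, day_2])
```

Here `count_users(day_1)` is 3, `count_users(day_2)` is 2, and `count_users(window)` is 4 rather than 5, because the user active on both days is counted once. The same merge-then-count pattern applies to HyperLogLog sketches, with an approximate result in exchange for much smaller state.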