Compute DAU and rolling MAU with zero days
Company: Glean
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: easy
Interview Round: Technical Screen
You have two tables in PostgreSQL:
### Tables
**`users`**
- `user_id` (STRING / INT, PK)
- `signup_date` (DATE)
**`logins`**
- `user_id` (STRING / INT, FK → `users.user_id`)
- `browser` (TEXT)
- `login_ts` (TIMESTAMP) — assume UTC unless stated otherwise
### Tasks
1) **Daily Active Users (DAU)**: For a given date range (e.g., `2023-01-01` to `2023-12-31`), compute DAU where DAU on a day = number of distinct `user_id` that logged in on that calendar day.
- Output columns: `date`, `dau`
- Requirement: **Include all dates in the range**, even if there were zero logins that day (show `dau = 0`).
- Follow-up: If you need a date dimension/calendar, explain how you would create it in PostgreSQL (e.g., via `generate_series`).
2) **Rolling MAU by date**: Compute a **rolling-window MAU** for each calendar date in the range.
- Define MAU as **distinct users who logged in in the last 30 days including the current date** (an L30D lookback).
- Output columns: `date`, `mau_l30d`
- Follow-up discussion: If the product is primarily used on weekdays (little/no weekend usage), what are potential issues with an L30D definition, and what alternative window definitions could be more appropriate?
Quick Answer: This question evaluates a data scientist's proficiency in time-series analytics and data aggregation, focusing on computing daily active users and a 30-day rolling MAU from user and login tables, and falls under the Data Manipulation (SQL/Python) domain.