Compute active ad revenue by creation source
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: easy
Interview Round: Onsite
You work on an ads platform and need to report **active ad revenue** broken down by the ad’s **creation source**.
## Tables
### `ads`
- `ad_id` BIGINT **PK**
- `advertiser_id` BIGINT
- `creation_source` VARCHAR
*Examples:* `'SELF_SERVE'`, `'MANAGED'`, `'API'`
- `created_at` TIMESTAMP
### `ad_impressions`
- `impression_id` BIGINT **PK**
- `ad_id` BIGINT **FK → ads.ad_id**
- `impression_time` TIMESTAMP
### `ad_revenue_events`
- `event_id` BIGINT **PK**
- `ad_id` BIGINT **FK → ads.ad_id**
- `event_time` TIMESTAMP
- `revenue_usd` DECIMAL(18,6)
*(Assume revenue is recorded at the time it is earned, e.g., per impression/click.)*
## Definitions / assumptions
- Timezone: **UTC**.
- A paid ad is **active on a day** if it has **≥ 1 impression** that day.
- **Active ad revenue on a day** = sum of `revenue_usd` from `ad_revenue_events` that occurred that day **for ads that are active that same day**.
## Task
Write a SQL query to compute, for each calendar day in a given date range (e.g., `:start_date` to `:end_date`, inclusive):
- `report_date`
- `creation_source`
- `active_ads` (count of distinct active `ad_id`)
- `active_ad_revenue_usd` (sum of revenue for active ads)
Order results by `report_date`, then `creation_source`.
Quick Answer: This question evaluates proficiency in data manipulation and analytics, testing skills such as joining event and impression tables, aggregating time-series revenue, and counting distinct active ads broken down by creation source.