Analyze advertiser spend by source
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: medium
Interview Round: Onsite
You are given two tables:
**advertisers**
- `advertiser_id` BIGINT
- `advertiser_type` VARCHAR — examples: `smb`, `enterprise`, `agency`, `internal`
- `status` VARCHAR
**ad_spend_daily**
- `spend_date` DATE — UTC calendar date
- `advertiser_id` BIGINT
- `campaign_id` BIGINT
- `ad_id` BIGINT
- `creation_source` VARCHAR — examples: `manual`, `bulk_upload`, `ai_assisted`
- `spend_usd` DECIMAL(18,2)
`ad_spend_daily.advertiser_id` joins to `advertisers.advertiser_id`.
Assumptions:
- Use UTC year boundaries.
- "Last year" means `2023-01-01` through `2023-12-31`.
- "This year" means `2024-01-01` through `2024-12-31`.
- `spend_usd` is the amount spent on that day.
- If `advertiser_type` is `NULL`, keep the advertiser unless it is explicitly excluded.
Write SQL for the following tasks:
1. Identify the cohort of advertisers whose total spend in 2023 was greater than 1000 USD. For that cohort, compute the share of total 2024 platform spend contributed by the cohort. Return one row with these columns:
- `cohort_2024_spend_usd`
- `platform_2024_spend_usd`
- `cohort_2024_spend_share`
2. Build a monthly source-level spend view to investigate whether growth in one ad creation source could be explained by decline in another source. Exclude a supplied list of advertiser types such as `('agency', 'internal')`. Return these columns for each month and source:
- `month_start`
- `creation_source`
- `spend_usd`
- `spend_share_in_month`
- `distinct_advertisers`
The goal of the second query is to support an analysis of whether one source's growth may be offset by another source's decline after filtering out advertiser categories that should not be included.
Quick Answer: This question evaluates proficiency in data manipulation and analytics using SQL or Python, testing skills such as joins, time-based filtering, cohort identification, aggregation, percentage share calculations, and monthly source-level rollups.