Write SQL for revenue and advertiser analyses
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Onsite
Use the schema below and ANSI SQL. Treat “today” as 2025-09-01.
Schema:
- active_ads(date DATE, ad_id INT, advertiser_id INT, creation_source VARCHAR, revenue DECIMAL(12,2))
- advertiser_info(advertiser_id INT, advertiser_name VARCHAR, advertiser_country VARCHAR)
Sample rows (minimal, for clarity):
active_ads
date | ad_id | advertiser_id | creation_source | revenue
2025-08-30 | 101 | 1 | web | 120.00
2025-08-30 | 102 | 1 | api | 80.00
2025-08-31 | 103 | 2 | web | 0.00
2025-09-01 | 104 | 3 | mobile | 200.00
2025-09-01 | 105 | 2 | web | 50.00
advertiser_info
advertiser_id | advertiser_name | advertiser_country
1 | Acme | US
2 | Globex | CA
3 | Initech | US
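To try the tasks locally, the schema and sample rows above can be loaded into a scratch database. The sketch below uses Python's built-in sqlite3 module purely as a harness (an assumption; the question does not name an engine), with column types copied from the schema as given.

```python
import sqlite3

# In-memory SQLite database used only as a scratch harness (assumed engine).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE active_ads (
    date            DATE,
    ad_id           INT,
    advertiser_id   INT,
    creation_source VARCHAR,
    revenue         DECIMAL(12,2)
);
CREATE TABLE advertiser_info (
    advertiser_id      INT,
    advertiser_name    VARCHAR,
    advertiser_country VARCHAR
);
INSERT INTO active_ads VALUES
    ('2025-08-30', 101, 1, 'web',    120.00),
    ('2025-08-30', 102, 1, 'api',     80.00),
    ('2025-08-31', 103, 2, 'web',      0.00),
    ('2025-09-01', 104, 3, 'mobile', 200.00),
    ('2025-09-01', 105, 2, 'web',     50.00);
INSERT INTO advertiser_info VALUES
    (1, 'Acme',    'US'),
    (2, 'Globex',  'CA'),
    (3, 'Initech', 'US');
""")

# Sanity check: row counts should match the sample tables.
ads_count = conn.execute("SELECT COUNT(*) FROM active_ads").fetchone()[0]
info_count = conn.execute("SELECT COUNT(*) FROM advertiser_info").fetchone()[0]
print(ads_count, info_count)  # prints: 5 3
```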
Tasks:
1) Daily revenue by creation_source: Return date, creation_source, daily_revenue where daily_revenue = SUM(revenue) over all ads for that source on that date. Order by date, creation_source. Include only dates present in active_ads.
2) Ten countries with the fewest active advertisers in the last 30 days: Define an “active advertiser” as one with at least one active_ads row with revenue > 0 and date between 2025-08-03 and 2025-09-01 inclusive. Return the 10 countries (from advertiser_info) with the smallest count of distinct active advertisers, including countries with zero active advertisers (show count 0). Break ties by advertiser_country alphabetically. Output columns: advertiser_country, active_advertiser_count.
3) Growth proportion by creation_source: For each creation_source, compute the proportion of advertisers whose spend (total revenue for that source) in 2025-01-01..2025-09-01 exceeds their spend in 2024-01-01..2024-09-01 by at least 1000. Denominator = number of advertisers who had any spend (> 0) in either period for that same creation_source. Output: creation_source, num_grew_by_1000, denom, proportion (rounded to 4 decimals). Provide a single query (CTEs allowed) that treats missing-period spend as 0, so that advertisers present in only one of the two periods are handled.
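Tasks 1 and 2 can be sketched as follows, run against the sample rows through Python's sqlite3 module (an assumed harness, not part of the question). Two deviations from strict ANSI SQL are made for SQLite: dates are compared as ISO-8601 strings rather than DATE literals, and LIMIT 10 stands in for FETCH FIRST 10 ROWS ONLY. The advertiser 'Umbrella' in 'MX' is a hypothetical row added only to exercise the zero-count case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # scratch harness (assumed engine: SQLite)
conn.executescript("""
CREATE TABLE active_ads (date DATE, ad_id INT, advertiser_id INT,
                         creation_source VARCHAR, revenue DECIMAL(12,2));
CREATE TABLE advertiser_info (advertiser_id INT, advertiser_name VARCHAR,
                              advertiser_country VARCHAR);
INSERT INTO active_ads VALUES
    ('2025-08-30', 101, 1, 'web',    120.00),
    ('2025-08-30', 102, 1, 'api',     80.00),
    ('2025-08-31', 103, 2, 'web',      0.00),
    ('2025-09-01', 104, 3, 'mobile', 200.00),
    ('2025-09-01', 105, 2, 'web',     50.00);
INSERT INTO advertiser_info VALUES
    (1, 'Acme', 'US'), (2, 'Globex', 'CA'), (3, 'Initech', 'US'),
    (4, 'Umbrella', 'MX');  -- hypothetical row: country with no active advertisers
""")

# Task 1: one row per (date, creation_source) pair that appears in active_ads.
TASK_1 = """
SELECT date, creation_source, SUM(revenue) AS daily_revenue
FROM active_ads
GROUP BY date, creation_source
ORDER BY date, creation_source
"""

# Task 2: start from advertiser_info so that countries without any active
# advertiser survive the LEFT JOIN and are counted as 0.
TASK_2 = """
WITH active AS (
    SELECT DISTINCT advertiser_id
    FROM active_ads
    WHERE revenue > 0
      AND date BETWEEN '2025-08-03' AND '2025-09-01'
)
SELECT i.advertiser_country,
       COUNT(DISTINCT a.advertiser_id) AS active_advertiser_count
FROM advertiser_info AS i
LEFT JOIN active AS a ON a.advertiser_id = i.advertiser_id
GROUP BY i.advertiser_country
ORDER BY active_advertiser_count, i.advertiser_country
LIMIT 10  -- ANSI form: FETCH FIRST 10 ROWS ONLY
"""

daily_revenue = conn.execute(TASK_1).fetchall()
fewest_active = conn.execute(TASK_2).fetchall()
for row in daily_revenue:
    print(row)
for row in fewest_active:
    print(row)
```

Filtering on revenue > 0 inside the CTE, rather than in the outer WHERE clause, matters here: a filter on the right-hand table after the LEFT JOIN would silently drop the zero-count countries.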
Quick Answer: This question tests SQL data manipulation and analytics skills: aggregation, joins, date-range filtering, distinct counts, handling missing-period values, and proportion calculations.
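The missing-period handling in Task 3 can be sketched with conditional aggregation: each period is summed separately per (creation_source, advertiser_id), and a period with no rows naturally sums to 0, so no FULL OUTER JOIN is needed. This is one possible approach rather than the expected answer. It assumes that spend means SUM(revenue), that revenue is never negative, and that "any spend" means a period total above 0. The rows are hypothetical test data (the sample rows contain no 2024 dates), and Python's sqlite3 module is only an assumed harness, hence string dates instead of DATE literals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # scratch harness (assumed engine: SQLite)
conn.execute("""CREATE TABLE active_ads (date DATE, ad_id INT, advertiser_id INT,
                creation_source VARCHAR, revenue DECIMAL(12,2))""")
# Hypothetical rows chosen to cover each edge case.
conn.executemany("INSERT INTO active_ads VALUES (?, ?, ?, ?, ?)", [
    ("2024-03-01", 201, 1, "web", 500),    # adv 1 web: 500 -> 1600, grew by 1100
    ("2025-03-01", 202, 1, "web", 1600),
    ("2025-04-01", 203, 2, "web", 1000),   # adv 2 web: 2025 only, grew by exactly 1000
    ("2024-05-01", 204, 3, "web", 2000),   # adv 3 web: 2024 only, did not grow
    ("2025-06-01", 205, 4, "web", 0),      # adv 4 web: zero spend, not in denominator
    ("2024-02-01", 206, 1, "api", 100),    # adv 1 api: 100 -> 900, grew by only 800
    ("2025-02-01", 207, 1, "api", 900),
    ("2024-12-15", 208, 2, "api", 5000),   # outside both periods, ignored
])

TASK_3 = """
WITH spend AS (
    -- Per-period totals; rows outside a period contribute 0 to that period.
    SELECT creation_source,
           advertiser_id,
           SUM(CASE WHEN date BETWEEN '2025-01-01' AND '2025-09-01'
                    THEN revenue ELSE 0 END) AS spend_2025,
           SUM(CASE WHEN date BETWEEN '2024-01-01' AND '2024-09-01'
                    THEN revenue ELSE 0 END) AS spend_2024
    FROM active_ads
    GROUP BY creation_source, advertiser_id
),
eligible AS (
    -- Denominator population: positive spend in at least one period.
    SELECT creation_source,
           CASE WHEN spend_2025 - spend_2024 >= 1000 THEN 1 ELSE 0 END AS grew
    FROM spend
    WHERE spend_2025 > 0 OR spend_2024 > 0
)
SELECT creation_source,
       SUM(grew) AS num_grew_by_1000,
       COUNT(*)  AS denom,
       ROUND(1.0 * SUM(grew) / COUNT(*), 4) AS proportion
FROM eligible
GROUP BY creation_source
ORDER BY creation_source  -- ordering is not required by the task; added for stable output
"""

growth = conn.execute(TASK_3).fetchall()
for row in growth:
    print(row)
```

Multiplying by 1.0 avoids integer division in engines that truncate integer ratios. A creation_source with no eligible advertisers does not appear in the output, which also sidesteps division by zero.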