Compute CTR overall and by campaign type
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Write SQL to compute: (Q1) overall click-through rate (CTR = clicks/impressions) in the last week; (Q2) CTR by campaign_type in the last week. Assume 'today' is 2025-09-01; define 'last week' as 2025-08-25 00:00:00 to 2025-08-31 23:59:59 (UTC). Deduplicate identical logs by (ad_id, view_id, event) within the window. Prevent division-by-zero. Schema and sample data:
Tables
- ad_events(ad_id INT, campaign_id INT, event STRING, view_id STRING, event_time TIMESTAMP) -- event in {'impression','click'}
- campaigns(campaign_id INT, campaign_type STRING) -- campaign_type in {'direct_response','brand'}
Sample rows (minimal):
ad_events
ad_id | campaign_id | event | view_id | event_time
101 | 1 | impression | v1001 | 2025-08-28 10:00:00
101 | 1 | click | v1001 | 2025-08-28 10:00:05
102 | 2 | impression | v1002 | 2025-08-30 12:00:00
102 | 2 | impression | v1003 | 2025-08-30 12:05:00
103 | 2 | click | v1002 | 2025-08-30 12:00:07
104 | 3 | impression | v1004 | 2025-08-26 08:00:00
campaigns
campaign_id | campaign_type
1 | direct_response
2 | brand
3 | direct_response
Q1 (overall CTR): return a single row with overall_ctr.
Q2 (by type): return campaign_type, clicks, impressions, ctr.
Edge cases: if impressions=0, return ctr=0.0; only count events inside the window; treat multiple logs of the same (ad_id, view_id, event) as one.
Example solution sketch (you may adapt to your SQL dialect):
WITH dedup AS (
SELECT DISTINCT ad_id, campaign_id, view_id, event
FROM ad_events
WHERE event_time >= '2025-08-25' AND event_time < '2025-09-01 24:00:00'
),
imp AS (SELECT COUNT(*) AS imps FROM dedup WHERE event='impression'),
clk AS (SELECT COUNT(*) AS clks FROM dedup WHERE event='click')
SELECT CASE WHEN imps=0 THEN 0.0 ELSE clks*1.0/imps END AS overall_ctr
FROM imp CROSS JOIN clk;
-- By campaign_type
WITH dedup AS (
SELECT DISTINCT ad_id, campaign_id, view_id, event
FROM ad_events
WHERE event_time >= '2025-08-25' AND event_time < '2025-09-01 24:00:00'
),
agg AS (
SELECT campaign_id,
SUM(CASE WHEN event='impression' THEN 1 ELSE 0 END) AS imps,
SUM(CASE WHEN event='click' THEN 1 ELSE 0 END) AS clks
FROM dedup
GROUP BY campaign_id
)
SELECT c.campaign_type,
COALESCE(a.clks,0) AS clicks,
COALESCE(a.imps,0) AS impressions,
CASE WHEN COALESCE(a.imps,0)=0 THEN 0.0 ELSE COALESCE(a.clks,0)*1.0/COALESCE(a.imps,0) END AS ctr
FROM campaigns c
LEFT JOIN agg a ON a.campaign_id = c.campaign_id
ORDER BY c.campaign_type;
Quick Answer: This question evaluates data manipulation and analytics skills, specifically SQL aggregations, deduplication of event logs, time-window filtering, joins to campaign metadata, and safe ratio computation including division-by-zero handling.