SQL on ad safety. Assume the following schema and sample rows. Use ANSI SQL. Today is 2025-09-01; interpret “last 7 days” as 2025-08-26 00:00:00 to 2025-09-01 23:59:59 UTC.
Tables:
Sample data: advertisers +---------------+-----------+ | advertiser_id | name | +---------------+-----------+ | 101 | Alpha Co | | 102 | Beta LLC | | 103 | Gamma Inc | +---------------+-----------+
ads +-------+---------------+---------------------+ | ad_id | advertiser_id | created_at | +-------+---------------+---------------------+ | 1 | 101 | 2025-08-15 10:00:00 | | 2 | 101 | 2025-08-20 11:00:00 | | 3 | 102 | 2025-08-25 12:00:00 | | 4 | 103 | 2025-08-28 09:00:00 | +-------+---------------+---------------------+
ad_page_visits +----------+------+---------+---------------------+ | visit_id | ad_id| user_id | visited_at | +----------+------+---------+---------------------+ | 1 | 1 | 1001 | 2025-08-26 10:00:00 | | 2 | 1 | 1002 | 2025-08-26 10:05:00 | | 3 | 1 | 1003 | 2025-08-27 08:00:00 | | 4 | 2 | 1001 | 2025-08-28 12:00:00 | | 5 | 2 | 1004 | 2025-08-29 13:00:00 | | 6 | 3 | 1002 | 2025-08-30 14:00:00 | | 7 | 3 | 1005 | 2025-08-31 15:00:00 | | 8 | 4 | 1006 | 2025-08-31 20:00:00 | | 9 | 4 | 1002 | 2025-09-01 09:00:00 | | 10 | 4 | 1007 | 2025-09-01 10:30:00 | +----------+------+---------+---------------------+
ad_reports +-----------+------+---------+---------------------+-----------+ | report_id | ad_id| user_id | reported_at | reason | +-----------+------+---------+---------------------+-----------+ | 500 | 1 | 1002 | 2025-08-26 10:06:00 | misleading| | 501 | 1 | 1003 | 2025-08-27 08:02:00 | offensive | | 502 | 2 | 1004 | 2025-08-29 13:05:00 | spam | | 503 | 3 | 1005 | 2025-08-31 15:05:00 | scam | | 504 | 4 | 1006 | 2025-08-31 20:02:00 | offensive | +-----------+------+---------+---------------------+-----------+
Tasks: A) Define “top bad advertiser” as one meeting all of: at least 1,000 page visits and at least 50 unique reporters in the last 7 days; rank advertisers by report_rate = total_reports / total_visits (descending). Write a single query that outputs: advertiser_id, total_visits, total_reports, unique_reporters, report_rate, rank, keeping only the top 5 ranks. Use window functions (RANK or DENSE_RANK) and deterministic tiebreakers: total_reports DESC, advertiser_id ASC. B) Modify the query so that if an advertiser has multiple ads, you also output the worst ad_id per advertiser (highest ad-level report_rate over the last 7 days, with at least 100 visits), breaking ties by total_reports DESC then ad_id ASC. C) Return a daily leaderboard (one row per advertiser per day in the last 7 days) with that day’s rank by report_rate, but only over advertisers having ≥200 visits that day. Ensure advertisers with zero reports still appear (rate = 0) and days with no visits are excluded. D) Explain in comments how your query avoids double-counting when a user both visits and reports in the same minute, and how it handles reports without a corresponding visit row within the window.