Assume today is 2025-09-01. Compute the top 3 shops by average daily visibility share over the last 7 days (2025-08-26 to 2025-09-01, inclusive) for US, non-bot traffic.
Definitions
-
Eligible session (per day): a distinct session_id from a US user (country='US') with is_bot=FALSE that had at least one search on that day with any result in the top 10 positions.
-
A shop is visible for a session on a day if it appears at least once in the top 10 in any search within that session on that day. Count each (shop, session_id, day) at most once, even if the shop appears multiple times or in multiple searches that day within the same session.
-
Daily visibility share for a shop = visible_sessions_for_shop_that_day / total_eligible_sessions_that_day.
-
Average daily visibility share = average of the daily visibility shares across the 7 days (treat missing days as zero visibility for that shop if the day had eligible sessions).
-
Tie-breaker: higher average CTR over the same 7-day window, where CTR = clicks/impressions for that shop restricted to top-10 results within eligible sessions. Deduplicate exact duplicate rows in search_results before counting impressions/clicks.
-
Ranking threshold: exclude shops with <100 eligible sessions total across the 7 days; for the sample data below, ignore this threshold but keep all other rules.
Schema (standard SQL)
users(user_id INT PRIMARY KEY, country STRING)
shops(shop_id INT PRIMARY KEY, shop_name STRING, category STRING)
searches(search_id INT PRIMARY KEY, user_id INT, session_id STRING, search_ts TIMESTAMP, query STRING, is_bot BOOLEAN)
search_results(search_id INT, position INT, shop_id INT, clicked BOOLEAN)
Sample data (minimal)
users
| user_id | country |
| 1 | US |
| 2 | US |
| 3 | CA |
shops
| shop_id | shop_name | category |
| 101 | Alpha Apparel | Clothing |
| 102 | Beta Bikes | Sports |
| 103 | Gamma Gadgets | Electronics |
searches
| search_id | user_id | session_id | search_ts | query | is_bot |
| 1001 | 1 | s1 | 2025-08-27 09:00:00 | jackets | FALSE |
| 1002 | 2 | s2 | 2025-08-27 09:05:00 | bike | FALSE |
| 1003 | 3 | s3 | 2025-08-27 09:10:00 | gadgets | FALSE |
| 1004 | 1 | s1 | 2025-08-27 09:12:00 | jackets | FALSE |
| 1005 | 2 | s4 | 2025-09-01 10:00:00 | helmet | FALSE |
| 1006 | 2 | s5 | 2025-09-01 10:02:00 | helmet | TRUE |
search_results (deduplicate exact duplicate rows before use)
| search_id | position | shop_id | clicked |
| 1001 | 1 | 101 | TRUE |
| 1001 | 2 | 103 | FALSE |
| 1002 | 1 | 102 | TRUE |
| 1003 | 1 | 103 | FALSE |
| 1004 | 1 | 101 | FALSE |
| 1004 | 2 | 103 | TRUE |
| 1005 | 1 | 102 | FALSE |
| 1005 | 11 | 103 | TRUE |
| 1005 | 1 | 102 | FALSE | <-- duplicate row to test dedup
| 1006 | 1 | 102 | TRUE |
Tasks
-
Write a single standard-SQL query (CTEs allowed) that returns the top 3 shops by average daily visibility share over 2025-08-26..2025-09-01 for US, non-bot traffic, applying all the rules above. Output: shop_id, shop_name, avg_daily_visibility_share, tie_breaker_avg_ctr (both as decimals rounded to 4 d.p.).
-
Briefly explain how your query avoids double-counting within a session/day and how it handles days with zero eligible sessions.
-
Using just the sample data (and ignoring the 100-session filter), what result would your query return?