Compute shop visibility and intent metrics in SQL
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Schema (PostgreSQL). Tables:
users(user_id)
shops(shop_id, shop_name, merchant_type)
posts(post_id, shop_id, is_shoppable BOOLEAN, created_at TIMESTAMP)
impressions(user_id, post_id, impression_time TIMESTAMP)
clicks(user_id, post_id, click_time TIMESTAMP, click_type TEXT) -- e.g., 'merchant','like','comment'
Sample data (subset, UTC):
users
+---------+
| user_id |
+---------+
| 1 |
| 2 |
| 3 |
| 4 |
+---------+
shops
+---------+-----------+---------------+
| shop_id | shop_name | merchant_type |
+---------+-----------+---------------+
| 100 | Alpha | brand |
| 101 | Beta | marketplace |
+---------+-----------+---------------+
posts
+---------+---------+--------------+---------------------+
| post_id | shop_id | is_shoppable | created_at |
+---------+---------+--------------+---------------------+
| 10 | 100 | 1 | 2025-08-30 12:00:00 |
| 11 | 100 | 0 | 2025-08-31 09:00:00 |
| 12 | 101 | 1 | 2025-09-01 07:30:00 |
+---------+---------+--------------+---------------------+
impressions
+---------+---------+---------------------+
| user_id | post_id | impression_time |
+---------+---------+---------------------+
| 1 | 10 | 2025-09-01 10:00:00 |
| 1 | 10 | 2025-09-01 10:05:00 |
| 2 | 12 | 2025-09-01 12:00:00 |
| 3 | 10 | 2025-09-01 09:00:00 |
| 3 | 11 | 2025-09-01 11:00:00 |
| 4 | 12 | 2025-09-01 08:30:00 |
+---------+---------+---------------------+
clicks
+---------+---------+---------------------+-------------+
| user_id | post_id | click_time | click_type |
+---------+---------+---------------------+-------------+
| 1 | 10 | 2025-09-01 10:06:00 | merchant |
| 2 | 12 | 2025-09-01 12:20:00 | like |
| 3 | 11 | 2025-09-01 11:05:00 | merchant |
| 4 | 12 | 2025-09-01 08:45:00 | merchant |
+---------+---------+---------------------+-------------+
Tasks (write SQL; use window functions and CASE WHEN; be explicit about deduping and time zones; treat "today" as 2025-09-01):
A) For each shop and calendar date, compute ShopVisibilityScore = 100 * (distinct users who saw ≥1 impression of a shoppable post from that shop that date) / (distinct users with any impression that date). Ensure multiple impressions of the same shop per user per date are deduped. Also report that shop’s CTR that date = merchant clicks on that shop’s shoppable posts / impressions of those posts. Return shop_id, dt, visibility_score, ctr, impressions, unique_viewers.
B) Compute a 7-day rolling visibility_score and CTR per shop ending on 2025-09-01 (inclusive), using windows that avoid leakage across shops, and show the absolute and percentage change vs. the prior 7-day window.
C) Define an implementable PurchaseIntentRate using only the provided tables. Example: among users who had a first shoppable impression for a given shop on a date, the fraction who clicked a 'merchant' link within 60 minutes of that first impression. Write SQL to compute this per shop per date, carefully handling multiple posts, multiple clicks, and users who clicked on non-shoppable posts (should they count or be excluded? justify and implement).
Quick Answer: This question evaluates SQL-based data manipulation and analytical competencies in the Data Manipulation (SQL/Python) domain, focusing on deduplication, time-zone-aware timestamp handling, window functions, aggregation, and construction of visibility, CTR, and intent metrics for shops.