Compute seller counts and vehicle share
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: medium
Interview Round: Technical Screen
You are given two tables:
1. `listing_interactions`
- `buyer_id` BIGINT
- `seller_id` BIGINT
- `event_date` DATE
- `product_id` BIGINT
- `listing_interactions` INT — number of buyer-seller interactions for that product on that date
2. `dim_all_products`
- `product_id` BIGINT
- `category` STRING
- `snapshot_date` DATE
- `create_date` DATE
- `country` STRING
Assumptions:
- Join the tables on `product_id`.
- All dates are stored in UTC.
- Unless otherwise specified, use all available data.
- For Question 1, define a product as having **multiple interactions** if the total `listing_interactions` summed across all rows for the same `(seller_id, product_id)` is greater than 1.
- For Question 2, define **newly created listings in the US** as products with `country = 'US'` and `create_date` in the last 7 calendar days, inclusive of `CURRENT_DATE`.
- For Question 2, the numerator and denominator should both use `SUM(listing_interactions)`.
Write SQL for the following two tasks:
1. How many sellers have **more than 3 distinct products** that each have multiple interactions?
- Return one row with column: `seller_count`
2. Among newly created US listings from the last 7 days, what percentage of total listing interactions comes from the `vehicle` category?
- Return one row with column: `vehicle_interaction_pct`
- Express the result as a percentage from 0 to 100.
Quick Answer: This question evaluates data manipulation and analytical SQL competencies such as joining tables, aggregating and deduplicating interaction counts, applying date-window filters, and computing category-level percentage metrics.