This question evaluates proficiency in SQL data manipulation—principally joins, aggregations, grouping, filtering, date arithmetic, and safe handling of division—and targets the Data Manipulation (SQL/Python) domain for a Data Scientist role; level of abstraction: practical application.
Assume the following marketplace tables.
Table: listing_interactions
buyer_id
STRING
seller_id
STRING
product_id
STRING
interaction_date
DATE
listing_interactions
INT — number of buyer-seller interactions for that product on that date
Table: dim_all_products
product_id
STRING
category
STRING
snapshot_date
DATE
create_date
DATE
country
STRING
Assumptions:
listing_interactions.product_id
joins to
dim_all_products.product_id
.
product_id
in
dim_all_products
represents one listing/product.
listing_interactions
, summed across all rows, is greater than 1.
create_date
is between
current_date - 6 days
and
current_date
, inclusive.
Write SQL to answer both questions below:
sellers_with_gt_3_multi_interaction_products
.
vehicle
category?
vehicle_interaction_pct
.
sum(listing_interactions for US listings created in the last 7 days where category = 'vehicle') / sum(listing_interactions for all US listings created in the last 7 days)
.