You are given article view events and article metadata.
article_views(user_id INT, article_id INT, view_date DATE)
-
One row per article view event.
articles(article_id INT, article_type VARCHAR)
-
Maps each article to its type.
Join key: article_views.article_id = articles.article_id.
Assume view_date is a calendar date and timezone handling is not needed.
Tasks:
-
For
2019-01-01
, return the number of distinct article types viewed by each user.
-
Build a histogram of the number of distinct article types viewed per user across the full dataset. In other words, for each possible
num_article_types
, return how many users viewed exactly that many distinct article types.
Definitions:
-
If a user views multiple articles of the same
article_type
, count that type only once for that user.
-
For the histogram in task 2, use all available dates unless otherwise stated.
Required output:
-
Task 1:
user_id, num_article_types
-
Task 2:
num_article_types, num_users