This question evaluates a candidate's competency in SQL data manipulation and analytics, focusing on joins between event and reference tables, distinct-count aggregation, date-based filtering, and building distributional summaries by user.
You are given two tables about article views.
Table 1: article_views
user_id
INT
article_id
INT
view_date
DATE — UTC calendar date of the view
Table 2: articles
article_id
INT
article_type
STRING
Relationship:
article_views.article_id = articles.article_id
Assumptions:
article_views
represents one article view event.
Write SQL for the following:
2019-01-01
UTC, compute the number of distinct article types viewed by each user.
user_id, num_article_types
num_article_types
, return how many users have that count.
num_article_types, num_users