You are given three tables:
1) impression
Event-level table of user impressions.
-
impression_id
BIGINT (PK)
-
user_id
BIGINT (FK →
user.user_id
)
-
pin_id
BIGINT (FK →
pin_info.pin_id
)
-
surface
VARCHAR (e.g., 'home', 'search', 'profile', etc.)
-
impression_ts
TIMESTAMP (when the impression happened)
-
country_code
VARCHAR (ISO country code; assume it is populated for the event)
2) user
-
user_id
BIGINT (PK)
-
created_ts
TIMESTAMP
3) pin_info
-
pin_id
BIGINT (PK)
-
category
VARCHAR
Assumptions:
-
Use UTC for date boundaries.
-
“Today” means
DATE(impression_ts) = CURRENT_DATE
.
Task A
For each country, find the pin category with the highest number of impressions today.
Requirements:
-
If there is a tie for the highest impression count, return
all tied categories
for that country.
-
Output columns:
-
country_code
-
category
-
impression_cnt
Task B
For each country, compute the percent of active users who are highly active.
Definitions (within the last 7 days, inclusive of today):
-
Active user
: a user with
≥ 1 impression event
.
-
Highly active user
: a user who satisfies
both
:
-
Active on
≥ 4 distinct days
(based on
DATE(impression_ts)
), AND
-
On
at least one day
, used
≥ 3 distinct surfaces
(based on distinct
surface
values that day).
Requirements:
-
Output columns:
-
country_code
-
active_users
-
highly_active_users
-
pct_highly_active
(as a decimal or percent; specify which in your answer)
-
Clearly handle divide-by-zero when a country has 0 active users in the window.