You work on a microblogging app (Twitter-like). Posts and replies are stored in the tables below.
Tables
all_post
-
post_id
BIGINT
PK
-
post_author_id
BIGINT
FK → user.user_id
(author of this post/reply)
-
post_creation_time
TIMESTAMP (assume UTC)
-
post_type
VARCHAR (values:
'post'
for an original post,
'reply'
for a reply)
-
post_content
TEXT
-
post_parent_id
BIGINT NULL
FK → all_post.post_id
-
For
post_type='reply'
,
post_parent_id
points to the post being replied to.
-
For
post_type='post'
,
post_parent_id
is NULL.
user
-
user_id
BIGINT
PK
-
age
INT
-
country
VARCHAR (e.g.,
'US'
,
'CA'
, ...)
Task A
For each original post (post_type='post'), consider replies to that post where the reply was created within 7 days (inclusive) after the original post’s post_creation_time.
Return the number of distinct users (post_author_id of the original post) who have at least one original post that received ≥ 2 replies within that 7-day window.
Output:
Task B
A user “receives a reply” if any of their original posts received a reply.
Compute the percentage of users who received replies from at least two distinct repliers where both repliers are US users (user.country='US'). Repliers are the authors of the reply posts.
Output:
-
pct_users
(as a percentage from 0 to 100)
Clarification: Use as denominator the number of distinct users who received at least one reply (i.e., exclude users who never received a reply).