You work on a microblogging app (Twitter-like). Posts and replies are stored in the tables below.
all_postpost_id
BIGINT
PK
post_author_id
BIGINT
FK → user.user_id
(author of this post/reply)
post_creation_time
TIMESTAMP (assume UTC)
post_type
VARCHAR (values:
'post'
for an original post,
'reply'
for a reply)
post_content
TEXT
post_parent_id
BIGINT NULL
FK → all_post.post_id
post_type='reply'
,
post_parent_id
points to the post being replied to.
post_type='post'
,
post_parent_id
is NULL.
useruser_id
BIGINT
PK
age
INT
country
VARCHAR (e.g.,
'US'
,
'CA'
, ...)
For each original post (post_type='post'), consider replies to that post where the reply was created within 7 days (inclusive) after the original post’s post_creation_time.
Return the number of distinct users (post_author_id of the original post) who have at least one original post that received ≥ 2 replies within that 7-day window.
Output:
num_users
A user “receives a reply” if any of their original posts received a reply.
Compute the percentage of users who received replies from at least two distinct repliers where both repliers are US users (user.country='US'). Repliers are the authors of the reply posts.
Output:
pct_users
(as a percentage from 0 to 100)
Clarification: Use as denominator the number of distinct users who received at least one reply (i.e., exclude users who never received a reply).