You are analyzing discussions on a social platform.
Tables
all_post
-
post_id
(BIGINT, PK)
-
post_author_id
(BIGINT, FK →
user.user_id
)
-
post_creation_time
(TIMESTAMP, UTC)
-
post_type
(VARCHAR) — e.g.,
'root'
for an original post and
'reply'
for a reply
-
post_content
(TEXT)
-
post_parent_id
(BIGINT, nullable) — if this is a reply, points to the parent post’s
post_id
;
NULL
for root posts
user
-
user_id
(BIGINT, PK)
-
age
(INT)
-
country
(VARCHAR) — e.g.,
'US'
Definitions / Assumptions
-
A
reply
is a row in
all_post
with
post_parent_id IS NOT NULL
.
-
A user
receives a reply
when someone creates a reply whose
post_parent_id
points to a post authored by that user.
-
“Within 7 days” means
reply.post_creation_time
is in
[parent.post_creation_time, parent.post_creation_time + INTERVAL '7 days']
(inclusive). Use UTC.
Tasks
-
Count users who received at least two replies within 7 days
-
Count distinct
post_author_id
(the parent post’s author) who have
at least one parent post
that received
≥ 2 replies
within 7 days of that parent post’s creation.
-
Output: a single row with
num_users
.
-
Percentage of users who received replies from ≥ 2 distinct US users
-
For each recipient user (the parent post’s author), look at all replies to their posts (no time window unless you choose to state one).
-
A recipient qualifies if they received replies from
at least 2 distinct repliers
where the replier’s
user.country = 'US'
.
-
Denominator: all users who received
at least 1 reply
(from anyone).
-
Output: a single row with
pct_users
(as a decimal or percent; specify which in your query).
Note: Task (1) and task (2) have different qualifying criteria; be careful not to mix the populations.