Compute reply-based engagement metrics

Q: Compute reply-based engagement metrics

This is a Data Manipulation (SQL/Python) interview question from Tools For Humanity for Data Scientist roles. View the full question and solution on PracHub.

Q: How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master data manipulation (sql/python) interviews.

Question

You work on a microblogging app (Twitter-like). Posts and replies are stored in the tables below.

Tables

`all_post`

post_id BIGINT PK
post_author_id BIGINT FK → user.user_id (author of this post/reply)
post_creation_time TIMESTAMP (assume UTC)
post_type VARCHAR (values: 'post' for an original post, 'reply' for a reply)
post_content TEXT
post_parent_id BIGINT NULL FK → all_post.post_id
- For post_type='reply' , post_parent_id points to the post being replied to.
- For post_type='post' , post_parent_id is NULL.

`user`

user_id BIGINT PK
age INT
country VARCHAR (e.g., 'US' , 'CA' , ...)

Task A

For each original post (post_type='post'), consider replies to that post where the reply was created within 7 days (inclusive) after the original post’s post_creation_time.

Return the number of distinct users (post_author_id of the original post) who have at least one original post that received ≥ 2 replies within that 7-day window.

Output:

num_users

Task B

A user “receives a reply” if any of their original posts received a reply.

Compute the percentage of users who received replies from at least two distinct repliers where both repliers are US users (user.country='US'). Repliers are the authors of the reply posts.

Output:

pct_users (as a percentage from 0 to 100)

Clarification: Use as denominator the number of distinct users who received at least one reply (i.e., exclude users who never received a reply).

Compute reply-based engagement metrics

Tables

all_post

user

Task A

Task B

Comments (0)

`all_post`

`user`