This question evaluates proficiency in SQL-based data manipulation and analytics, specifically aggregations, joins, date-based grouping, ratio calculations, and windowed moving averages to compute safety and engagement metrics.
You are working with a conversational product that has users, AI characters, and conversations. Write SQL queries for each of the tasks below.
Assume UTC timestamps and that day means DATE(created_at).
usersuser_id
BIGINT PRIMARY KEY
created_at
TIMESTAMP
characterscharacter_id
BIGINT PRIMARY KEY
created_by_user_id
BIGINT REFERENCES
users(user_id)
created_at
TIMESTAMP
safety_flag
BOOLEAN
TRUE
= safe,
FALSE
= unsafe
conversationsEach row represents a user engaging in a conversation associated with a character.
conversation_id
BIGINT
character_id
BIGINT REFERENCES
characters(character_id)
user_id
BIGINT REFERENCES
users(user_id)
created_at
TIMESTAMP
safety_flag
BOOLEAN
TRUE
= safe,
FALSE
= unsafe
Notes:
conversation_id
may appear multiple times if multiple users engaged.
Define “top” as the characters with the largest number of conversation engagements (rows in conversations) across all time.
Return:
character_id
engagement_count
Order by engagement_count DESC and return only the top 100.
Compute the overall ratio of unsafe characters among all characters.
Return:
unsafe_character_ratio
(a decimal between 0 and 1)
Where:
characters.safety_flag = FALSE
For each day (based on characters.created_at), compute:
daily_unsafe_pct
= (# characters created that day that are unsafe) / (total # characters created that day)
daily_unsafe_pct
(including the current day).
Return:
day
daily_unsafe_pct
daily_unsafe_pct_ma7
Order by day ascending.
Create a 2×2 breakdown of conversation engagements by:
characters.safety_flag
)
conversations.safety_flag
)
Return:
character_safety_flag
conversation_safety_flag
engagement_count
For each day (based on conversations.created_at), compute metrics related to users who engaged in unsafe conversations (conversations.safety_flag = FALSE).
Return:
day
unsafe_conversation_engagements
(count of rows where
conversations.safety_flag = FALSE
)
unsafe_users
(distinct users engaging in unsafe conversations)
total_engagements
(all conversation rows that day)
total_users
(distinct users that day)
unsafe_engagement_ratio
=
unsafe_conversation_engagements / total_engagements
unsafe_user_ratio
=
unsafe_users / total_users
Order by day ascending.