Problem (SQL)
You work on a music app and want to recommend new friend connections based on listening similarity.
Tables
Assume the following schemas:
listens
-
user_id
(BIGINT)
-
song_id
(BIGINT)
-
listened_at
(TIMESTAMP)
friendships (undirected friendship; may be stored as one row per pair)
-
user_id_1
(BIGINT)
-
user_id_2
(BIGINT)
-
created_at
(TIMESTAMP)
Definitions / assumptions
-
Two users have an
overlapping song
on a given day if
both
listened to the
same
song_id
on that
same calendar date
.
-
Count
distinct
overlapping songs per pair per day (ignore repeat listens of the same song).
-
Treat pairs as
unordered
(e.g., output
(min_user, max_user)
).
-
A pair should be recommended only if they are
not already friends
(no matching pair in
friendships
, regardless of ordering).
-
Unless stated otherwise, treat
listened_at
as UTC when deriving the calendar
date
.
Task
Write a SQL query to return all recommended pairs of users (user_id_a, user_id_b) such that there exists at least one day where they listened to more than 3 of the same songs on that day.
Output
Return:
-
user_id_a
-
user_id_b
-
listen_date
-
overlap_song_count
Only include rows where overlap_song_count > 3 and the pair is not already in friendships.