This question evaluates SQL proficiency—specifically joins, filtering, type casting, temporal aggregation, window functions or deduplication, and deterministic tie-breaking—by requiring derivation of first non-creator view times and per-creator collaborator counts from event tables.
You are given event data about users viewing design files.
Assume the following tables:
files(file_id, creator_id)
creator_id
is a VARCHAR user identifier.
file_views(file_id, user_id, viewed_at)
user_id
is an INTEGER user identifier.
user_id
viewed
file_id
at time
viewed_at
.
A “share” is inferred when someone other than the creator views a file.
Write a SQL query that returns, for each creator, the first time they ever shared a file they created (i.e., the earliest viewed_at where a non-creator viewed one of their files).
Output columns:
creator_id
first_shared_at
Notes:
creator_id
(VARCHAR) and
user_id
(INTEGER) are different types; handle casting appropriately.
For each file, define the “first person it was shared with” as the earliest non-creator viewer of that file (based on viewed_at).
Build a query/model that, for each creator, finds their closest collaborator, defined as the user who was the “first person shared with” for the creator’s files the most times.
Output columns:
creator_id
closest_collaborator_user_id
collaborator_value
(the number of files for which this collaborator was the first non-creator viewer)
Tie-breaking: if multiple collaborators have the same collaborator_value, choose the one with the earliest first-share timestamp across those tied records (and if still tied, the smallest user id).