Compute 7-day views and reactions by relationship
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Use the schemas and sample data below to answer two tasks. Assume dates are strings in 'YYYY-MM-DD'. Treat "today" as 2025-09-01; "last/past 7 days" means ds between 2025-08-26 and 2025-09-01 inclusive. If a viewer has multiple views of the same post on the same day, deduplicate by keeping the row with the maximum duration.
Schemas:
- info_stream_views(post_id BIGINT, viewer_id BIGINT, relationship STRING CHECK (relationship IN ('Friend','Followee','Unconnected')), duration DOUBLE, ds STRING)
- post_reactions(post_id INTEGER, viewer_id INTEGER, post_action STRING CHECK (post_action IN ('like','comment','reshare')), ds STRING)
ASCII sample tables:
info_stream_views
post_id | viewer_id | relationship | duration | ds
101 | 10 | Friend | 45 | 2025-08-26
101 | 10 | Friend | 75 | 2025-08-27
101 | 11 | Unconnected | 61 | 2025-08-28
102 | 12 | Unconnected | 30 | 2025-08-28
103 | 13 | Followee | 90 | 2025-08-29
104 | 14 | Unconnected | 120 | 2025-08-30
104 | 14 | Unconnected | 20 | 2025-08-30
105 | 10 | Unconnected | 59 | 2025-09-01
post_reactions
post_id | viewer_id | post_action | ds
101 | 10 | like | 2025-08-27
101 | 11 | comment | 2025-08-28
104 | 14 | reshare | 2025-08-30
104 | 14 | like | 2025-08-30
103 | 13 | like | 2025-08-29
105 | 10 | like | 2025-09-01
Tasks:
Q1) Write a single SQL query that returns the count of DISTINCT post_id that had at least one Unconnected view with duration > 60 seconds within 2025-08-26..2025-09-01. Ensure a post is counted once even if multiple qualifying viewers/days exist.
Q2) For the same 7-day window, compute two numbers: avg_reactions_per_post for Friend and for Unconnected. Define reactions-to-relationship by joining reactions to views on (post_id, viewer_id, ds) and, if multiple same-day views exist, attribute the reaction to the view with the longest duration. For each relationship group G in {'Friend','Unconnected'}: numerator = total count of reactions made by viewers labeled G; denominator = count of DISTINCT post_id that had at least one view by viewers labeled G in the window (include posts with zero reactions in the denominator). Return a two-row result set: relationship, avg_reactions_per_post (as a decimal). Use window functions or CTEs to handle deduping and joins, and be robust to posts that appear in both groups.
Quick Answer: This question evaluates skills in SQL-based data manipulation—specifically deduplication by longest duration, join-based attribution of reactions to views, window functions, distinct counts, aggregations, and date-range filtering within the Data Manipulation (SQL/Python) domain.