Write SQL for daily chats and fast replies
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
You are given a messaging events table that records one row per message sent.
Schema
- messages(
date DATE, -- calendar date of event (UTC)
ts TIMESTAMP, -- event timestamp (UTC)
sender_id BIGINT,
receiver_id BIGINT
)
Sample data (ASCII)
messages
+------------+---------------------+-----------+-------------+
| date | ts | sender_id | receiver_id |
+------------+---------------------+-----------+-------------+
| 2025-08-31 | 2025-08-31 23:59:50 | 101 | 202 |
| 2025-09-01 | 2025-09-01 00:00:30 | 202 | 101 |
| 2025-09-01 | 2025-09-01 09:00:00 | 101 | 303 |
| 2025-09-01 | 2025-09-01 09:00:45 | 303 | 101 |
| 2025-09-01 | 2025-09-01 10:15:00 | 101 | 404 |
| 2025-09-01 | 2025-09-01 11:00:00 | 101 | 505 |
| 2025-09-01 | 2025-09-01 11:05:00 | 101 | 606 |
| 2025-09-01 | 2025-09-01 12:00:00 | 101 | 707 |
| 2025-09-01 | 2025-09-01 12:01:10 | 707 | 101 |
| 2025-09-01 | 2025-09-01 13:00:00 | 808 | 101 |
| 2025-09-01 | 2025-09-01 13:00:20 | 101 | 808 |
+------------+---------------------+-----------+-------------+
Assume "today" is 2025-09-01.
Q1. Return all user_id who chatted with >5 distinct other users on 2025-09-01. Define "chatted with" as exchanging at least one message in either direction on that date. Output: user_id, partner_cnt. Requirements:
- Count distinct counterparties per user considering both sent and received messages.
- Be robust to multiple messages between the same pair and to users both sending and receiving with the same counterparty.
- Avoid double-counting a counterparty.
- Single SQL query preferred; standard SQL (window functions allowed).
Q2. Return distinct sender_id who received at least one reply within 60 seconds for a message they sent on 2025-09-01. Define a reply as a message from the original receiver back to the original sender with reply.ts - sent.ts between 0 and 60 seconds inclusive, and there is no other message between these two users that occurs after the sent message and before the reply. Output: sender_id and optionally count of such fast replies. Requirements:
- Use the exact timestamps (ts), not just the date column.
- Ensure directionality (receiver must reply to sender).
- Correctly handle multiple candidate replies—only the earliest opposing-direction message after each sent message can qualify.
- Edge cases: cross-midnight boundaries, identical timestamps, duplicate rows. Briefly state how your query handles these.
Quick Answer: This question evaluates a candidate's ability to manipulate relational data and reason about temporal event sequences, specifically testing skills in SQL aggregation, distinct counting, deduplication, bidirectional relationship handling, and time-windowed event matching.