Write SQL for 7-day WhatsApp call metrics
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Onsite
Today is fixed as 2025-09-01. Using PostgreSQL, write a single query that returns one row per UTC calendar date for the last 7 days inclusive of today (2025-08-26 through 2025-09-01), with columns: date_utc, calls_initiated, completion_rate, drop_rate, median_duration_seconds. Rules: (1) Only count calls where neither caller nor callee is an employee (users.is_employee=false); (2) A call’s date is based on calls.started_at (UTC), even if it ends next day; (3) completion_rate = completed / calls_initiated, drop_rate = dropped / calls_initiated; (4) Median duration is over completed calls only, computed as ended_at − started_at in whole seconds; exclude durations <= 0; (5) Trust calls.outcome over call_events if inconsistent; (6) Do not double-count due to duplicate events; (7) Return exactly 7 rows; (8) Use percentile_cont(0.5) for median. Schema and tiny samples below (UTC timestamps):
users(user_id INT PRIMARY KEY, is_employee BOOLEAN, country TEXT)
calls(call_id INT PRIMARY KEY, caller_id INT, callee_id INT, started_at TIMESTAMP, ended_at TIMESTAMP, outcome TEXT CHECK (outcome IN ('completed','dropped','failed')), network_type TEXT)
call_events(call_id INT, event_time TIMESTAMP, event_type TEXT)
Sample rows:
users
| user_id | is_employee | country |
| 1 | false | US |
| 2 | false | US |
| 3 | true | US |
| 4 | false | IN |
calls
| call_id | caller_id | callee_id | started_at | ended_at | outcome | network_type |
| 10 | 1 | 2 | 2025-08-26 10:00:00 | 2025-08-26 10:05:00 | completed | wifi |
| 11 | 1 | 3 | 2025-08-27 12:00:00 | 2025-08-27 12:01:30 | dropped | lte |
| 12 | 4 | 1 | 2025-09-01 21:59:00 | 2025-09-01 22:10:00 | completed | 3g |
| 13 | 2 | 4 | 2025-08-30 00:00:00 | 2025-08-30 00:00:00 | failed | unknown |
call_events
| call_id | event_time | event_type |
| 10 | 2025-08-26 10:00:00 | start |
| 10 | 2025-08-26 10:05:00 | end |
| 11 | 2025-08-27 12:00:00 | start |
| 11 | 2025-08-27 12:01:30 | drop |
| 11 | 2025-08-27 12:02:00 | reconnect |
| 12 | 2025-09-01 21:59:00 | start |
| 12 | 2025-09-01 22:10:00 | end |
| 13 | 2025-08-30 00:00:00 | start |
Be careful about: excluding employee-linked calls (either side), zero/negative durations, and ensuring dates with zero activity still appear with zeros/nulls.
Quick Answer: This question evaluates SQL-based data manipulation and aggregation skills, specifically time-based grouping, filtering by user attributes, deduplication, handling of edge-case durations, and computation of rates and medians from transactional call and event tables.