Write SQL to analyze Group Calls adoption
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Write SQL (assume PostgreSQL) to analyze Group Calls adoption and cannibalization. Use this schema and sample data.
Schema:
- users(user_id INT PRIMARY KEY, signup_date DATE, country TEXT)
- experiments(experiment_name TEXT, user_id INT, variant TEXT CHECK (variant IN ('treatment','control')), exposure_date DATE)
- calls(call_id INT PRIMARY KEY, started_at TIMESTAMP, is_group_call BOOLEAN, creator_user_id INT)
- call_participants(call_id INT, user_id INT, joined_at TIMESTAMP, left_at TIMESTAMP)
- friendships(user_id INT, friend_user_id INT, since_date DATE)
Sample rows:
users
+---------+-------------+---------+
| user_id | signup_date | country |
+---------+-------------+---------+
| 1 | 2025-06-15 | US |
| 2 | 2025-07-20 | US |
| 3 | 2025-08-05 | CA |
| 4 | 2025-08-10 | US |
| 5 | 2025-08-12 | US |
+---------+-------------+---------+
experiments
+----------------+---------+----------+---------------+
| experiment_name| user_id | variant | exposure_date |
+----------------+---------+----------+---------------+
| grp_calls_v1 | 1 | treatment| 2025-08-18 |
| grp_calls_v1 | 2 | control | 2025-08-18 |
| grp_calls_v1 | 3 | treatment| 2025-08-19 |
| grp_calls_v1 | 4 | control | 2025-08-19 |
| grp_calls_v1 | 5 | treatment| 2025-08-20 |
+----------------+---------+----------+---------------+
calls
+---------+---------------------+---------------+------------------+
| call_id | started_at | is_group_call | creator_user_id |
+---------+---------------------+---------------+------------------+
| 100 | 2025-08-26 10:00:00 | true | 1 |
| 101 | 2025-08-26 10:05:00 | false | 2 |
| 102 | 2025-08-27 08:00:00 | true | 3 |
| 103 | 2025-08-27 09:00:00 | false | 4 |
+---------+---------------------+---------------+------------------+
call_participants
+---------+---------+---------------------+---------------------+
| call_id | user_id | joined_at | left_at |
+---------+---------+---------------------+---------------------+
| 100 | 1 | 2025-08-26 10:00:00 | 2025-08-26 10:30:00 |
| 100 | 2 | 2025-08-26 10:02:00 | 2025-08-26 10:10:00 |
| 100 | 5 | 2025-08-26 10:04:00 | 2025-08-26 10:20:00 |
| 101 | 2 | 2025-08-26 10:05:00 | 2025-08-26 10:25:00 |
| 102 | 3 | 2025-08-27 08:00:00 | 2025-08-27 08:25:00 |
| 102 | 4 | 2025-08-27 08:03:00 | 2025-08-27 08:20:00 |
| 103 | 4 | 2025-08-27 09:00:00 | 2025-08-27 09:10:00 |
+---------+---------+---------------------+---------------------+
friendships
+---------+----------------+------------+
| user_id | friend_user_id | since_date |
+---------+----------------+------------+
| 1 | 2 | 2025-07-01 |
| 2 | 5 | 2025-08-15 |
| 3 | 4 | 2025-08-12 |
+---------+----------------+------------+
Tasks (use week_of('2025-08-25'..'2025-08-31') as the experiment week; pre-period is '2025-08-18'..'2025-08-24'; only include users exposed on or before 2025-08-24): (1) Adoption: compute, by country, the share of treatment users who either started or joined at least one group call in the experiment week. (2) Cannibalization: compute a difference-in-differences estimate of per-user 1:1 call count (is_group_call=false) between treatment and control (experiment week minus pre-period). (3) Interference check: list top 5 call_ids in the experiment week where participants have mixed variants (both treatment and control present), with counts by variant. Provide SQL for each task and ensure no double-counting of users across calls.
Quick Answer: This question evaluates SQL-based data manipulation and experiment analysis skills, including cohort selection, time-window filtering, joins, aggregations, and difference-in-differences style comparisons using a PostgreSQL event schema.