Write SQL for initiators and French DAU%
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
You are given the following PostgreSQL tables. Assume all timestamps are UTC and "today" is 2025-09-01. For any reference to "last 7 days," use the inclusive window 2025-08-26 00:00:00 to 2025-09-01 23:59:59. For any reference to "yesterday," use the date 2025-08-31.
Schema:
- users(user_id bigint primary key, country text)
- sessions(event_date date, user_id bigint) -- one or more rows per user per day are possible
- calls(call_id bigint primary key, initiator_id bigint, started_at timestamp, is_group_call boolean)
- call_participants(call_id bigint, user_id bigint) -- includes the initiator and all other participants
Small sample data:
users
+---------+---------+
| user_id | country |
+---------+---------+
| 1 | FR |
| 2 | FR |
| 3 | US |
| 4 | FR |
| 5 | IN |
sessions
+------------+---------+
| event_date | user_id |
+------------+---------+
| 2025-08-31 | 1 |
| 2025-08-31 | 2 |
| 2025-08-31 | 3 |
| 2025-09-01 | 1 |
| 2025-09-01 | 4 |
calls
+----------+--------------+---------------------+---------------+
| call_id | initiator_id | started_at | is_group_call |
+----------+--------------+---------------------+---------------+
| 10 | 1 | 2025-08-26 09:00:00 | false |
| 11 | 2 | 2025-08-31 21:00:00 | true |
| 12 | 3 | 2025-09-01 01:00:00 | false |
| 13 | 1 | 2025-08-20 12:00:00 | true |
| 14 | 4 | 2025-08-31 23:30:00 | true |
call_participants
+----------+---------+
| call_id | user_id |
+----------+---------+
| 10 | 1 |
| 10 | 3 |
| 11 | 2 |
| 11 | 4 |
| 11 | 5 |
| 12 | 3 |
| 12 | 1 |
| 14 | 2 |
| 14 | 4 |
Tasks:
Q1. Return the distinct users who initiated at least one video call (group or 1:1) in the last 7 days (2025-08-26 00:00:00 to 2025-09-01 23:59:59, inclusive). Output columns: user_id.
Q2. Compute the percentage of daily active users from France who were on a video call yesterday (2025-08-31). Denominator: distinct users with a sessions.event_date = '2025-08-31' and users.country = 'FR'. Numerator: among those French users, the distinct users who appear in call_participants for calls whose DATE(started_at) = '2025-08-31'. Output columns: ref_date (date), dau_france (int), on_call_france (int), pct_on_call_france (numeric, percentage rounded to 2 decimals). Treat multiple sessions per user correctly (dedupe), and include users who both initiated and participated only once in the numerator. Write a single SQL script with CTEs that answers both Q1 and Q2, and clearly states any timezone/local-date assumptions.
Quick Answer: This question evaluates SQL proficiency for time-based event analysis, covering joins, deduplication, CTEs, date/time and timezone handling, and aggregated metric calculation such as DAU and participation rates.