Compute Heavy-Caller Percentages
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: hard
Interview Round: Technical Screen
You are given two tables that track voice calls and daily active users for a messaging app.
**Table: `call_events`**
- `call_id` BIGINT — unique call identifier
- `event_time` TIMESTAMP — time when the call started, stored in UTC
- `caller_user_id` BIGINT — user who initiated the call
- `recipient_user_id` BIGINT — user who received the call
- `country_code` STRING — country of the call session, using ISO-style country codes such as `GB`
**Table: `daily_active_users`**
- `activity_date` DATE — date of user activity in UTC
- `user_id` BIGINT — active user identifier
- `country_code` STRING — user country on that date
Assume:
- The analysis date is the most recent date available in `daily_active_users`.
- The phrase **"last 7 days"** means the 7-day window ending on the analysis date, inclusive.
- For call-volume calculations, each row in `call_events` represents one completed call attempt.
- For question 2, a user's total call count includes both roles: calls made as caller and calls received as recipient.
- `GB` refers to users in Great Britain.
Write SQL for the following two tasks:
1. Compute the **percentage of distinct callers** who made **at least 20 calls** in the last 7 days.
- A "caller" is defined using `caller_user_id` only.
- Output columns:
- `analysis_date` DATE
- `pct_callers_ge_20` DOUBLE
2. Compute the **percentage of GB daily active users** on the analysis date who had **at least 50 total calls** in the last 7 days.
- Total calls = calls as `caller_user_id` + calls as `recipient_user_id`.
- The denominator should be GB users who are active on the analysis date in `daily_active_users`.
- Output columns:
- `analysis_date` DATE
- `pct_gb_dau_ge_50_calls` DOUBLE
Quick Answer: This question evaluates proficiency in time-windowed data aggregation, user-level counting and percentage calculations using SQL or Python for analytics tasks.