Write SQL/pandas for KPI anomaly

Q: How do I practice SQL interview questions?

PracHub provides an interactive SQL console where you can write and test queries against real database schemas. Get instant feedback and compare your solution with the expected output.

Q: What difficulty level is this coding question?

This is a medium difficulty Data Manipulation (SQL/Python) question, commonly asked during Onsite rounds at Meta.

Q: What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Meta during technical interviews.

Question

Write SQL (and outline equivalent pandas) for a KPI anomaly investigation. Assume today = '2025-09-01'.
Schema:
Users(user_id INT, country TEXT, signup_date DATE)
Posts(post_id INT, user_id INT, post_type TEXT /* 'friend','page','event' */, created_at DATE)
Likes(like_id INT, user_id INT, post_id INT, created_at DATE)
FriendRequests(sender_id INT, receiver_id INT, created_at DATE, status TEXT /* 'sent','accepted','declined' */)
DailyActiveUsers(date DATE, country TEXT, dau INT)
Events(date DATE, outage_flag INT)
Sample rows:
Users
user_id | country | signup_date
1 | US | 2025-07-15
2 | US | 2025-08-20
3 | IN | 2025-08-01
4 | BR | 2025-07-01
5 | US | 2025-08-30
Posts
post_id | user_id | post_type | created_at
10 | 1 | friend | 2025-08-31
11 | 2 | page | 2025-08-31
12 | 3 | friend | 2025-09-01
13 | 4 | event | 2025-09-01
14 | 2 | friend | 2025-09-01
Likes
like_id | user_id | post_id | created_at
100 | 1 | 12 | 2025-09-01
101 | 2 | 10 | 2025-09-01
102 | 3 | 14 | 2025-09-01
103 | 4 | 10 | 2025-08-25
104 | 5 | 12 | 2025-09-01
FriendRequests
sender_id | receiver_id | created_at | status
1 | 2 | 2025-08-20 | accepted
2 | 3 | 2025-08-28 | sent
3 | 4 | 2025-08-29 | accepted
2 | 5 | 2025-08-31 | accepted
5 | 1 | 2025-09-01 | sent
DailyActiveUsers
date | country | dau
2025-08-25 | US | 3
2025-08-25 | IN | 1
2025-08-25 | BR | 1
2025-09-01 | US | 3
2025-09-01 | IN | 1
2025-09-01 | BR | 1
Events
date | outage_flag
2025-09-01 | 0
Tasks:
(a) For each country and post_type on 2025-09-01, compute Likes-per-DAU and its percent change versus the median of the same weekday over the previous 8 weeks, excluding dates where outage_flag=1. Use window functions (e.g., PERCENTILE_CONT) and ensure users are counted once per day for DAU via DailyActiveUsers.
(b) Flag the top 3 countries with the largest declines (≤ −10%) and, for each, attribute the decline between new users (signup_date ≥ '2025-08-02') and existing users. Return country, post_type, pct_change, share_of_decline_from_new_users.
(c) For those flagged countries, compute the Friend Request acceptance rate in the last 14 days (2025-08-19 to 2025-09-01) and compare with the prior 14 days. Output the absolute and relative change, using window functions rather than correlated subqueries.
(d) Provide a high-level pandas approach (groupby, merge, rolling/expanding, quantile) mirroring your SQL.
Edge cases to handle explicitly: users with multiple Likes on the same day, countries with sparse DAU, missing DAU rows, and time zones (assume UTC).

PracHub · Accepted Answer

This question evaluates proficiency with SQL window functions and equivalent pandas operations for KPI anomaly detection, covering baseline computation using same-weekday medians over eight weeks, per-country and per-post_type Likes-per-DAU metrics with deduplication of users for DAU, outage filtering, attribution of declines between new and existing users, and time-series comparisons like friend-request acceptance rates while explicitly handling duplicates, sparse or missing DAU rows, and timezone issues. It is commonly asked in Data Manipulation (SQL/Python) interviews to assess practical data engineering and analytical skills, primarily testing practical application with an underlying need for conceptual understanding of baselining, attribution, and data-quality edge cases.

Quick Overview

Quick Overview