Write SQL for noisy A/B launch metrics

Q: Write SQL for noisy A/B launch metrics

This question evaluates SQL-based data manipulation and experimental analytics skills, including attribution of earliest exposures, censoring crossovers, session-based bot qualification, time-windowed conversion and revenue calculations, and exclusion of outage periods in PostgreSQL.

Q: How do I practice SQL interview questions?

PracHub provides an interactive SQL console where you can write and test queries against real database schemas. Get instant feedback and compare your solution with the expected output.

Q: What difficulty level is this coding question?

This is a medium difficulty Data Manipulation (SQL/Python) question, commonly asked during Technical Screen rounds at Chime.

Q: What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Chime during technical interviews.

Question

PostgreSQL. Today is 2025-09-01. You’re given the following schema and toy samples:

users(user_id INT, country TEXT, signup_date DATE, marketing_channel TEXT)
+---------+---------+-------------+------------------+
| user_id | country | signup_date | marketing_channel|
+---------+---------+-------------+------------------+
| 1 | US | 2025-08-12 | email |
| 2 | CA | 2025-08-13 | paid |
| 3 | US | 2025-08-11 | organic |
| 4 | GB | 2025-08-17 | paid |
+---------+---------+-------------+------------------+

exposures(user_id INT, experiment_id TEXT, variant TEXT, exposure_ts TIMESTAMP, source TEXT)
+---------+--------------------+---------+---------------------+--------+
| user_id | experiment_id | variant | exposure_ts | source |
+---------+--------------------+---------+---------------------+--------+
| 1 | tracker_vlaunch_v1 | control | 2025-08-12 10:00 | web |
| 2 | tracker_vlaunch_v1 | B | 2025-08-13 09:35 | email |
| 2 | tracker_vlaunch_v1 | control | 2025-08-14 10:18 | web | <- crossover
| 3 | tracker_vlaunch_v1 | B | 2025-08-11 08:02 | paid |
| 4 | tracker_vlaunch_v1 | B | 2025-08-17 15:10 | web |
+---------+--------------------+---------+---------------------+--------+

sessions(user_id INT, session_start TIMESTAMP, device TEXT, is_bot BOOLEAN)
+---------+---------------------+---------+--------+
| user_id | session_start | device | is_bot |
+---------+---------------------+---------+--------+
| 1 | 2025-08-12 10:00 | ios | false |
| 2 | 2025-08-13 09:30 | web | false |
| 2 | 2025-08-14 10:10 | web | true |
| 3 | 2025-08-11 08:00 | android | false |
| 4 | 2025-08-17 15:00 | web | false |
+---------+---------------------+---------+--------+

orders(order_id INT, user_id INT, order_ts TIMESTAMP, revenue NUMERIC(10,2), refunded BOOLEAN)
+----------+---------+---------------------+---------+----------+
| order_id | user_id | order_ts | revenue | refunded |
+----------+---------+---------------------+---------+----------+
| 501 | 1 | 2025-08-20 12:00 | 99.00 | false |
| 502 | 2 | 2025-08-28 11:05 | 59.00 | true |
| 503 | 3 | 2025-08-13 09:00 | 49.00 | false |
| 504 | 4 | 2025-08-30 13:00 | 129.00 | false |
+----------+---------+---------------------+---------+----------+

Task A — core metrics (CTEs allowed): For experiment_id = 'tracker_vlaunch_v1' and exposures between 2025-08-01 and 2025-08-24, write a single query that:
- Assigns each user to the earliest exposure variant and censors any later crossovers.
- Qualifies users as non-bot if they have at least one session with is_bot = false on or before their earliest exposure.
- Computes, per variant, by country in {US, CA} only: users_exposed, qualified_users, converters_14d (users with an order within 14 days of earliest exposure and order_ts < '2025-09-01'), conv_rate_14d, revenue_14d (sum of non-refunded revenue within the same 14-day window), refund_rate_14d (refunded orders / all orders in window).
- Excludes orders whose order_ts falls in the iOS outage window ['2025-08-10', '2025-08-12') if the user’s earliest qualifying device was ios.

Task B — SRM diagnostic: Write a second query that tests for sample ratio mismatch across variants among qualified users using a chi-square goodness-of-fit test vs. expected 50/50. Output observed_count_control, observed_count_B, expected_each, chi2_stat, and an approximate p_value (you may use built-in ln(), exp(), and power() but no UDFs).

Edge cases to handle: users with no sessions; users with only bot sessions; users with orders before exposure; multiple orders within 14 days; users outside US/CA; and the crossover illustrated for user_id=2.

PracHub · Accepted Answer

This question evaluates SQL-based data manipulation and experimental analytics skills, including attribution of earliest exposures, censoring crossovers, session-based bot qualification, time-windowed conversion and revenue calculations, and exclusion of outage periods in PostgreSQL.

Quick Overview

Quick Overview