PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates SQL-based data manipulation and experimental analytics skills, including attribution of earliest exposures, censoring crossovers, session-based bot qualification, time-windowed conversion and revenue calculations, and exclusion of outage periods in PostgreSQL.

  • Medium
  • Chime
  • Data Manipulation (SQL/Python)
  • Data Scientist

Write SQL for noisy A/B launch metrics

Company: Chime

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

PostgreSQL. Today is 2025-09-01. You’re given the following schema and toy samples: users(user_id INT, country TEXT, signup_date DATE, marketing_channel TEXT) +---------+---------+-------------+------------------+ | user_id | country | signup_date | marketing_channel| +---------+---------+-------------+------------------+ | 1 | US | 2025-08-12 | email | | 2 | CA | 2025-08-13 | paid | | 3 | US | 2025-08-11 | organic | | 4 | GB | 2025-08-17 | paid | +---------+---------+-------------+------------------+ exposures(user_id INT, experiment_id TEXT, variant TEXT, exposure_ts TIMESTAMP, source TEXT) +---------+--------------------+---------+---------------------+--------+ | user_id | experiment_id | variant | exposure_ts | source | +---------+--------------------+---------+---------------------+--------+ | 1 | tracker_vlaunch_v1 | control | 2025-08-12 10:00 | web | | 2 | tracker_vlaunch_v1 | B | 2025-08-13 09:35 | email | | 2 | tracker_vlaunch_v1 | control | 2025-08-14 10:18 | web | <- crossover | 3 | tracker_vlaunch_v1 | B | 2025-08-11 08:02 | paid | | 4 | tracker_vlaunch_v1 | B | 2025-08-17 15:10 | web | +---------+--------------------+---------+---------------------+--------+ sessions(user_id INT, session_start TIMESTAMP, device TEXT, is_bot BOOLEAN) +---------+---------------------+---------+--------+ | user_id | session_start | device | is_bot | +---------+---------------------+---------+--------+ | 1 | 2025-08-12 10:00 | ios | false | | 2 | 2025-08-13 09:30 | web | false | | 2 | 2025-08-14 10:10 | web | true | | 3 | 2025-08-11 08:00 | android | false | | 4 | 2025-08-17 15:00 | web | false | +---------+---------------------+---------+--------+ orders(order_id INT, user_id INT, order_ts TIMESTAMP, revenue NUMERIC(10,2), refunded BOOLEAN) +----------+---------+---------------------+---------+----------+ | order_id | user_id | order_ts | revenue | refunded | +----------+---------+---------------------+---------+----------+ | 501 | 1 | 2025-08-20 12:00 | 99.00 | false | | 502 | 2 | 2025-08-28 11:05 | 59.00 | true | | 503 | 3 | 2025-08-13 09:00 | 49.00 | false | | 504 | 4 | 2025-08-30 13:00 | 129.00 | false | +----------+---------+---------------------+---------+----------+ Task A — core metrics (CTEs allowed): For experiment_id = 'tracker_vlaunch_v1' and exposures between 2025-08-01 and 2025-08-24, write a single query that: - Assigns each user to the earliest exposure variant and censors any later crossovers. - Qualifies users as non-bot if they have at least one session with is_bot = false on or before their earliest exposure. - Computes, per variant, by country in {US, CA} only: users_exposed, qualified_users, converters_14d (users with an order within 14 days of earliest exposure and order_ts < '2025-09-01'), conv_rate_14d, revenue_14d (sum of non-refunded revenue within the same 14-day window), refund_rate_14d (refunded orders / all orders in window). - Excludes orders whose order_ts falls in the iOS outage window ['2025-08-10', '2025-08-12') if the user’s earliest qualifying device was ios. Task B — SRM diagnostic: Write a second query that tests for sample ratio mismatch across variants among qualified users using a chi-square goodness-of-fit test vs. expected 50/50. Output observed_count_control, observed_count_B, expected_each, chi2_stat, and an approximate p_value (you may use built-in ln(), exp(), and power() but no UDFs). Edge cases to handle: users with no sessions; users with only bot sessions; users with orders before exposure; multiple orders within 14 days; users outside US/CA; and the crossover illustrated for user_id=2.

Quick Answer: This question evaluates SQL-based data manipulation and experimental analytics skills, including attribution of earliest exposures, censoring crossovers, session-based bot qualification, time-windowed conversion and revenue calculations, and exclusion of outage periods in PostgreSQL.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Compute early signals for user retention - Chime (easy)
  • Write rolling-window SQL over weekly cohorts - Chime (Medium)
  • Analyze Acquisition Channels for User Value and Retention - Chime (Medium)