PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches
|Home/Data Manipulation (SQL/Python)/Meta

Compute cohort GMV and payer rate with edge cases

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a data scientist's competency in cohort-based GMV calculation, payer-rate measurement, and handling common edge cases across events, orders, payments, and refunds.

  • Medium
  • Meta
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compute cohort GMV and payer rate with edge cases

Company: Meta

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

You are given the following schema (timestamps are UTC): users(user_id INT, country STRING, created_at TIMESTAMP) events(user_id INT, event_ts TIMESTAMP, event_type STRING) -- only app_open rows are relevant orders(order_id INT, user_id INT, order_ts TIMESTAMP, status STRING) -- status in ('placed','canceled') payments(payment_id INT, order_id INT, amount DECIMAL(10,2), payment_ts TIMESTAMP, is_valid BOOL) refunds(refund_id INT, order_id INT, amount DECIMAL(10,2), refund_ts TIMESTAMP) Small sample data: users +---------+---------+---------------------+ | user_id | country | created_at | +---------+---------+---------------------+ | 1 | US | 2025-06-20 00:00:00 | | 2 | US | 2025-07-10 00:00:00 | | 3 | IN | 2025-08-02 00:00:00 | | 4 | US | 2025-08-25 00:00:00 | | 5 | US | 2025-07-28 00:00:00 | | 6 | BR | 2025-08-15 00:00:00 | +---------+---------+---------------------+ events (only app_open shown) +---------+---------------------+-----------+ | user_id | event_ts | event_type| +---------+---------------------+-----------+ | 1 | 2025-08-05 10:00:00 | app_open | | 2 | 2025-08-10 09:00:00 | app_open | | 2 | 2025-08-28 13:00:00 | app_open | | 3 | 2025-08-03 08:30:00 | app_open | | 4 | 2025-08-30 20:10:00 | app_open | | 5 | 2025-08-12 12:00:00 | app_open | | 6 | 2025-08-31 23:55:00 | app_open | +---------+---------------------+-----------+ orders +----------+---------+---------------------+----------+ | order_id | user_id | order_ts | status | +----------+---------+---------------------+----------+ | 101 | 1 | 2025-08-05 10:05:00 | placed | | 102 | 2 | 2025-08-10 09:05:00 | placed | | 103 | 2 | 2025-08-28 13:05:00 | canceled | | 104 | 3 | 2025-08-03 08:35:00 | placed | | 105 | 4 | 2025-08-30 20:15:00 | placed | | 106 | 5 | 2025-08-12 12:05:00 | placed | | 107 | 6 | 2025-08-31 23:58:00 | placed | +----------+---------+---------------------+----------+ payments +------------+----------+--------+---------------------+----------+ | payment_id | order_id | amount | payment_ts | is_valid | +------------+----------+--------+---------------------+----------+ | 201 | 101 | 50.00 | 2025-08-05 10:06:00 | 1 | | 202 | 102 | 20.00 | 2025-08-10 09:06:00 | 1 | | 203 | 103 | 15.00 | 2025-08-28 13:06:00 | 1 | | 204 | 103 | 15.00 | 2025-08-28 13:06:00 | 0 | -- duplicate/invalid | 205 | 104 | 30.00 | 2025-09-01 00:01:00 | 1 | -- late payment (Sep) | 206 | 105 | 100.00 | 2025-08-30 20:16:00 | 1 | | 207 | 106 | 40.00 | 2025-08-12 12:06:00 | 1 | | 208 | 107 | 25.00 | 2025-09-01 00:10:00 | 1 | -- late payment (Sep) +------------+----------+--------+---------------------+----------+ refunds +-----------+----------+--------+---------------------+ | refund_id | order_id | amount | refund_ts | +-----------+----------+--------+---------------------+ | 301 | 103 | 15.00 | 2025-08-29 10:00:00 | | 302 | 105 | 100.00 | 2025-09-02 09:00:00 | -- refund after month-end | 303 | 106 | 10.00 | 2025-08-20 12:00:00 | +-----------+----------+--------+---------------------+ Task (one Standard SQL query): For calendar month 2025-08, output one row per signup cohort month (cohort_month = DATE_TRUNC(created_at, MONTH)) with: cohort_month, active_users_aug, payers_aug, payer_rate_aug, gmv_aug_usd. Rules and edge cases to handle precisely: - Active users: distinct users in the cohort with at least one events.event_type = 'app_open' during 2025-08-01 to 2025-08-31 inclusive. - GMV (gmv_aug_usd): sum of payments.amount where payments.is_valid = 1 and payment_ts in 2025-08, minus sum of refunds.amount where refund_ts in 2025-08, regardless of order status or order_ts. Do not include invalid payments (is_valid=0). GMV may be negative. - Payers (payers_aug): among active users in the cohort, count distinct users whose net_august_amount = (valid August payments tied to their orders) minus (August refunds tied to their orders) is strictly > 0. A canceled order with payment and same-month full refund should not count as a payer. - Denominator zero: if active_users_aug = 0, return payer_rate_aug = NULL (not 0). - Ignore rows with NULL user_id anywhere. - Late/early timing: payments/refunds outside August must not affect August metrics, even if the related order_ts is in August. Return the exact SQL that produces the specified output.

Quick Answer: This question evaluates a data scientist's competency in cohort-based GMV calculation, payer-rate measurement, and handling common edge cases across events, orders, payments, and refunds.

Related Interview Questions

  • Compute ad impression conversion rates - Meta (medium)
  • Count unconnected posts and reactions - Meta (medium)
  • Count heavy callers in 7 days - Meta (medium)
  • Write SQL for call metrics - Meta (medium)
  • Write SQL for multi-account metrics - Meta (medium)
Meta logo
Meta
Oct 13, 2025, 9:49 PM
Data Scientist
Technical Screen
Data Manipulation (SQL/Python)
7
0

You are given the following schema (timestamps are UTC):

users(user_id INT, country STRING, created_at TIMESTAMP) events(user_id INT, event_ts TIMESTAMP, event_type STRING) -- only app_open rows are relevant orders(order_id INT, user_id INT, order_ts TIMESTAMP, status STRING) -- status in ('placed','canceled') payments(payment_id INT, order_id INT, amount DECIMAL(10,2), payment_ts TIMESTAMP, is_valid BOOL) refunds(refund_id INT, order_id INT, amount DECIMAL(10,2), refund_ts TIMESTAMP)

Small sample data:

users +---------+---------+---------------------+ | user_id | country | created_at | +---------+---------+---------------------+ | 1 | US | 2025-06-20 00:00:00 | | 2 | US | 2025-07-10 00:00:00 | | 3 | IN | 2025-08-02 00:00:00 | | 4 | US | 2025-08-25 00:00:00 | | 5 | US | 2025-07-28 00:00:00 | | 6 | BR | 2025-08-15 00:00:00 | +---------+---------+---------------------+

events (only app_open shown) +---------+---------------------+-----------+ | user_id | event_ts | event_type| +---------+---------------------+-----------+ | 1 | 2025-08-05 10:00:00 | app_open | | 2 | 2025-08-10 09:00:00 | app_open | | 2 | 2025-08-28 13:00:00 | app_open | | 3 | 2025-08-03 08:30:00 | app_open | | 4 | 2025-08-30 20:10:00 | app_open | | 5 | 2025-08-12 12:00:00 | app_open | | 6 | 2025-08-31 23:55:00 | app_open | +---------+---------------------+-----------+

orders +----------+---------+---------------------+----------+ | order_id | user_id | order_ts | status | +----------+---------+---------------------+----------+ | 101 | 1 | 2025-08-05 10:05:00 | placed | | 102 | 2 | 2025-08-10 09:05:00 | placed | | 103 | 2 | 2025-08-28 13:05:00 | canceled | | 104 | 3 | 2025-08-03 08:35:00 | placed | | 105 | 4 | 2025-08-30 20:15:00 | placed | | 106 | 5 | 2025-08-12 12:05:00 | placed | | 107 | 6 | 2025-08-31 23:58:00 | placed | +----------+---------+---------------------+----------+

payments +------------+----------+--------+---------------------+----------+ | payment_id | order_id | amount | payment_ts | is_valid | +------------+----------+--------+---------------------+----------+ | 201 | 101 | 50.00 | 2025-08-05 10:06:00 | 1 | | 202 | 102 | 20.00 | 2025-08-10 09:06:00 | 1 | | 203 | 103 | 15.00 | 2025-08-28 13:06:00 | 1 | | 204 | 103 | 15.00 | 2025-08-28 13:06:00 | 0 | -- duplicate/invalid | 205 | 104 | 30.00 | 2025-09-01 00:01:00 | 1 | -- late payment (Sep) | 206 | 105 | 100.00 | 2025-08-30 20:16:00 | 1 | | 207 | 106 | 40.00 | 2025-08-12 12:06:00 | 1 | | 208 | 107 | 25.00 | 2025-09-01 00:10:00 | 1 | -- late payment (Sep) +------------+----------+--------+---------------------+----------+

refunds +-----------+----------+--------+---------------------+ | refund_id | order_id | amount | refund_ts | +-----------+----------+--------+---------------------+ | 301 | 103 | 15.00 | 2025-08-29 10:00:00 | | 302 | 105 | 100.00 | 2025-09-02 09:00:00 | -- refund after month-end | 303 | 106 | 10.00 | 2025-08-20 12:00:00 | +-----------+----------+--------+---------------------+

Task (one Standard SQL query): For calendar month 2025-08, output one row per signup cohort month (cohort_month = DATE_TRUNC(created_at, MONTH)) with: cohort_month, active_users_aug, payers_aug, payer_rate_aug, gmv_aug_usd. Rules and edge cases to handle precisely:

  • Active users: distinct users in the cohort with at least one events.event_type = 'app_open' during 2025-08-01 to 2025-08-31 inclusive.
  • GMV (gmv_aug_usd): sum of payments.amount where payments.is_valid = 1 and payment_ts in 2025-08, minus sum of refunds.amount where refund_ts in 2025-08, regardless of order status or order_ts. Do not include invalid payments (is_valid=0). GMV may be negative.
  • Payers (payers_aug): among active users in the cohort, count distinct users whose net_august_amount = (valid August payments tied to their orders) minus (August refunds tied to their orders) is strictly > 0. A canceled order with payment and same-month full refund should not count as a payer.
  • Denominator zero: if active_users_aug = 0, return payer_rate_aug = NULL (not 0).
  • Ignore rows with NULL user_id anywhere.
  • Late/early timing: payments/refunds outside August must not affect August metrics, even if the related order_ts is in August. Return the exact SQL that produces the specified output.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More Meta•More Data Scientist•Meta Data Scientist•Meta Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.