PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches
|Home/Data Manipulation (SQL/Python)/Airbnb

Compute C/T metrics from bookings and visits

Last updated: Mar 29, 2026

Quick Overview

This question evaluates time-window attribution, visit deduplication, control vs treatment metric computation, aggregation of conversions and revenue, and statistical interval estimation (Wilson CIs), testing practical data manipulation skills with SQL or Python in the Data Manipulation (SQL/Python) domain.

  • Medium
  • Airbnb
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compute C/T metrics from bookings and visits

Company: Airbnb

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Given two tables, compute control vs treatment (C/T) metrics, apply 24‑hour attribution, and generate a daily plot. Treat “today” as 2025‑09‑01; use the last 7 days window 2025‑08‑26 through 2025‑09‑01 (inclusive). Provide either SQL or Python, and explain any assumptions. Schema: - visits(visit_id STRING, user_id STRING, visit_ts TIMESTAMP UTC, market_id STRING, channel STRING, variant CHAR(1) in {"C","T"}) - bookings(booking_id STRING, user_id STRING, booking_ts TIMESTAMP UTC, status STRING in {"confirmed","cancelled"}, revenue_usd DECIMAL(10,2)) Attribution rule: attribute each booking to the most recent visit by the same user within the prior 24 hours; otherwise leave unattributed. Deduplicate visits within 60 seconds per user_id by keeping only the latest. Exclude cancelled bookings from conversion and revenue; keep a diagnostic count of cancellations. Ignore bookings without an attributed visit for conversion, but report their count separately. Required outputs for 2025‑08‑26..2025‑09‑01 by variant and by day: - visits, unique_users, attributed_bookings (confirmed), conversion_rate = attributed_bookings / visits, total_revenue_usd (confirmed only), revenue_per_visit, cancellations, unattributed_confirmed_bookings. - A line plot of daily conversion_rate with 95% Wilson CIs for C and T. Edge cases to handle: a user with C and T visits in the window; multiple bookings after one visit; booking outside the 24‑hour window; duplicate visits within 60 seconds. Small sample data (ASCII) to clarify shapes (not exhaustive): visits +----------+---------+---------------------+-----------+---------+---------+ | visit_id | user_id | visit_ts | market_id | channel | variant | +----------+---------+---------------------+-----------+---------+---------+ | v1 | u101 | 2025-08-26 09:05:00 | NYC | seo | C | | v2 | u101 | 2025-08-26 20:10:00 | NYC | direct | T | | v3 | u102 | 2025-08-27 10:00:00 | SFO | paid | C | | v4 | u103 | 2025-08-27 23:50:00 | NYC | email | T | | v5 | u104 | 2025-08-28 00:10:00 | NYC | seo | T | | v6 | u105 | 2025-08-28 12:40:00 | BOS | seo | C | | v7 | u106 | 2025-08-30 21:55:00 | NYC | direct | T | | v8 | u106 | 2025-08-30 21:56:00 | NYC | direct | T | | v9 | u107 | 2025-09-01 09:30:00 | BOS | paid | C | +----------+---------+---------------------+-----------+---------+---------+ bookings +-------------+---------+---------------------+-----------+-------------+ | booking_id | user_id | booking_ts | status | revenue_usd | +-------------+---------+---------------------+-----------+-------------+ | b1 | u101 | 2025-08-26 21:00:00 | confirmed | 220.00 | | b2 | u102 | 2025-08-27 12:00:00 | confirmed | 150.00 | | b3 | u103 | 2025-08-28 00:30:00 | confirmed | 300.00 | | b4 | u104 | 2025-08-29 04:00:00 | confirmed | 180.00 | | b5 | u106 | 2025-08-31 22:10:00 | confirmed | 250.00 | | b6 | u105 | 2025-08-30 13:00:00 | cancelled | 200.00 | | b7 | u108 | 2025-08-29 14:00:00 | confirmed | 120.00 | +-------------+---------+---------------------+-----------+-------------+ Tasks: A) Recompute attribution from visits using the 24‑hour rule and deduped visits; produce the required metrics by variant and by day. B) Output a tidy table with one row per (date, variant) and the metrics above. C) Produce the described plot.

Quick Answer: This question evaluates time-window attribution, visit deduplication, control vs treatment metric computation, aggregation of conversions and revenue, and statistical interval estimation (Wilson CIs), testing practical data manipulation skills with SQL or Python in the Data Manipulation (SQL/Python) domain.

Related Interview Questions

  • Compute browsing metrics in Python from logs - Airbnb (Medium)
  • Build panel in SQL; run causal regression - Airbnb (Medium)
  • Review a geospatial Python module - Airbnb (Medium)
  • Aggregate User Activity, Fit Regression, Interpret Coefficients - Airbnb (Medium)
Airbnb logo
Airbnb
Oct 13, 2025, 9:49 PM
Data Scientist
Technical Screen
Data Manipulation (SQL/Python)
10
0

Given two tables, compute control vs treatment (C/T) metrics, apply 24‑hour attribution, and generate a daily plot. Treat “today” as 2025‑09‑01; use the last 7 days window 2025‑08‑26 through 2025‑09‑01 (inclusive). Provide either SQL or Python, and explain any assumptions. Schema:

  • visits(visit_id STRING, user_id STRING, visit_ts TIMESTAMP UTC, market_id STRING, channel STRING, variant CHAR(1) in {"C","T"})
  • bookings(booking_id STRING, user_id STRING, booking_ts TIMESTAMP UTC, status STRING in {"confirmed","cancelled"}, revenue_usd DECIMAL(10,2)) Attribution rule: attribute each booking to the most recent visit by the same user within the prior 24 hours; otherwise leave unattributed. Deduplicate visits within 60 seconds per user_id by keeping only the latest. Exclude cancelled bookings from conversion and revenue; keep a diagnostic count of cancellations. Ignore bookings without an attributed visit for conversion, but report their count separately. Required outputs for 2025‑08‑26..2025‑09‑01 by variant and by day:
  • visits, unique_users, attributed_bookings (confirmed), conversion_rate = attributed_bookings / visits, total_revenue_usd (confirmed only), revenue_per_visit, cancellations, unattributed_confirmed_bookings.
  • A line plot of daily conversion_rate with 95% Wilson CIs for C and T. Edge cases to handle: a user with C and T visits in the window; multiple bookings after one visit; booking outside the 24‑hour window; duplicate visits within 60 seconds. Small sample data (ASCII) to clarify shapes (not exhaustive): visits +----------+---------+---------------------+-----------+---------+---------+ | visit_id | user_id | visit_ts | market_id | channel | variant | +----------+---------+---------------------+-----------+---------+---------+ | v1 | u101 | 2025-08-26 09:05:00 | NYC | seo | C | | v2 | u101 | 2025-08-26 20:10:00 | NYC | direct | T | | v3 | u102 | 2025-08-27 10:00:00 | SFO | paid | C | | v4 | u103 | 2025-08-27 23:50:00 | NYC | email | T | | v5 | u104 | 2025-08-28 00:10:00 | NYC | seo | T | | v6 | u105 | 2025-08-28 12:40:00 | BOS | seo | C | | v7 | u106 | 2025-08-30 21:55:00 | NYC | direct | T | | v8 | u106 | 2025-08-30 21:56:00 | NYC | direct | T | | v9 | u107 | 2025-09-01 09:30:00 | BOS | paid | C | +----------+---------+---------------------+-----------+---------+---------+ bookings +-------------+---------+---------------------+-----------+-------------+ | booking_id | user_id | booking_ts | status | revenue_usd | +-------------+---------+---------------------+-----------+-------------+ | b1 | u101 | 2025-08-26 21:00:00 | confirmed | 220.00 | | b2 | u102 | 2025-08-27 12:00:00 | confirmed | 150.00 | | b3 | u103 | 2025-08-28 00:30:00 | confirmed | 300.00 | | b4 | u104 | 2025-08-29 04:00:00 | confirmed | 180.00 | | b5 | u106 | 2025-08-31 22:10:00 | confirmed | 250.00 | | b6 | u105 | 2025-08-30 13:00:00 | cancelled | 200.00 | | b7 | u108 | 2025-08-29 14:00:00 | confirmed | 120.00 | +-------------+---------+---------------------+-----------+-------------+ Tasks: A) Recompute attribution from visits using the 24‑hour rule and deduped visits; produce the required metrics by variant and by day. B) Output a tidy table with one row per (date, variant) and the metrics above. C) Produce the described plot.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More Airbnb•More Data Scientist•Airbnb Data Scientist•Airbnb Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.