Compute percent of first-cancelled users who never rebook
Company: Pinterest
Role: Data Analyst
Category: Data Manipulation (SQL/Python)
Difficulty: medium
Interview Round: Technical Screen
You are interviewing for a health-tech product analytics role.
Assume the following table contains one row per appointment with its final status.
**Table: `appointments`**
- `appointment_id` (BIGINT, primary key)
- `user_id` (BIGINT, not null)
- `scheduled_start_at` (TIMESTAMP, not null) — when the appointment was scheduled to start
- `status` (VARCHAR, not null) — one of: `'cancelled'`, `'returned'`, `'confirmed'`, `'completed'`, `'no_show'`
- `status_updated_at` (TIMESTAMP, not null) — when the final `status` was set
**Definitions / assumptions**
- All timestamps are in UTC.
- A user can have multiple appointments.
- A user’s **first appointment** is the appointment with the earliest `scheduled_start_at` (break ties by smallest `appointment_id`).
- A **first-cancelled user** is a user whose first appointment has `status = 'cancelled'`.
- A user is considered to **have another future returned/confirmed appointment** if they have any appointment with `scheduled_start_at` strictly after their first appointment’s `scheduled_start_at` and with `status IN ('returned','confirmed')`.
**Task**
Write a SQL query to compute the **percentage of first-cancelled users who never have another future returned/confirmed appointment**.
**Required output (single row)**
- `first_cancelled_users` (integer)
- `never_returned_confirmed_users` (integer)
- `pct_never_returned_confirmed` (decimal between 0 and 1)
(You may use CTEs and window functions.)
Quick Answer: This question evaluates proficiency in SQL data manipulation—specifically use of window functions, CTEs, deduplication by earliest timestamp, temporal comparisons, and cohort-level aggregation—within the Data Manipulation (SQL/Python) domain for a Data Analyst role in health‑tech product analytics.