Explain Joins and Write Coupon SQL
Company: eBay
Role: Data Analyst
Category: Data Manipulation (SQL/Python)
Difficulty: medium
Interview Round: HR Screen
You are interviewing for a data analyst internship. Answer the SQL fundamentals questions and then write SQL for a coupon campaign analysis.
### Part A: SQL fundamentals
1. Explain the logical execution order of a SQL query containing `SELECT`, aggregate functions, `FROM`, `JOIN`, `WHERE`, `GROUP BY`, `HAVING`, and `ORDER BY`.
2. Explain the difference between `INNER JOIN`, `LEFT JOIN`, and `RIGHT JOIN`, including how unmatched rows are handled.
### Part B: Coupon campaign SQL
Assume all timestamps are stored in UTC and should be analyzed in the `Asia/Shanghai` timezone. Only completed transactions count toward conversion and GMB.
#### Table: `user_coupon_assignments`
Each row represents one user assigned to a coupon campaign or control group.
| Column | Type | Description |
|---|---:|---|
| `user_id` | BIGINT | Unique user identifier |
| `campaign_id` | BIGINT | Coupon campaign identifier |
| `assignment_ts` | TIMESTAMP | Timestamp when the user was assigned to treatment or control |
| `coupon_sent_ts` | TIMESTAMP | Timestamp when the coupon was sent; `NULL` for control users |
| `experiment_group` | VARCHAR | Either `treatment` or `control` |
| `user_segment` | VARCHAR | User segment, such as new seller, active seller, or dormant seller |
Primary key: (`user_id`, `campaign_id`).
#### Table: `transactions`
Each row represents one transaction.
| Column | Type | Description |
|---|---:|---|
| `transaction_id` | BIGINT | Unique transaction identifier |
| `user_id` | BIGINT | Buyer or seller user identifier, matching `user_coupon_assignments.user_id` |
| `transaction_ts` | TIMESTAMP | Transaction creation timestamp |
| `order_status` | VARCHAR | Transaction status, such as `completed`, `cancelled`, or `refunded` |
| `gmb_amount` | DECIMAL(18,2) | Gross merchandise bought amount for the transaction |
Relationship: `transactions.user_id` joins to `user_coupon_assignments.user_id`.
### Required queries
1. For each treatment user who was sent a coupon, classify the user based on completed transactions in the 30 days after `coupon_sent_ts`:
- `active`: at least one completed transaction in the 30-day window.
- `inactive`: no completed transactions in the 30-day window.
Required output columns: `campaign_id`, `user_id`, `coupon_sent_date`, `transaction_count_30d`, `gmb_30d`, `activity_status`.
2. Calculate daily GMB by coupon send date for treatment users. GMB is the sum of `gmb_amount` from completed transactions in the 30 days after the coupon was sent.
Required output columns: `coupon_sent_date`, `sent_user_count`, `converted_user_count`, `transaction_count_30d`, `gmb_30d`. Sort by `coupon_sent_date` ascending.
3. Calculate the 30-day conversion rate separately for treatment and control groups. For treatment users, the 30-day window starts at `coupon_sent_ts`. For control users, the comparable 30-day window starts at `assignment_ts`. A converted user is a user with at least one completed transaction in the relevant 30-day window.
Required output columns: `experiment_group`, `assigned_user_count`, `converted_user_count`, `conversion_rate`.
Quick Answer: This question evaluates proficiency in SQL fundamentals—logical execution order of query clauses, join semantics (INNER/LEFT/RIGHT), timezone-aware timestamp handling, aggregation, grouping and filtering—alongside practical competency in cohort and campaign analytics such as conversion rate, transaction counts, and gross merchandise bought (GMB).