This question evaluates proficiency with SQL window functions, including running and cumulative sums, LAG/forward-fill event logic, ranking functions (RANK/DENSE_RANK/ROW_NUMBER), tie-breaking strategies, and temporal date arithmetic for payments and subscription events.
You must use only window functions (no JOINs). CTEs are allowed. Given the schemas and tiny samples below, write SQL for each sub-question and explain any tie-breaking. Schema 1: payments(loan_id INT, payment_date DATE, payment_amount DECIMAL(10,2), loan_amount DECIMAL(10,2)) Sample: +---------+--------------+----------------+-------------+ | loan_id | payment_date | payment_amount | loan_amount | +---------+--------------+----------------+-------------+ | 1 | 2025-08-01 | 200.00 | 1000.00 | | 1 | 2025-08-15 | 300.00 | 1000.00 | | 1 | 2025-09-01 | 500.00 | 1000.00 | | 2 | 2025-08-02 | 100.00 | 600.00 | | 2 | 2025-09-01 | 100.00 | 600.00 | | 2 | 2025-09-05 | 200.00 | 600.00 | | 2 | 2025-09-12 | 200.00 | 600.00 | +---------+--------------+----------------+-------------+ Schema 2: subscription_events(account_id INT, event_date DATE, event_type STRING, mrr INT) where event_type ∈ ('start','upgrade','downgrade','churn','reactivate') Sample: +------------+------------+------------+-----+ | account_id | event_date | event_type | mrr | +------------+------------+------------+-----+ | 101 | 2025-07-01 | start | 50 | | 101 | 2025-08-10 | upgrade | 100 | | 101 | 2025-08-25 | churn | 0 | | 101 | 2025-08-28 | reactivate | 80 | | 102 | 2025-06-15 | start | 30 | | 102 | 2025-08-31 | churn | 0 | | 103 | 2025-08-20 | start | 20 | +------------+------------+------------+-----+ Tasks: (a) payments: For each loan_id, return the first payment_date at which the running sum of payment_amount reaches or exceeds 80% of that loan's principal, i.e., 0.8*MAX(loan_amount) OVER (PARTITION BY loan_id). Output: loan_id, threshold_date. Break ties by earliest payment_date, then smallest ROW_NUMBER over (loan_id ORDER BY payment_date, payment_amount DESC). (b) payments: Detect gaps longer than 1 calendar month between consecutive payments per loan. Output: loan_id, gap_start_date, gap_end_date, gap_months (integer months between dates). Hint: use LAG(payment_date) OVER (PARTITION BY loan_id ORDER BY payment_date). (c) payments: For date = '2025-09-01', compute each loan_id's cumulative paid-through amount up to and including that date, then within that date rank loans by cumulative amount using RANK(), DENSE_RANK(), and ROW_NUMBER() to illustrate differences when two loans tie. Output: loan_id, cum_paid_through_2025_09_01, rank_r, rank_dense, rownum. (d) subscription_events: As of today (2025-09-01), compute each account's current MRR, lifecycle_status ('active','churned','reactivated_last_7_days'), and the date of the last status-changing event. A reactivated account is one with a 'reactivate' in [2025-08-25, 2025-09-01]. Use only window functions to forward-fill the latest event ≤ 2025-09-01 and derive current MRR.