PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates proficiency in SQL joins, aggregation, and window functions for ranked group-level analytics, specifically testing the ability to compute partitioned ranks, handle ties, and apply time-based filters in a data engineering context.

  • medium
  • Capital One
  • Coding & Algorithms
  • Data Engineer

Write SQL using joins and window functions

Company: Capital One

Role: Data Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

## SQL Coding: Rank within Groups with Joins You are given two tables: ### `accounts` - `account_id` (string) - `customer_id` (string) - `segment` (string) — e.g., `"prime"`, `"subprime"` ### `transactions` - `transaction_id` (string) - `account_id` (string) - `amount` (numeric) - `txn_ts` (timestamp) ### Task For each `segment`, find the **top 3 customers by total transaction amount** in the last **30 days** (inclusive). Requirements: - A customer may have **multiple accounts**; include all their accounts’ transactions. - Use appropriate **JOINs** to connect transactions to segments. - Use a **window function** to compute the rank **within each segment**. - If there are ties at rank 3, include **all** tied customers. ### Output Return columns: - `segment` - `customer_id` - `total_amount_30d` - `rank_in_segment` Order results by `segment`, then `rank_in_segment`, then `customer_id`.

Quick Answer: This question evaluates proficiency in SQL joins, aggregation, and window functions for ranked group-level analytics, specifically testing the ability to compute partitioned ranks, handle ties, and apply time-based filters in a data engineering context.

You are given two in-memory tables: accounts and transactions. Each account belongs to a customer and a segment. Each transaction belongs to an account. For each segment, find the customers with the top 3 SQL-style RANK values by total transaction amount in the last 30 days, inclusive of both the cutoff timestamp and current timestamp. A customer may have multiple accounts, and all qualifying transactions from all of their accounts in the same segment must be included. Only transactions whose account_id exists in accounts should be considered. Return results ordered by segment, then rank_in_segment, then customer_id. Ranking must behave like SQL RANK(): tied totals receive the same rank, and the next rank skips accordingly. Include all customers whose rank is at most 3, including ties.

Constraints

  • 0 <= len(accounts) <= 100000
  • 0 <= len(transactions) <= 100000
  • account_id values in accounts are unique
  • txn_ts and current_ts are ISO timestamp strings without time zones
  • amount is numeric and may be positive, zero, or negative
  • Only customers with at least one qualifying transaction in the last 30 days are returned

Examples

Input: ([{'account_id': 'a1', 'customer_id': 'c1', 'segment': 'prime'}, {'account_id': 'a2', 'customer_id': 'c1', 'segment': 'prime'}, {'account_id': 'a3', 'customer_id': 'c2', 'segment': 'prime'}, {'account_id': 'a4', 'customer_id': 'c3', 'segment': 'prime'}, {'account_id': 'a5', 'customer_id': 'c4', 'segment': 'prime'}, {'account_id': 'a6', 'customer_id': 'c5', 'segment': 'subprime'}], [{'transaction_id': 't1', 'account_id': 'a1', 'amount': 100, 'txn_ts': '2025-01-10 00:00:00'}, {'transaction_id': 't2', 'account_id': 'a2', 'amount': 50, 'txn_ts': '2025-01-20 00:00:00'}, {'transaction_id': 't3', 'account_id': 'a3', 'amount': 200, 'txn_ts': '2025-01-15 00:00:00'}, {'transaction_id': 't4', 'account_id': 'a4', 'amount': 75, 'txn_ts': '2025-01-30 00:00:00'}, {'transaction_id': 't5', 'account_id': 'a4', 'amount': 25, 'txn_ts': '2024-12-31 23:59:59'}, {'transaction_id': 't6', 'account_id': 'a5', 'amount': 75, 'txn_ts': '2025-01-05 00:00:00'}, {'transaction_id': 't7', 'account_id': 'a6', 'amount': 300, 'txn_ts': '2025-01-02 00:00:00'}], '2025-01-31 00:00:00')

Expected Output: [('prime', 'c2', 200, 1), ('prime', 'c1', 150, 2), ('prime', 'c3', 75, 3), ('prime', 'c4', 75, 3), ('subprime', 'c5', 300, 1)]

Explanation: Customer c1 has two prime accounts totaling 150. Customers c3 and c4 tie at rank 3 in prime, so both are included. The old transaction on 2024-12-31 is outside the 30-day window.

Input: ([{'account_id': 'a1', 'customer_id': 'c1', 'segment': 'prime'}, {'account_id': 'a2', 'customer_id': 'c2', 'segment': 'prime'}, {'account_id': 'a3', 'customer_id': 'c3', 'segment': 'prime'}, {'account_id': 'a4', 'customer_id': 'c4', 'segment': 'prime'}], [{'transaction_id': 't1', 'account_id': 'a1', 'amount': 10, 'txn_ts': '2025-01-02 12:00:00'}, {'transaction_id': 't2', 'account_id': 'a2', 'amount': 20, 'txn_ts': '2025-02-01 12:00:00'}, {'transaction_id': 't3', 'account_id': 'a3', 'amount': 100, 'txn_ts': '2025-01-02 11:59:59'}, {'transaction_id': 't4', 'account_id': 'a4', 'amount': 1000, 'txn_ts': '2025-02-01 12:00:01'}, {'transaction_id': 't5', 'account_id': 'missing', 'amount': 999, 'txn_ts': '2025-02-01 12:00:00'}], '2025-02-01 12:00:00')

Expected Output: [('prime', 'c2', 20, 1), ('prime', 'c1', 10, 2)]

Explanation: The transaction exactly 30 days before current_ts and the transaction exactly at current_ts are included. The older transaction, future transaction, and transaction with no matching account are ignored.

Input: ([{'account_id': 'a1', 'customer_id': 'c1', 'segment': 'prime'}], [{'transaction_id': 't1', 'account_id': 'a1', 'amount': 50, 'txn_ts': '2024-12-01 00:00:00'}], '2025-01-31 00:00:00')

Expected Output: []

Explanation: There are no transactions within the inclusive 30-day window, so no customers are returned.

Input: ([{'account_id': 'a1', 'customer_id': 'c1', 'segment': 'prime'}, {'account_id': 'a2', 'customer_id': 'c2', 'segment': 'prime'}, {'account_id': 'a3', 'customer_id': 'c3', 'segment': 'prime'}, {'account_id': 'a4', 'customer_id': 'c4', 'segment': 'prime'}], [{'transaction_id': 't1', 'account_id': 'a1', 'amount': 100, 'txn_ts': '2025-01-20 00:00:00'}, {'transaction_id': 't2', 'account_id': 'a2', 'amount': 100, 'txn_ts': '2025-01-20 00:00:00'}, {'transaction_id': 't3', 'account_id': 'a3', 'amount': 50, 'txn_ts': '2025-01-20 00:00:00'}, {'transaction_id': 't4', 'account_id': 'a4', 'amount': 40, 'txn_ts': '2025-01-20 00:00:00'}], '2025-01-31 00:00:00')

Expected Output: [('prime', 'c1', 100, 1), ('prime', 'c2', 100, 1), ('prime', 'c3', 50, 3)]

Explanation: Customers c1 and c2 tie for rank 1. SQL RANK skips rank 2, so c3 has rank 3 and c4 has rank 4. Only ranks at most 3 are returned.

Hints

  1. Build a mapping from account_id to its customer_id and segment before processing transactions.
  2. After aggregating totals by (segment, customer_id), sort each segment by total descending and assign SQL-style RANK values.
Last updated: Jun 17, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Solve Four Coding Assessment Tasks - Capital One (medium)
  • Review Preprocessing Code and Tests - Capital One (easy)
  • Remove nodes with a given value - Capital One (medium)
  • Solve multiple algorithmic interview questions - Capital One (hard)
  • Place Pieces on a Grid - Capital One (medium)