PracHub

Count weekly customers with ≥$1000 YTD spend

Last updated: May 3, 2026

Quick Overview

This question evaluates proficiency in time-series aggregation, cumulative year-to-date computations, distinct-count deduplication, and use of SQL or pandas for data manipulation and reporting.

Company: Databricks

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: easy

Interview Round: Technical Screen


Dec 3, 2025, 12:00 AM

You are given a transaction-level table and need to compute a weekly time series.

Table

transactions

  • date (DATE) — transaction date (assume UTC and calendar dates)
  • transaction_id (STRING/INT) — primary key
  • customer_id (STRING/INT)
  • dollars (NUMERIC) — non-negative transaction amount in USD

Definitions

  • Week: calendar week starting on Monday (you may state a different week boundary, but be consistent).
  • Year-to-date (YTD) spend as of a week: for a given customer_id and a given week w within year Y, the sum of dollars from Jan 1 of year Y up to and including the last day of week w.
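
The two definitions above can be made concrete with a minimal standard-library sketch. The helper names `week_start` and `ytd_spend` and the `(date, customer_id, dollars)` tuple layout are illustrative assumptions, not part of the question. The sketch also assumes that year Y is the calendar year of the week's Monday, so a week that straddles New Year is attributed to the earlier year; the question leaves this boundary case open, and another consistent choice would be equally defensible.

```python
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Return the Monday of the calendar week containing d."""
    # date.weekday() is 0 for Monday and 6 for Sunday.
    return d - timedelta(days=d.weekday())


def ytd_spend(rows, customer_id, wk_start: date) -> float:
    """Sum one customer's dollars from Jan 1 of the week's year through the week's Sunday.

    rows is an iterable of (date, customer_id, dollars) tuples (assumed layout).
    ASSUMPTION: year Y is taken from the week's Monday.
    """
    year_start = date(wk_start.year, 1, 1)
    week_end = wk_start + timedelta(days=6)
    return sum(
        amount
        for d, cust, amount in rows
        if cust == customer_id and year_start <= d <= week_end
    )
```

For example, a transaction dated Wednesday 2025-01-08 falls in the week starting Monday 2025-01-06, and its amount counts toward that week's YTD total and every later week's total in 2025.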

Task

For each week present in the data, compute the number of distinct customers whose YTD spend as of that week is at least $1000.
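
One possible pandas approach to this task is sketched below; it is not an official solution. The function name `weekly_customers_ge_1000_ytd` and the `threshold` parameter are hypothetical. It assumes weeks start on Monday, that "weeks present in the data" means weeks containing at least one transaction, and that year Y is the calendar year of the week's Monday. It recomputes the YTD window for each week, which is easy to verify against the definition but scans the table once per week; a cumulative sum over a week-by-customer grid would scale better on large data.

```python
import pandas as pd


def weekly_customers_ge_1000_ytd(transactions: pd.DataFrame,
                                 threshold: float = 1000.0) -> pd.DataFrame:
    """Count, per week present in the data, customers whose YTD spend >= threshold."""
    df = transactions.copy()
    df["date"] = pd.to_datetime(df["date"]).dt.normalize()
    # Monday of the week containing each transaction (weekday: Monday == 0).
    df["week_start"] = df["date"] - pd.to_timedelta(df["date"].dt.weekday, unit="D")

    records = []
    for w in sorted(df["week_start"].drop_duplicates()):
        # ASSUMPTION: year Y is the year of the week's Monday.
        year_start = pd.Timestamp(year=w.year, month=1, day=1)
        week_end = w + pd.Timedelta(days=6)
        in_window = (df["date"] >= year_start) & (df["date"] <= week_end)
        # YTD total per customer as of this week. Customers with no purchase
        # in this particular week are still included via earlier transactions.
        totals = df.loc[in_window].groupby("customer_id")["dollars"].sum()
        records.append((w.date(), int((totals >= threshold).sum())))

    return pd.DataFrame(records, columns=["week_start", "num_customers_ge_1000_ytd"])
```

Grouping by `customer_id` before applying the threshold is what keeps the count distinct: each customer contributes at most one row to `totals`, regardless of how many transactions they have.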

Output

Return a table with:

  • week_start (DATE)
  • num_customers_ge_1000_ytd (INT)

You may solve this in SQL or Python (pandas).
