PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Data Manipulation (SQL/Python)/Cvs

Compute specialty spend share and top age band

Last updated: Mar 29, 2026

Quick Overview

This question evaluates data manipulation and aggregation skills in SQL and Pandas, including joins/unions across disjoint member tables, computation of specialty spend shares, identification of top age bands by claim counts with tie handling, year-based date filtering, and categorical recoding with monthly grouping.

  • easy
  • Cvs
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compute specialty spend share and top age band

Company: Cvs

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: easy

Interview Round: Technical Screen

You are given healthcare claims data split across member tables. ## Tables Assume the following schemas (types may be adapted to your SQL dialect): ### `mem1` - `member_id` INT PRIMARY KEY - `age_band` VARCHAR ### `mem2` - `member_id` INT PRIMARY KEY - `age_band` VARCHAR ### `claim` - `claim_id` INT PRIMARY KEY - `member_id` INT - `specialty` VARCHAR - `paid_amt` DECIMAL(12,2) - `claim_date` DATE ### `risk` (provided but not necessarily needed) - `member_id` INT - `risk_score` DECIMAL(10,4) Assume `claim.member_id` joins to either `mem1.member_id` or `mem2.member_id`. Assume `claim_date` is in UTC and “year = 2017” means calendar year 2017. --- ## Part A (SQL) For each `specialty`, compute: 1) `total_paid_amt` = total paid amount for that specialty 2) `paid_share` = that specialty’s share of total paid amount across all claims **Output columns:** `specialty`, `total_paid_amt`, `paid_share`. --- ## Part B (SQL) Considering only claims in calendar year 2017: 1) Union `mem1` and `mem2` into a single member set. 2) Join members to claims. 3) Compute claim counts per `age_band`. 4) Return the age band(s) with the maximum number of claims (keep ties). **Output columns:** `age_band`, `claim_cnt`. --- ## Part C (Python / Pandas) You are given a Pandas DataFrame `df_claims` with columns: - `paid_amt` (numeric) - `gender` (values are `'M'` or `'F'`) - `claim_timestamp` (string or datetime-like) Tasks: 1) Recode `gender` from `M/F` to `male/female`. 2) Compute total `paid_amt` for **each month in 2017**, grouped by `gender` and month. **Output:** a DataFrame with columns like `gender`, `month` (1–12), `total_paid_amt` (and optionally `year` if you keep it).

Quick Answer: This question evaluates data manipulation and aggregation skills in SQL and Pandas, including joins/unions across disjoint member tables, computation of specialty spend shares, identification of top age bands by claim counts with tie handling, year-based date filtering, and categorical recoding with monthly grouping.

Cvs logo
Cvs
Oct 17, 2025, 12:00 AM
Data Scientist
Technical Screen
Data Manipulation (SQL/Python)
6
0

You are given healthcare claims data split across member tables.

Tables

Assume the following schemas (types may be adapted to your SQL dialect):

mem1

  • member_id INT PRIMARY KEY
  • age_band VARCHAR

mem2

  • member_id INT PRIMARY KEY
  • age_band VARCHAR

claim

  • claim_id INT PRIMARY KEY
  • member_id INT
  • specialty VARCHAR
  • paid_amt DECIMAL(12,2)
  • claim_date DATE

risk (provided but not necessarily needed)

  • member_id INT
  • risk_score DECIMAL(10,4)

Assume claim.member_id joins to either mem1.member_id or mem2.member_id. Assume claim_date is in UTC and “year = 2017” means calendar year 2017.

Part A (SQL)

For each specialty, compute:

  1. total_paid_amt = total paid amount for that specialty
  2. paid_share = that specialty’s share of total paid amount across all claims

Output columns: specialty, total_paid_amt, paid_share.

Part B (SQL)

Considering only claims in calendar year 2017:

  1. Union mem1 and mem2 into a single member set.
  2. Join members to claims.
  3. Compute claim counts per age_band .
  4. Return the age band(s) with the maximum number of claims (keep ties).

Output columns: age_band, claim_cnt.

Part C (Python / Pandas)

You are given a Pandas DataFrame df_claims with columns:

  • paid_amt (numeric)
  • gender (values are 'M' or 'F' )
  • claim_timestamp (string or datetime-like)

Tasks:

  1. Recode gender from M/F to male/female .
  2. Compute total paid_amt for each month in 2017 , grouped by gender and month.

Output: a DataFrame with columns like gender, month (1–12), total_paid_amt (and optionally year if you keep it).

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More Cvs•More Data Scientist•Cvs Data Scientist•Cvs Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.