Compute specialty spend share and top age band

Q: Compute specialty spend share and top age band

This question evaluates data manipulation and aggregation skills in SQL and Pandas, including joins/unions across disjoint member tables, computation of specialty spend shares, identification of top age bands by claim counts with tie handling, year-based date filtering, and categorical recoding with monthly grouping.

Q: How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master data manipulation (sql/python) interviews.

Question

You are given healthcare claims data split across member tables.

Tables

Assume the following schemas (types may be adapted to your SQL dialect):

`mem1`

member_id INT PRIMARY KEY
age_band VARCHAR

`mem2`

member_id INT PRIMARY KEY
age_band VARCHAR

`claim`

claim_id INT PRIMARY KEY
member_id INT
specialty VARCHAR
paid_amt DECIMAL(12,2)
claim_date DATE

`risk` (provided but not necessarily needed)

member_id INT
risk_score DECIMAL(10,4)

Assume claim.member_id joins to either mem1.member_id or mem2.member_id. Assume claim_date is in UTC and “year = 2017” means calendar year 2017.

Part A (SQL)

For each specialty, compute:

total_paid_amt = total paid amount for that specialty
paid_share = that specialty’s share of total paid amount across all claims

Output columns: specialty, total_paid_amt, paid_share.

Part B (SQL)

Considering only claims in calendar year 2017:

Union mem1 and mem2 into a single member set.
Join members to claims.
Compute claim counts per age_band .
Return the age band(s) with the maximum number of claims (keep ties).

Output columns: age_band, claim_cnt.

Part C (Python / Pandas)

You are given a Pandas DataFrame df_claims with columns:

paid_amt (numeric)
gender (values are 'M' or 'F' )
claim_timestamp (string or datetime-like)

Tasks:

Recode gender from M/F to male/female .
Compute total paid_amt for each month in 2017 , grouped by gender and month.

Output: a DataFrame with columns like gender, month (1–12), total_paid_amt (and optionally year if you keep it).

Compute specialty spend share and top age band

Quick Overview

Tables

`mem1`

`mem2`

`claim`

`risk` (provided but not necessarily needed)

Part A (SQL)

Part B (SQL)

Part C (Python / Pandas)

Comments (0)

Compute specialty spend share and top age band

Quick Overview

Tables

mem1

mem2

claim

risk (provided but not necessarily needed)

Part A (SQL)

Part B (SQL)

Part C (Python / Pandas)

Comments (0)

`mem1`

`mem2`

`claim`

`risk` (provided but not necessarily needed)