Aggregate radiology spend and derive fiscal month

Q: Aggregate radiology spend and derive fiscal month

This is a Data Manipulation (SQL/Python) interview question from CVS Health for Data Scientist roles. View the full question and solution on PracHub.

Q: How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions reward a solid grasp of core concepts and regular practice. PracHub provides solutions with explanations to help you prepare for Data Manipulation (SQL/Python) interviews.

Question

Using Python/pandas, complete the tasks below. Assume the following CSV-like input, where service_dt is object dtype and amounts may be negative (adjustments). Treat missing paid_amt as 0 when aggregating, but do not create NaNs during type conversion.

radiology.csv (toy data)

claim_id,procedure_group,service_dt,paid_amt
1001,CT,2020-07-01 13:05:00,120.00
1002,MRI,2020-10-15 09:30:00,250.00
1003,CT,2020-10-20 11:00:00,-20.00
1004,XRay,2020-12-05 08:00:00,80.00
1005,MRI,2020-01-02 14:10:00,

Tasks

Load the file robustly (explicit dtypes, parse_dates) and aggregate paid_amt by procedure_group to produce columns: procedure_group, paid_amt_sum (rounded to 2 decimals).
Add each group's percentage of total paid_amt as pct_of_total (0–100 with two decimals). Ensure total uses post-cleaning values and is not double-counted.
Convert service_dt from object to datetime and add an integer fiscal_month column where the fiscal year starts on October 1 (Oct=1, Nov=2, …, Sep=12). Validate on the sample rows so that 2020-10-15 maps to fiscal_month=1. Show the resulting dtypes and demonstrate that the transformation is vectorized (no per-row Python loops). Handle bad or missing dates gracefully: coerce them to NaT, then set fiscal_month to 0 for unknown dates.
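
One possible way to approach the three tasks is sketched below. It is not the site's reference solution. It makes several assumptions: the toy data is inlined through io.StringIO instead of being read from radiology.csv; "do not create NaNs during type conversion" is interpreted as reading paid_amt as text with keep_default_na=False and replacing empty strings with "0" before casting to float; service_dt is read as object and converted with pd.to_datetime (passing parse_dates to read_csv would be an alternative); and the helper name fiscal_month is hypothetical. Under the stated mapping (Oct=1), the sample date 2020-10-15 evaluates to fiscal month 1.

```python
import io

import pandas as pd

# Toy data from the question, inlined so the sketch runs without a file.
CSV_TEXT = """claim_id,procedure_group,service_dt,paid_amt
1001,CT,2020-07-01 13:05:00,120.00
1002,MRI,2020-10-15 09:30:00,250.00
1003,CT,2020-10-20 11:00:00,-20.00
1004,XRay,2020-12-05 08:00:00,80.00
1005,MRI,2020-01-02 14:10:00,
"""

# Explicit dtypes; keep_default_na=False keeps the missing paid_amt as an
# empty string instead of NaN (assumed reading of "do not create NaNs").
df = pd.read_csv(
    io.StringIO(CSV_TEXT),
    dtype={
        "claim_id": "int64",
        "procedure_group": "object",
        "service_dt": "object",
        "paid_amt": "object",
    },
    keep_default_na=False,
)

# Clean paid_amt: empty string -> "0", then cast to float (no NaN appears).
df["paid_amt"] = (
    df["paid_amt"].str.strip().replace("", "0").astype("float64")
)

# Task 1: sum paid_amt per procedure_group.
agg = (
    df.groupby("procedure_group", as_index=False)["paid_amt"]
    .sum()
    .rename(columns={"paid_amt": "paid_amt_sum"})
)

# Task 2: the total is computed once, from the cleaned group sums.
total = agg["paid_amt_sum"].sum()
agg["pct_of_total"] = (agg["paid_amt_sum"] / total * 100).round(2)
agg["paid_amt_sum"] = agg["paid_amt_sum"].round(2)


def fiscal_month(dates: pd.Series) -> pd.Series:
    """Map datetimes to fiscal month (Oct=1 ... Sep=12); NaT becomes 0."""
    month = dates.dt.month  # NaN where the date is NaT
    # Column-wise arithmetic only: no per-row Python loop or apply().
    return ((month - 10) % 12 + 1).fillna(0).astype("int64")


# Task 3: object -> datetime, with unparseable values coerced to NaT.
df["service_dt"] = pd.to_datetime(
    df["service_dt"], format="%Y-%m-%d %H:%M:%S", errors="coerce"
)
df["fiscal_month"] = fiscal_month(df["service_dt"])

print(agg)
print(df[["claim_id", "service_dt", "fiscal_month"]])
print(df.dtypes)
```

On the toy data the group sums are 100.00 for CT, 250.00 for MRI, and 80.00 for XRay, against a total of 430.00. If the total could be zero (for example, adjustments cancelling all payments), the percentage step would need an explicit guard, which this sketch omits.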
