Healthcare Claims, Cohort, And Spend Analytics
Asked of: Data Scientist
Last updated
What's being tested
Candidates must show practical cohort analysis and claims spend aggregation skills using SQL or Python: correct deduplication, time-bucketing (calendar vs fiscal), and clean YoY/percentage computations. Interviewers probe for robust data-manipulation idioms (joins, window functions, group-by) and defensible handling of NULLs, zeros, and attribution windows.
Patterns & templates
- Last-event-per-claim / de-dup using window functions:
`ROW_NUMBER()`OVER (PARTITION BY claim_id ORDER BY paid_date DESC) to keep canonical record. - Conditional aggregation for percentages:
`SUM(CASE WHEN condition THEN amount ELSE 0 END)` `/ NULLIF(SUM(amount),0)to avoid divide-by-zero. - Cohort-by-first-event: assign cohort as
`DATE_TRUNC('year', first_service_date)`then`GROUP BY`cohort, fiscal_month for retention/spend. - Fiscal month derivation:
`(EXTRACT(month FROM date) + offset -1) % 12 +1`or`date_trunc('month', date - interval 'X months')`for fiscal-year shifts. - Pandas equivalents:
`pd.to_datetime`,`df.drop_duplicates(subset='claim_id', keep='last')`,`df.groupby(['cohort','year']).agg(...)`and`pivot_table`for cross-tabs. - Performance rule of thumb: single-pass aggregations are O(N); avoid repeated joins on large tables—pre-aggregate before joining.
- Edge-case joins: prefer
`LEFT JOIN`+ IS NULL checks to detect missing enrollment/payer rows; use`COUNT(DISTINCT member_id)`carefully for large cardinality.
Common pitfalls
Pitfall: Using raw
`COUNT(*)`instead of`COUNT(DISTINCT)`when duplicates exist, inflating denominators and misreporting percentages.
Pitfall: Failing to define cohort anchoring (first claim vs first paid) so YoY comparisons mix different populations.
Pitfall: Dividing by zero when prior-year spend is zero—always wrap with
`NULLIF(...,0)`or explicit guard logic.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
- Design an email flu-shot experimentCVS Health · Data Scientist · Technical Screen · hard
- Aggregate radiology spend and derive fiscal monthCVS Health · Data Scientist · Technical Screen · Medium
- Compute age-band spend and YoY in GeorgiaCVS Health · Data Scientist · Technical Screen · Medium
- Test payment-accuracy lift with p-value and powerCVS Health · Data Scientist · Technical Screen · medium
- Calculate annual percentages and YoY by cohortsCVS Health · Data Scientist · Technical Screen · Medium
- Design a flu-shot A/B/n campaign experimentCVS Health · Data Scientist · Technical Screen · hard
- Calculate Medical Claims by Age and Gender in 2024CVS Health · Data Scientist · Technical Screen · Medium