How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master data manipulation (sql/python) interviews.

What difficulty level is this interview question?

This is a Medium difficulty Data Manipulation (SQL/Python) question, commonly asked during Technical Screen rounds at OneMain Financial.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at OneMain Financial during technical interviews.

Write SQL for cohort retention and ARPU | OneMain Financial Interview Question

Q: Write SQL for cohort retention and ARPU

This question evaluates a candidate's ability to perform time-based cohort analysis, compute D30 retention and 30-day ARPU, deduplicate exact event duplicates, and handle late or erroneous events using SQL.

OneMain Financial

Oct 13, 2025, 9:49 PM

Data Scientist

Technical Screen

Data Manipulation (SQL/Python)

1

0

Using PostgreSQL, compute monthly signup cohort D30 retention and 30-day ARPU. Cohort month = date_trunc('month', signup_date). D30 retention = users with any event on the calendar day signup_date + 30 days (ignore events before signup_date as data errors). 30D ARPU = sum of payments with pay_ts < signup_date + interval '30 days' divided by cohort size. Deduplicate exact duplicate events by (user_id, event_ts, event_name). Schema: users(user_id INT PRIMARY KEY, signup_date DATE, country TEXT) events(user_id INT, event_ts DATE, event_name TEXT) payments(user_id INT, pay_ts DATE, amount NUMERIC(10,2)) Sample data: users +---------+-------------+---------+ | user_id | signup_date | country | +---------+-------------+---------+ | 1 | 2025-06-01 | US | | 2 | 2025-06-15 | US | | 3 | 2025-07-03 | CA | | 4 | 2025-07-20 | US | | 5 | 2025-07-31 | GB | +---------+-------------+---------+ events +---------+------------+------------+ | user_id | event_ts | event_name | +---------+------------+------------+ | 1 | 2025-07-01 | login | | 1 | 2025-07-01 | login | (duplicate) | 2 | 2025-07-15 | purchase | | 3 | 2025-08-02 | login | | 4 | 2025-08-19 | browse | | 5 | 2025-08-30 | login | +---------+------------+------------+ payments +---------+------------+--------+ | user_id | pay_ts | amount | +---------+------------+--------+ | 1 | 2025-06-20 | 10.00 | | 1 | 2025-07-10 | 20.00 | | 2 | 2025-06-16 | 15.00 | | 3 | 2025-07-20 | 5.00 | | 5 | 2025-08-15 | 12.00 | +---------+------------+--------+ Write a single SQL query returning: cohort_month, d30_retention_rate, arpu_30d. Explain how your query avoids duplicates and late events.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More OneMain Financial•More Data Scientist•OneMain Financial Data Scientist•OneMain Financial Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)

Write SQL for cohort retention and ARPU

Quick Overview

Comments (0)

Write SQL for cohort retention and ARPU

Quick Overview

Comments (0)