Merge overlapping intervals per group in pandas

Q: Merge overlapping intervals per group in pandas

This question evaluates proficiency in time-series data manipulation and interval reasoning using pandas, including grouping, datetime handling, and interval unioning across partitions; it is commonly asked because merging temporal intervals tests practical data-cleaning abilities and handling of edge cases like overlaps, adjacency, sorting, and timezone-aware timestamps. It belongs to the Data Manipulation (SQL/Python) domain and represents a practical application that requires applied conceptual understanding of interval algebra and grouping operations rather than purely theoretical reasoning, assessing implementation-level skills in producing non-overlapping, sorted intervals per group.

Q: How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master data manipulation (sql/python) interviews.

Question

Loading...

You are given a pandas DataFrame df containing time intervals for multiple groups.

Input

df columns:

group_id (string/int): group identifier
start_ts (datetime): interval start timestamp (timezone-aware or all in the same timezone)
end_ts (datetime): interval end timestamp

Assumptions:

Each row represents a half-open interval [start_ts, end_ts) (so end_ts == next start_ts does not count as overlap unless you choose to treat adjacency as mergeable—state your choice).
start_ts < end_ts for all rows.
Intervals may be unsorted and may overlap within a group.

Task

Write pandas code to produce a new DataFrame containing merged (unioned) intervals within each group_id, such that:

Within each group, the output intervals are non-overlapping and sorted by time.
Any overlapping (and, if you choose, adjacent) intervals are merged.

Output

Return a DataFrame with columns:

group_id
merged_start_ts
merged_end_ts

(Optionally, also output merged_duration_seconds per merged interval if requested by the interviewer.)

Merge overlapping intervals per group in pandas

Quick Overview

Input

Task

Output

Comments (0)