How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master data manipulation (sql/python) interviews.

What difficulty level is this interview question?

This is a easy difficulty Data Manipulation (SQL/Python) question, commonly asked during Technical Screen rounds at Nuro.

What role is this question designed for?

This question is commonly asked for Data Engineer candidates at Nuro during technical interviews.

Compute unique duration by merging intervals

Q: Compute unique duration by merging intervals

This question evaluates a data engineer's proficiency in temporal data manipulation and interval-set reasoning using SQL and Python, specifically skills in merging overlapping intervals and computing deduplicated cumulative durations.

Problem

You have an autonomous-driving clip dataset where each row represents a time interval from a driving run.

Table: `clips`

column	type	description
`set_id`	INT	Session/clip set identifier (e.g., 1–6)
`run_id`	STRING	Driving run identifier, formatted like `YYYYMMDD_HHMMSS`
`start_time`	INT	Clip start time offset (e.g., seconds)
`end_time`	INT	Clip end time offset (e.g., seconds)
`set_number`	INT	Dataset/source identifier

Assume each interval is half-open: [start_time, end_time) so duration is end_time - start_time.

Intervals for the same run_id may:

overlap (partially or fully),
be duplicated/redundant,
have gaps (non-contiguous).

Task

Compute, for each run_id, the cumulative deduplicated time: the total duration covered by the union of all intervals for that run_id (i.e., after merging overlaps).

Example input

For run 20250101_122334:

set 1: [58, 70)
set 2: [57, 69)
set 3: [55, 72)
set 4: [56, 71)
set 5: [80, 100)

For run 20250102_101010:

set 6: [43, 62)

Expected output

Return a result with:

run_id
cumulative_unique_time (same unit as the time offsets)

Using the example above, results would be:

20250101_122334 → 39
20250102_101010 → 19

Requirements

Write a SQL solution to compute cumulative_unique_time per run_id .
Write a Python solution to compute the same metric (given a list/dataframe of intervals).

Problem

You have an autonomous-driving clip dataset where each row represents a time interval from a driving run.

Table: clips

column

type

description

set_id

INT

Session/clip set identifier (e.g., 1–6)

run_id

STRING

Driving run identifier, formatted like YYYYMMDD_HHMMSS

start_time

INT

Clip start time offset (e.g., seconds)

end_time

INT

Clip end time offset (e.g., seconds)

set_number

INT

Dataset/source identifier

Assume each interval is half-open: [start_time, end_time) so duration is end_time - start_time.

Intervals for the same run_id may:

overlap (partially or fully),

be duplicated/redundant,

have gaps (non-contiguous).

Compute unique duration by merging intervals

Quick Overview

Problem

Table: `clips`

Task

Example input

Expected output

Requirements

Comments (0)

Compute unique duration by merging intervals

Quick Overview

Problem

Table: `clips`

Task

Example input

Expected output

Requirements

Comments (0)

Compute unique duration by merging intervals

Quick Overview