How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require an understanding of core concepts and regular practice. PracHub provides solutions with explanations to help you prepare for Data Manipulation (SQL/Python) interviews.

What difficulty level is this interview question?

This is a medium-difficulty Data Manipulation (SQL/Python) question, commonly asked during Technical Screen rounds at WeRide.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at WeRide during technical interviews.

Compute window averages and merge intervals

Quick Overview

This question evaluates proficiency in pandas-based data manipulation, specifically windowed aggregations with strict boundary conditions and group-wise merging of time intervals. It exercises rolling/window operations, grouping, sorting, and handling of time-based overlaps.

You are given two independent pandas tasks.

Sliding-window average
- Input DataFrame: df
- Schema:
  - row_id INT — unique row order key, already sorted ascending
  - value FLOAT
- Given an integer k >= 0, compute for each row the average of the values from the previous k rows, the current row, and the next k rows.
- If a row does not have at least k previous rows and k next rows, set the output to -1 for that row.
- Return a DataFrame with columns: row_id, value, window_avg.
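
One possible approach to the sliding-window task is sketched below. It is not an official solution: the function name `window_average` is hypothetical, and the sketch assumes `df` is already sorted by `row_id` as the schema states. It uses a centered rolling window of size 2k + 1 and a positional mask to mark rows that lack k neighbours on either side.

```python
import numpy as np
import pandas as pd


def window_average(df: pd.DataFrame, k: int) -> pd.DataFrame:
    """Average of the k previous rows, the current row, and the k next rows.

    Rows without k neighbours on both sides get -1.
    Assumes df is already sorted by row_id (as stated in the task).
    """
    if k < 0:
        raise ValueError("k must be >= 0")

    out = df[["row_id", "value"]].copy()
    win = 2 * k + 1

    # Centered window of exactly 2k + 1 rows; min_periods=win means an
    # incomplete window produces NaN instead of a partial average.
    avg = out["value"].rolling(window=win, center=True, min_periods=win).mean()

    # Positional mask: True only where k rows exist before and after.
    pos = np.arange(len(out))
    has_full_window = (pos >= k) & (pos <= len(out) - 1 - k)

    # Keep the rolling mean where the window is complete, else -1.
    out["window_avg"] = avg.where(has_full_window, -1.0)
    return out
```

For values 1, 2, 3, 4, 5 and k = 1, this returns window_avg of -1, 2, 3, 4, -1. The explicit mask (rather than `fillna(-1)`) is a design choice: a missing `value` inside a complete window then stays NaN instead of being reported as -1.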

Merge overlapping autonomous-driving intervals
- Input DataFrame: segments
- Schema:
  - vehicle_id STRING
  - event_type STRING
  - start_ts TIMESTAMP
  - end_ts TIMESTAMP
- Assume all timestamps are in the same timezone and start_ts <= end_ts for every row.
- For each (vehicle_id, event_type) independently, merge intervals that overlap or touch: the next interval is merged into the current one if next.start_ts <= current.end_ts.
- Return the merged result with columns: vehicle_id, event_type, merged_start_ts, merged_end_ts, sorted by vehicle_id, event_type, merged_start_ts.
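
The interval-merging task can be sketched as follows. This is one possible vectorized approach, not an official solution: the function name `merge_intervals` and the helper columns `running_end`, `prev_end`, and `block` are hypothetical, and the sketch assumes no missing keys or timestamps. After sorting by start time within each group, a row starts a new merged interval only if its start_ts is strictly later than the largest end_ts seen so far in that group.

```python
import pandas as pd


def merge_intervals(segments: pd.DataFrame) -> pd.DataFrame:
    """Merge overlapping or touching intervals per (vehicle_id, event_type)."""
    keys = ["vehicle_id", "event_type"]

    # sort_values returns a new frame, so the input is not modified.
    s = segments.sort_values(keys + ["start_ts", "end_ts"]).reset_index(drop=True)

    # Largest end_ts seen so far within the group. A running maximum is
    # needed because an earlier long interval can contain later short ones.
    s["running_end"] = s.groupby(keys)["end_ts"].cummax()

    # Running maximum over the *previous* rows of the same group
    # (NaT on the first row of each group).
    s["prev_end"] = s.groupby(keys)["running_end"].shift()

    # A new block starts at the first row of a group, or when the interval
    # begins strictly after everything seen so far (touching still merges).
    starts_new = s["prev_end"].isna() | (s["start_ts"] > s["prev_end"])
    s["block"] = starts_new.cumsum()

    merged = (
        s.groupby(keys + ["block"], as_index=False)
        .agg(merged_start_ts=("start_ts", "min"), merged_end_ts=("end_ts", "max"))
        .drop(columns="block")
        .sort_values(keys + ["merged_start_ts"])
        .reset_index(drop=True)
    )
    return merged
```

Comparing each row against the running maximum, rather than against the previous row's end_ts alone, matters when a short interval is nested inside a longer one: the nested interval would otherwise lower the comparison point and split a block that should stay merged.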

Write pandas code for both tasks.
