Compute daily work hours from in/out events
Company: Amazon
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Onsite
Given punch events, compute each employee’s daily hours, handling unmatched events and overnight shifts. Write SQL over:
events(employee_id INT, evt_ts TIMESTAMP, action VARCHAR CHECK(action IN ('in','out')))
Sample rows:
101 | 2025-01-30 08:00 | in
101 | 2025-01-30 13:00 | out
101 | 2025-01-30 14:00 | in
101 | 2025-01-30 18:30 | out
102 | 2025-01-30 22:00 | in
102 | 2025-01-31 06:00 | out
103 | 2025-01-30 09:00 | in
103 | 2025-01-30 12:00 | out
103 | 2025-01-30 12:30 | out -- duplicate/misordered
Requirements:
(a) pair each 'in' with the next 'out' for the same employee;
(b) split work that crosses midnight across the calendar days it covers;
(c) ignore orphan 'out' events, and cap a trailing unmatched 'in' at 23:59:59 of the day it occurred;
(d) sum hours per employee_id per work_date, rounded to the nearest 0.25 hour;
(e) flag days with data-quality issues (overlaps, consecutive ins/outs).
Return columns: employee_id, work_date, hours_worked, dq_issue_flag.
Quick Answer: This question evaluates a candidate's ability to manipulate time-series punch-event data, including temporal pairing of in/out events, splitting shifts across calendar days, rounding and aggregating hours, and detecting data-quality issues such as overlaps or orphaned events using SQL or Python.
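The pairing, midnight-splitting, and rounding logic can be prototyped in Python before it is translated to SQL. The sketch below is one possible reading of requirements (a) through (e), not a reference solution. The function names `split_by_day` and `daily_hours` are hypothetical. Several behaviours are assumptions the prompt leaves open: a repeated 'in' while a shift is already open is ignored and flagged, a trailing unmatched 'in' is also flagged, ties on the same timestamp process 'out' before 'in', and exact halves of a quarter hour round up.

```python
import math
from collections import defaultdict
from datetime import datetime, time, timedelta


def split_by_day(start, end):
    """Yield (work_date, hours) pieces of [start, end), cut at each midnight."""
    while start < end:
        next_midnight = datetime.combine(start.date() + timedelta(days=1), time.min)
        piece_end = min(end, next_midnight)
        yield start.date(), (piece_end - start).total_seconds() / 3600
        start = piece_end


def daily_hours(events):
    """events: iterable of (employee_id, datetime, 'in' or 'out') tuples."""
    by_emp = defaultdict(list)
    for emp, ts, action in events:
        by_emp[emp].append((ts, action))

    hours = defaultdict(float)  # (employee_id, work_date) -> unrounded hours
    flagged = set()             # (employee_id, work_date) keys with a DQ issue

    for emp, evs in by_emp.items():
        # Chronological order; assumption: on equal timestamps 'out' comes first.
        evs.sort(key=lambda e: (e[0], e[1] != "out"))
        open_in = None
        for ts, action in evs:
            if action == "in":
                if open_in is None:
                    open_in = ts
                else:
                    # Consecutive 'in': keep the earlier one open, flag the day.
                    flagged.add((emp, ts.date()))
            elif open_in is not None:
                # Matched pair: credit each calendar day it touches.
                for day, h in split_by_day(open_in, ts):
                    hours[(emp, day)] += h
                open_in = None
            else:
                # Orphan 'out': contributes no hours, but the day is flagged.
                flagged.add((emp, ts.date()))
        if open_in is not None:
            # Trailing unmatched 'in': cap at 23:59:59 of the same day.
            cap = datetime.combine(open_in.date(), time(23, 59, 59))
            hours[(emp, open_in.date())] += (cap - open_in).total_seconds() / 3600
            flagged.add((emp, open_in.date()))  # assumption: treated as a DQ issue

    # Round half up to the nearest quarter hour (built-in round() rounds half to even).
    return [
        (emp, day, math.floor(hours.get((emp, day), 0.0) * 4 + 0.5) / 4, (emp, day) in flagged)
        for emp, day in sorted(set(hours) | flagged)
    ]


# The sample rows from the prompt.
SAMPLE = [
    (101, datetime(2025, 1, 30, 8, 0), "in"),
    (101, datetime(2025, 1, 30, 13, 0), "out"),
    (101, datetime(2025, 1, 30, 14, 0), "in"),
    (101, datetime(2025, 1, 30, 18, 30), "out"),
    (102, datetime(2025, 1, 30, 22, 0), "in"),
    (102, datetime(2025, 1, 31, 6, 0), "out"),
    (103, datetime(2025, 1, 30, 9, 0), "in"),
    (103, datetime(2025, 1, 30, 12, 0), "out"),
    (103, datetime(2025, 1, 30, 12, 30), "out"),
]

rows = daily_hours(SAMPLE)  # (employee_id, work_date, hours_worked, dq_issue_flag)
```

Under these assumptions the sample yields 9.5 hours for employee 101 on 2025-01-30, 2.0 and 6.0 hours for employee 102 on 2025-01-30 and 2025-01-31, and 3.0 flagged hours for employee 103. Because the loop never holds more than one open 'in' per employee, an overlap shows up as a consecutive 'in' rather than as two intersecting intervals. A SQL version would typically express the same state machine with window functions such as LEAD or LAG partitioned by employee_id.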