Design a scalable job to compute rolling 15-minute meeting concurrency and room utilization
You are given a massive log of meeting events with fields:
- user_id
- start_time (UTC)
- end_time (UTC)
- room_id
- office_id
Compute, for each rolling 15-minute window (assume slide = 1 minute unless specified otherwise):
- Concurrent meeting counts per office (report the peak concurrent meetings within each 15-minute window).
- Rooms whose utilization exceeds 80% within the window, where utilization = occupied_time_in_window / 900 seconds; a worked example follows this list.
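For concreteness, here is a minimal sketch (plain Python, illustrative numbers) of the per-window overlap and utilization check:

```python
# Minimal sketch: how many seconds of one meeting fall inside one window.
# Times are epoch seconds; a 15-minute window is 900 seconds long.
def overlap_seconds(m_start: int, m_end: int, w_start: int, w_end: int) -> int:
    """Seconds of [m_start, m_end) that overlap [w_start, w_end)."""
    return max(0, min(m_end, w_end) - max(m_start, w_start))

# A meeting running 10:02-10:20 against the 10:00-10:15 window occupies 13 of
# 15 minutes: 780 / 900 ~ 0.867 > 0.80, so the room is flagged for that window.
occupied = overlap_seconds(m_start=120, m_end=1200, w_start=0, w_end=900)
print(occupied / 900)  # 0.8666...

# Note: when a room has overlapping or back-to-back bookings, occupied time is
# the union of the intervals (capped at 900 s), not their sum.
```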
Design a MapReduce or Spark job that:
- Specifies map keys/values and how you transform intervals into windowed aggregates (e.g., a line-sweep with +1/−1 deltas at interval boundaries vs. splitting each interval across the windows it touches); a map-stage sketch follows the prompt.
- Describes the shuffle/partition strategy that minimizes skew and bounds per-reducer work (see the salting sketch below).
- Explains the handling of late and duplicated events (event-time watermarks, dedup keys, update/retraction strategy); see the dedup sketch below.
- Proposes validation and correctness checks that hold at scale (see the invariant sketch below).
Make minimal, explicit assumptions if needed (e.g., presence of meeting_id for dedup, window slide granularity).
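One possible map stage, sketched in PySpark under stated assumptions: a `meetings` DataFrame holding the prompt's fields as epoch-second timestamps (the toy data, column names, and app name are illustrative, not prescribed by the prompt). Each interval maps to two delta events, and a per-office prefix sum recovers the concurrency level at each minute:

```python
from pyspark.sql import SparkSession, functions as F, Window

spark = SparkSession.builder.appName("concurrency-sketch").getOrCreate()

# Toy input in the prompt's schema, with epoch-second timestamps.
meetings = spark.createDataFrame(
    [("u1", 120, 1200, "r1", "sfo"), ("u2", 300, 600, "r2", "sfo")],
    "user_id string, start_time long, end_time long, room_id string, office_id string")

# Map: every interval becomes two delta events, +1 at its start minute and
# -1 at its end minute, matching the 1-minute slide granularity.
deltas = (
    meetings.select(
        "office_id",
        (F.floor(F.col("start_time") / 60) * 60).alias("minute"),
        F.lit(1).alias("delta"))
    .unionByName(meetings.select(
        "office_id",
        (F.ceil(F.col("end_time") / 60) * 60).alias("minute"),
        F.lit(-1).alias("delta"))))

# Reduce: net the deltas per (office, minute), prefix-sum them into a running
# concurrency level, then take a trailing 15-minute max for each window's peak.
per_office = Window.partitionBy("office_id").orderBy("minute")
per_minute = (
    deltas.groupBy("office_id", "minute").agg(F.sum("delta").alias("net"))
    .withColumn("concurrent",
                F.sum("net").over(per_office.rowsBetween(Window.unboundedPreceding, 0))))
peaks = per_minute.withColumn(
    "peak_15m", F.max("concurrent").over(per_office.rangeBetween(-14 * 60, 0)))
peaks.show()
# Caveat: minutes with no events produce no row here; a production job would
# first join against a dense minute spine so levels carry across quiet windows.
```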
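For the shuffle stage, one common skew-mitigation pattern is two-phase aggregation with a salt on hot keys. A sketch, reusing the `deltas` frame from the previous block (`N_SALT` is a hypothetical tuning knob):

```python
from pyspark.sql import functions as F

# Two-phase aggregation: salting fans a hot office across several reducers,
# then a second, much smaller aggregation merges the per-salt partials.
N_SALT = 8  # hypothetical fan-out; tune to the observed skew

partials = (
    deltas.withColumn("salt", (F.rand(seed=7) * N_SALT).cast("int"))
          .groupBy("office_id", "minute", "salt")
          .agg(F.sum("delta").alias("partial_net")))
net = (partials.groupBy("office_id", "minute")
               .agg(F.sum("partial_net").alias("net")))
# Note: plain sums already get map-side combine in Spark, so salting matters
# most for non-combinable aggregates; bounding per-reducer work additionally
# comes from keying the shuffle by (office_id, coarse time bucket), e.g. via
# deltas.repartitionByRange("office_id", "minute").
```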
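For dedup and late data, a batch-side sketch, assuming each event additionally carries a meeting_id (per the prompt's allowance) and a hypothetical ingested_at timestamp; it reuses the `spark` session from the map-stage block:

```python
from pyspark.sql import functions as F, Window

# Toy duplicated input: the same meeting_id ingested twice, the second row a
# later revision (meeting_id and ingested_at are assumed fields, see above).
raw_events = spark.createDataFrame(
    [("m1", "u1", 120, 1200, "r1", "sfo", 1300),
     ("m1", "u1", 120, 1260, "r1", "sfo", 1400)],
    "meeting_id string, user_id string, start_time long, end_time long, "
    "room_id string, office_id string, ingested_at long")

# Last-writer-wins dedup: keep only the freshest record per meeting_id.
latest_first = Window.partitionBy("meeting_id").orderBy(F.col("ingested_at").desc())
deduped = (raw_events
           .withColumn("rn", F.row_number().over(latest_first))
           .where("rn = 1")
           .drop("rn"))

# Late data: an event that lands after its window was published triggers a
# recompute of just the affected (office_id, window) keys, emitted downstream
# as an upsert or as a retraction followed by the corrected row. In Structured
# Streaming the analogous policy is a watermark plus keyed dedup, e.g.:
#   stream.withWatermark("end_time", "30 minutes").dropDuplicates(["meeting_id"])
```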
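Finally, a few cheap invariants that catch most correctness bugs at scale; a sketch over the hypothetical DataFrames built in the blocks above:

```python
from pyspark.sql import functions as F

# Invariant 1: line-sweep deltas must cancel out globally (every +1 has a -1).
assert deltas.agg(F.sum("delta")).first()[0] == 0

# Invariant 2: the running concurrency level is never negative (and should
# never exceed the count of distinct rooms in the office).
assert per_minute.where(F.col("concurrent") < 0).count() == 0

# Invariant 3: per-room utilization must land in [0, 1] after overlapping
# bookings are unioned; a value above 1 means an interval merge was missed.

# Spot-check: brute-force recompute one sampled office/day slice with the
# naive O(meetings x windows) algorithm and diff it against the job's output.
sample = meetings.where(F.col("office_id") == "sfo")  # hypothetical slice
```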