PracHub
QuestionsCoachesLearningGuidesInterview Prep
|Home/Coding & Algorithms/Instacart

Implement bus route simulation features

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's skills in stateful event-driven simulation, data-structure and algorithm design for capacity-constrained boarding and dropping logic, and statistical downsampling to preserve origin–destination demand distributions.

  • medium
  • Instacart
  • Coding & Algorithms
  • Software Engineer

Implement bus route simulation features

Company: Instacart

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Onsite

### Bus Route Simulation – Boarding, Capacity, and Priority Passes You are given a partially implemented codebase for simulating bus routes in a city. Buses travel along fixed routes with scheduled arrival times at each stop. Passengers appear at stops and want to travel to other stops. Each **bus** has: - A unique `bus_id`. - A fixed **capacity** `C` (maximum number of passengers allowed on board at any time). - A **schedule**: an ordered list of `(time, stop_id)` pairs indicating when the bus arrives at each stop. Each **passenger** has: - A unique `passenger_id`. - `appear_time`: when the passenger appears at a stop. - `from_stop`: the stop where they start. - `to_stop`: the stop where they want to get off. - `has_priority_pass`: a boolean indicating whether the passenger has a priority pass. The simulation runs in **discrete time**. At each scheduled bus arrival `(time, stop_id)` for a bus: 1. The bus **drops off** all passengers currently on the bus whose `to_stop` equals `stop_id`. 2. The bus **picks up** waiting passengers at `stop_id` whose `appear_time <= time` and who have not yet boarded any bus. 3. The bus must **never exceed capacity** `C`. 4. You must **track and update** the number of passengers on board after each drop-off and pick-up. The codebase defines basic data structures like `BusSchedule` and `PassengerEvent`, but several key functions are left for you to implement. You may assume: - Number of passenger events `E` is up to 100,000. - Number of buses `B` is up to 1,000. - Each bus schedule has at most 1,000 stops. - All times are non-negative integers and you can assume that bus schedules and passenger events are available in memory. #### Task 1 – Reduce Simulation Population The full list of passenger events may be very large, making the simulation slow. Implement a function to **downsample** the passenger events to reduce the number of simulated passengers while approximately preserving the demand distribution across origin–destination pairs. API (conceptual): ```text List<PassengerEvent> downsamplePassengers(List<PassengerEvent> events, double factor) ``` where: - `0 < factor <= 1` is a scale factor (e.g., `0.1` means simulate ~10% of the passengers). - The result should: - Contain roughly `factor * events.size()` passengers. - Preserve the distribution of passengers across `(from_stop, to_stop)` pairs as much as reasonably possible. Specify clearly how you choose which passengers to keep so that the distribution is roughly preserved. #### Task 2 – Boarding and Dropping Logic with Capacity Tracking Implement the main simulation logic that handles bus boarding and dropping, ensuring capacities are never exceeded and on-board counts are correctly tracked. API (conceptual): ```text List<LogEntry> simulate( List<BusSchedule> busSchedules, List<PassengerEvent> passengers, int capacity ) ``` where each `LogEntry` has: - `time` - `bus_id` - `stop_id` - `passenger_id` - `action` in {"board", "drop"} Simulation rules: - For each `(time, stop_id)` in each `BusSchedule`, process events in **time order**. - At each bus arrival: 1. Drop off all passengers on that bus whose `to_stop == stop_id` (emit `"drop"` log entries). 2. Among waiting passengers at `stop_id` with `appear_time <= time` and not yet boarded any bus, pick up passengers **without ever exceeding capacity**. - Initially, all buses have zero passengers. - You must maintain and update the **current passenger count** for each bus after each `board`/`drop`. Your implementation should be efficient enough to handle the stated constraints. #### Task 3 – Validate Logs and Find Bugs You are given a log generated by some (possibly buggy) version of the simulation. You need to verify whether the log respects the bus capacity constraints and basic consistency rules. API (conceptual): ```text boolean validateLog(List<LogEntry> log, int capacity) ``` The validator should check at least the following: - Bus occupancy for any bus **never becomes negative**. - Bus occupancy for any bus **never exceeds** `capacity`. - A passenger **cannot drop off before boarding**. - A passenger **cannot board more than once** (i.e., cannot be on multiple buses or board the same bus twice). You may also return the first error found (e.g., by returning an error description instead of just `boolean`), but at minimum you must be able to detect whether the log is valid or not. Describe the state you need to track while scanning the log and how you update it. #### Task 4 – Support Priority Passengers Extend your boarding logic in **Task 2** to support priority passes: - Passengers with `has_priority_pass == true` should be boarded **before** non-priority passengers when a bus arrives at a stop. - Within the same priority level (priority vs non-priority), passengers should board in **arrival order** (`appear_time` ascending, breaking ties consistently, e.g., by `passenger_id`). - Capacity constraints must still be strictly enforced. Update `simulate(...)` so that log entries reflect this behavior, and the on-board passenger counts remain correct. For all tasks, clearly define and use appropriate data structures, and design your solution to run efficiently for the given input sizes.

Quick Answer: This question evaluates a candidate's skills in stateful event-driven simulation, data-structure and algorithm design for capacity-constrained boarding and dropping logic, and statistical downsampling to preserve origin–destination demand distributions.

Solution

### Overview and shared data model This is a four-part design-and-implement problem. Before tackling each task it helps to nail down the data structures, because all four tasks share them and the right structures make the boarding logic and the validator nearly fall out for free. ```python from dataclasses import dataclass from collections import defaultdict, deque @dataclass class PassengerEvent: passenger_id: int appear_time: int from_stop: int to_stop: int has_priority_pass: bool = False @dataclass class BusSchedule: bus_id: int capacity: int # per the question, each bus has a capacity C stops: list # ordered list of (time, stop_id), time-sorted @dataclass class LogEntry: time: int bus_id: int stop_id: int passenger_id: int action: str # "board" | "drop" ``` Key modeling decisions that drive everything below: - **Passengers waiting at a stop** are kept in a per-stop structure keyed by `from_stop`, because pickup only ever queries one stop at a time. We sort each stop's waiting list by `appear_time` once, then walk it with a cursor so that across all arrivals at a stop each passenger is *considered* O(1) amortized rather than re-scanned. - **Passengers on a bus** are bucketed by `to_stop`, so drop-off at `stop_id` is a direct lookup rather than a scan of the whole bus. - A global `boarded` set (passenger_ids) enforces "has not yet boarded any bus." The dominant constraints are $E \le 100{,}000$, $B \le 1000$, and $\le 1000$ stops/bus. So $B \cdot S \le 10^6$ arrival events, and each passenger boards/drops at most once $\rightarrow$ at most $2E$ log entries. The structures below keep every hot operation $O(1)$ amortized, so the total stays near $O((E + B\cdot S)\log)$ (the $\log$ is just the up-front sorts), which is comfortable. A note on the `capacity` argument. The question's `simulate(...)` signature passes a single global `capacity`, and so does `validateLog(...)`. The `BusSchedule` definition also says each bus has a capacity `C`. To stay faithful to the given APIs (and to keep `simulate` and `validateLog` consistent — Task 3 only knows one global value), the code below uses the **global `capacity` argument** as the cap for every bus. If the interviewer wants genuinely heterogeneous per-bus capacities, see the explicit follow-up at the end of Task 2. --- ### Task 1 — Downsample passengers while preserving the O–D distribution **Goal.** Return roughly `factor * |events|` passengers while keeping the proportion of passengers in each `(from_stop, to_stop)` origin–destination bucket close to the original. A naïve uniform random sample (keep each event with probability `factor`) preserves the distribution *in expectation*, but for rare O–D pairs it has high variance — a pair with 3 passengers and `factor = 0.1` is likely to drop to 0 entirely, distorting the demand shape. **Approach — stratified sampling.** Stratify by the `(from_stop, to_stop)` key and sample *within* each stratum. This keeps each group's share stable even for small groups. For a group of size `n`, the target keep count is `factor * n`, which is usually fractional. Round probabilistically so the expected total is exactly `factor * |events|` and small groups aren't systematically lost: - `keep = floor(factor * n)`, then keep one extra with probability `frac(factor * n)`. This guarantees every non-trivial group contributes at least its fair fractional share, and groups of size 1 survive with probability `factor` (correct) rather than being deterministically dropped. ```python import random import math def downsample_passengers(events, factor, seed=None): assert 0 < factor <= 1 if factor == 1: return list(events) rng = random.Random(seed) # 1. Stratify by O-D pair. groups = defaultdict(list) for e in events: groups[(e.from_stop, e.to_stop)].append(e) result = [] for key, members in groups.items(): target = factor * len(members) k = int(math.floor(target)) # Probabilistic rounding of the fractional remainder. if rng.random() < (target - k): k += 1 k = min(k, len(members)) if k > 0: # 2. Sample k uniformly within the stratum. result.extend(rng.sample(members, k)) return result ``` **Complexity.** Grouping is $O(E)$; `random.sample` over all groups is $O(E)$ total. Memory $O(E)$. **Trade-offs and follow-ups worth mentioning:** - If interviewer wants a *deterministic, reproducible* sample, replace per-group random sampling with deterministic strided selection: sort each group by `passenger_id` and keep indices `0, 1/factor, 2/factor, ...`. Same distribution-preserving property, no RNG. - For very large numbers of unique O–D pairs (many singleton groups), the probabilistic rounding is what keeps the total honest; pure `floor` would systematically undersample. - If we also wanted to preserve the *temporal* demand profile (rush-hour shape), stratify on `(from_stop, to_stop, time_bucket)` instead. Same machinery, finer strata. - This is sampling without replacement within strata, so we never duplicate a passenger_id — important because Task 2/3 treat passenger_id as unique. --- ### Task 2 — Boarding / drop-off with capacity tracking **Core challenge.** "Process events in time order" across *all* buses, while each stop's waiting passengers are shared (a passenger boards the *first* bus that can take them). So we cannot simulate buses independently — we must merge all `(time, stop_id, bus_id)` arrivals into one global time-ordered stream. If two buses arrive at the same `(time, stop_id)`, break ties deterministically (e.g., by `bus_id`) so the result is reproducible. **Data structures:** - `sorted_wait[stop_id]`: list of waiting passengers, sorted once by `(appear_time, passenger_id)`. - `cursor[stop_id]`: index into `sorted_wait[stop_id]` of the first passenger not yet *released* into the ready queue. Because arrivals at a stop are processed in non-decreasing time and the list is time-sorted, the cursor only moves forward across the whole simulation. - `ready[stop_id]`: a `deque` of passengers who are eligible (`appear_time <= t`) but have **not yet boarded**. This is what makes the inner loop honest: a passenger left behind because a bus was full simply stays at the front of `ready` and is the first candidate for the *next* bus, with no re-scan of the people behind them. Each passenger is appended to `ready` exactly once (when first eligible) and removed exactly once (when boarded) → $O(1)$ amortized per passenger. - `on_bus[bus_id]`: `dict[to_stop -> list[passenger_id]]` so drop-off is a keyed lookup. - `count[bus_id]`: current occupancy, kept explicitly for $O(1)$ capacity checks and as the value we "track and update." ```python def simulate(bus_schedules, passengers, capacity): # Per-stop waiting lists, sorted by (appear_time, passenger_id). sorted_wait = defaultdict(list) for p in passengers: sorted_wait[p.from_stop].append(p) for stop in sorted_wait: sorted_wait[stop].sort(key=lambda p: (p.appear_time, p.passenger_id)) cursor = defaultdict(int) # stop_id -> next un-released index ready = defaultdict(deque) # stop_id -> eligible, unboarded on_bus = defaultdict(lambda: defaultdict(list)) # bus_id -> to_stop -> [pid] count = defaultdict(int) # bus_id -> occupancy log = [] # Merge all arrivals into one time-ordered stream. # Tie-break by (time, stop_id, bus_id) for determinism. arrivals = [] for bs in bus_schedules: for (t, stop_id) in bs.stops: arrivals.append((t, stop_id, bs.bus_id)) arrivals.sort(key=lambda a: (a[0], a[1], a[2])) def release_eligible(stop_id, t): # Move all passengers whose appear_time <= t into the ready deque. lst = sorted_wait.get(stop_id) if not lst: return i = cursor[stop_id] n = len(lst) while i < n and lst[i].appear_time <= t: ready[stop_id].append(lst[i]) i += 1 cursor[stop_id] = i for (t, stop_id, bus_id) in arrivals: # --- 1. Drop off everyone on this bus whose to_stop == stop_id --- riders = on_bus[bus_id].pop(stop_id, []) for pid in riders: count[bus_id] -= 1 log.append(LogEntry(t, bus_id, stop_id, pid, "drop")) # --- 2. Pick up eligible, not-yet-boarded passengers, in order --- release_eligible(stop_id, t) q = ready[stop_id] while q and count[bus_id] < capacity: p = q.popleft() # boarding removes them from the queue on_bus[bus_id][p.to_stop].append(p.passenger_id) count[bus_id] += 1 log.append(LogEntry(t, bus_id, stop_id, p.passenger_id, "board")) return log ``` **Why this is correct:** - Drop-off before pickup matches the rules and frees seats for the same stop's boarders. - `release_eligible` only ever moves passengers with `appear_time <= t` into `ready`; the cursor never goes backward, and someone whose `appear_time` is still in the future stays out of `ready` until a later arrival makes them eligible. - A passenger is removed from `ready` *only* by boarding (`popleft`), so a bus that fills up leaves the remaining eligible passengers exactly where they were — the next bus boards them first, in correct order, with no double-boarding (they're physically no longer in any queue once boarded). - Capacity is checked *before* each board (`count[bus_id] < capacity`), never after, so we never exceed `capacity`. **Complexity.** Sorting waiting lists: $O(E\log E)$. Sorting arrivals: $O(B\cdot S\log(B\cdot S))$. Each passenger is released into `ready` once and popped once, so all pickup work across the whole run is $O(E)$ total — there is no re-scan of left-behind passengers. Each drop-off is a single keyed `pop`. Overall $O((E + B\cdot S)\log)$ time, $O(E + B\cdot S)$ memory. With $E \le 100{,}000$ this runs comfortably even when one hot stop receives many arrivals, because the per-arrival pickup cost is bounded by the number of passengers it actually boards plus a constant. **Per-bus capacity follow-up.** If the spec really wanted heterogeneous capacities, the only change is to compare against the bus's own capacity instead of the global one. You must read it explicitly (not via a falsy check), so a legitimate `capacity == 0` bus is respected rather than silently falling back to a global value: ```python cap = bs.capacity # carried alongside bus_id in the arrival tuple ... while q and count[bus_id] < cap: ... ``` But this would also require `validateLog` to receive per-bus capacities to stay consistent, so it is intentionally **not** the default here — the given APIs use one global `capacity`. **Edge cases to call out:** empty inputs; a stop with no bus ever visiting (those passengers never board — correct); a bus arriving at a stop with no waiting/no riders (no-ops); `capacity == 0` (the `count < capacity` guard is `0 < 0` → false, so no one ever boards — correct); duplicate `(time, stop)` arrivals across buses (handled by deterministic tie-break); a passenger whose `from_stop == to_stop` (boards, then drops the next time that bus hits the stop — worth clarifying the spec with the interviewer). --- ### Task 3 — Validate a (possibly buggy) log **State to track while scanning the log in order:** - `occupancy[bus_id]` — running count, `+1` on board, `−1` on drop. - `current_bus[pid]` — which bus a passenger is currently on (`None` if not on any). - `ever_boarded[pid]` — has this passenger boarded *any* bus at least once (to forbid re-boarding after dropping, per the "board at most once / not on multiple buses" rule). **Checks, applied as each entry is processed:** | Rule | How to detect | |------|---------------| | Occupancy never exceeds capacity | After a `board`, `occupancy[bus] > capacity` → invalid | | Occupancy never negative | A `drop` when `occupancy[bus] == 0` → invalid | | No drop before board | `drop` for a passenger whose `current_bus[pid] != this bus` → invalid | | No double board | `board` for a passenger already in `ever_boarded` (or already on a bus) → invalid | | Drop matches the bus they're on | `drop` bus_id must equal `current_bus[pid]` | ```python def validate_log(log, capacity): occupancy = defaultdict(int) current_bus = {} # pid -> bus_id they're currently on ever_boarded = set() # pids that have boarded at least once for i, e in enumerate(log): if e.action == "board": if e.passenger_id in ever_boarded: return (False, f"entry {i}: passenger {e.passenger_id} boards more than once") occupancy[e.bus_id] += 1 if occupancy[e.bus_id] > capacity: return (False, f"entry {i}: bus {e.bus_id} exceeds capacity {capacity}") current_bus[e.passenger_id] = e.bus_id ever_boarded.add(e.passenger_id) elif e.action == "drop": if current_bus.get(e.passenger_id) != e.bus_id: return (False, f"entry {i}: passenger {e.passenger_id} drops without being on bus {e.bus_id}") occupancy[e.bus_id] -= 1 if occupancy[e.bus_id] < 0: return (False, f"entry {i}: bus {e.bus_id} occupancy negative") current_bus[e.passenger_id] = None else: return (False, f"entry {i}: unknown action {e.action!r}") return (True, None) ``` **Notes:** - This validator and `simulate` share the **same single global `capacity`**, so a log produced by `simulate` validates cleanly — the two tasks are consistent by construction. - Returning `(ok, message)` is strictly more useful than a bare `bool` and was explicitly allowed; the first error short-circuits. - `current_bus` collapses three checks (no drop before board, drop on the right bus, not on two buses) into one invariant: the passenger's recorded current bus must match. - The "board at most once, ever" rule (`ever_boarded`) is what makes re-boarding after a legitimate drop also invalid — matches the simulation semantics where a passenger travels exactly once. - Complexity $O(N)$ over $N$ log entries, $O(P + B)$ memory. Single pass; no need to sort if the log is already in processing order. If the log might be unordered, sort by `time` first (tie-break board-before-drop or by original index) — but mention the ambiguity, since occupancy validity depends on order. --- ### Task 4 — Priority passengers **Requirement.** At each arrival, board *all eligible priority-pass holders first*, then non-priority, each group in `(appear_time, passenger_id)` order, still never exceeding capacity. **Minimal change.** The only thing that changes is the *order* in which we consider the waiting queue at a stop. We can't fold priority into a single sort key without breaking the cursor: the cursor relies on the list being purely time-monotone so eligibility (`appear_time <= t`) advances cleanly. A composite key like `(priority, appear_time, pid)` would interleave a future-appearing priority passenger before an already-eligible regular one, breaking that invariant. Cleanest fix: keep **two** per-stop pipelines (priority, regular), each with its own sorted list + cursor + ready deque, and drain priority first. Both stay individually time-monotone, so the $O(1)$-amortized release/board mechanism from Task 2 carries over unchanged. ```python def simulate_with_priority(bus_schedules, passengers, capacity): sorted_pri = defaultdict(list) sorted_reg = defaultdict(list) for p in passengers: (sorted_pri if p.has_priority_pass else sorted_reg)[p.from_stop].append(p) for d in (sorted_pri, sorted_reg): for stop in d: d[stop].sort(key=lambda p: (p.appear_time, p.passenger_id)) cur_pri = defaultdict(int) cur_reg = defaultdict(int) ready_pri = defaultdict(deque) ready_reg = defaultdict(deque) on_bus = defaultdict(lambda: defaultdict(list)) count = defaultdict(int) log = [] arrivals = [] for bs in bus_schedules: for (t, stop_id) in bs.stops: arrivals.append((t, stop_id, bs.bus_id)) arrivals.sort(key=lambda a: (a[0], a[1], a[2])) def release(sorted_d, cur_d, ready_d, stop_id, t): lst = sorted_d.get(stop_id) if not lst: return i, n = cur_d[stop_id], len(lst) while i < n and lst[i].appear_time <= t: ready_d[stop_id].append(lst[i]) i += 1 cur_d[stop_id] = i def board_from(ready_d, t, stop_id, bus_id): q = ready_d[stop_id] while q and count[bus_id] < capacity: p = q.popleft() on_bus[bus_id][p.to_stop].append(p.passenger_id) count[bus_id] += 1 log.append(LogEntry(t, bus_id, stop_id, p.passenger_id, "board")) for (t, stop_id, bus_id) in arrivals: for pid in on_bus[bus_id].pop(stop_id, []): count[bus_id] -= 1 log.append(LogEntry(t, bus_id, stop_id, pid, "drop")) release(sorted_pri, cur_pri, ready_pri, stop_id, t) release(sorted_reg, cur_reg, ready_reg, stop_id, t) board_from(ready_pri, t, stop_id, bus_id) # priority first board_from(ready_reg, t, stop_id, bus_id) # then regular return log ``` **Why two pipelines, not a heap?** A per-stop priority heap keyed by `(priority, appear_time, pid)` also works, but it complicates the cursor/eligibility logic and the "this passenger already boarded" handling. Two flat sorted lists with their own ready deques are simpler, give the exact tie-break the spec asks for, and keep each queue individually time-monotone so the $O(1)$-amortized release stays valid. **Invariant preserved.** Capacity is still checked before every board; priority only reorders *who* gets the scarce seats, never relaxes the limit. A priority passenger who appears in the future (`appear_time > t`) still doesn't get a seat now — priority is among *eligible* passengers only, because future passengers haven't been released into `ready_pri` yet. --- ### Summary of structures and complexity | Task | Key structure | Time | Space | |------|---------------|------|-------| | 1 Downsample | O–D strata + probabilistic rounding | $O(E)$ | $O(E)$ | | 2 Simulate | per-stop sorted list + cursor + ready deque, per-bus `to_stop`-bucketed riders, global occupancy counts | $O((E + B\cdot S)\log)$ | $O(E + B\cdot S)$ | | 3 Validate | `occupancy`, `current_bus`, `ever_boarded` | $O(N)$ | $O(P + B)$ | | 4 Priority | two per-stop pipelines (sorted list + cursor + ready deque), priority drained first | same as Task 2 | same | The unifying idea across all four parts: **bucket passengers by the field each operation queries** — by O–D pair for sampling, by `from_stop` for pickup, by `to_stop` for drop-off, by `passenger_id` for validation — so every hot operation is a direct lookup, and the cursor + ready-deque pattern makes pickup $O(1)$ amortized per passenger even when one stop is hammered by many arrivals. (Note: the question is framed as an AI-assisted "Karat" round, so in the live interview the value-add is articulating these invariants and edge cases clearly and validating the AI's output — exactly the kind of reasoning Task 3 rewards.)

Related Interview Questions

  • Fix the Broken Search Filter in a Book Catalog API - Instacart (medium)
  • Find Largest Adjacent Stock Price Change - Instacart (medium)
  • Implement an In-Memory File Storage System - Instacart (medium)
  • Decode an encoded string - Instacart (medium)
  • Evaluate an arithmetic expression - Instacart (medium)
|Home/Coding & Algorithms/Instacart

Implement bus route simulation features

Instacart logo
Instacart
Oct 31, 2025, 12:00 AM
mediumSoftware EngineerOnsiteCoding & Algorithms
53
0

Bus Route Simulation – Boarding, Capacity, and Priority Passes

You are given a partially implemented codebase for simulating bus routes in a city. Buses travel along fixed routes with scheduled arrival times at each stop. Passengers appear at stops and want to travel to other stops.

Each bus has:

  • A unique bus_id .
  • A fixed capacity C (maximum number of passengers allowed on board at any time).
  • A schedule : an ordered list of (time, stop_id) pairs indicating when the bus arrives at each stop.

Each passenger has:

  • A unique passenger_id .
  • appear_time : when the passenger appears at a stop.
  • from_stop : the stop where they start.
  • to_stop : the stop where they want to get off.
  • has_priority_pass : a boolean indicating whether the passenger has a priority pass.

The simulation runs in discrete time. At each scheduled bus arrival (time, stop_id) for a bus:

  1. The bus drops off all passengers currently on the bus whose to_stop equals stop_id .
  2. The bus picks up waiting passengers at stop_id whose appear_time <= time and who have not yet boarded any bus.
  3. The bus must never exceed capacity C .
  4. You must track and update the number of passengers on board after each drop-off and pick-up.

The codebase defines basic data structures like BusSchedule and PassengerEvent, but several key functions are left for you to implement.

You may assume:

  • Number of passenger events E is up to 100,000.
  • Number of buses B is up to 1,000.
  • Each bus schedule has at most 1,000 stops.
  • All times are non-negative integers and you can assume that bus schedules and passenger events are available in memory.

Task 1 – Reduce Simulation Population

The full list of passenger events may be very large, making the simulation slow. Implement a function to downsample the passenger events to reduce the number of simulated passengers while approximately preserving the demand distribution across origin–destination pairs.

API (conceptual):

List<PassengerEvent> downsamplePassengers(List<PassengerEvent> events, double factor)

where:

  • 0 < factor <= 1 is a scale factor (e.g., 0.1 means simulate ~10% of the passengers).
  • The result should:
    • Contain roughly factor * events.size() passengers.
    • Preserve the distribution of passengers across (from_stop, to_stop) pairs as much as reasonably possible.

Specify clearly how you choose which passengers to keep so that the distribution is roughly preserved.

Task 2 – Boarding and Dropping Logic with Capacity Tracking

Implement the main simulation logic that handles bus boarding and dropping, ensuring capacities are never exceeded and on-board counts are correctly tracked.

API (conceptual):

List<LogEntry> simulate(
    List<BusSchedule> busSchedules,
    List<PassengerEvent> passengers,
    int capacity
)

where each LogEntry has:

  • time
  • bus_id
  • stop_id
  • passenger_id
  • action in {"board", "drop"}

Simulation rules:

  • For each (time, stop_id) in each BusSchedule , process events in time order .
  • At each bus arrival:
    1. Drop off all passengers on that bus whose to_stop == stop_id (emit "drop" log entries).
    2. Among waiting passengers at stop_id with appear_time <= time and not yet boarded any bus, pick up passengers without ever exceeding capacity .
  • Initially, all buses have zero passengers.
  • You must maintain and update the current passenger count for each bus after each board / drop .

Your implementation should be efficient enough to handle the stated constraints.

Task 3 – Validate Logs and Find Bugs

You are given a log generated by some (possibly buggy) version of the simulation. You need to verify whether the log respects the bus capacity constraints and basic consistency rules.

API (conceptual):

boolean validateLog(List<LogEntry> log, int capacity)

The validator should check at least the following:

  • Bus occupancy for any bus never becomes negative .
  • Bus occupancy for any bus never exceeds capacity .
  • A passenger cannot drop off before boarding .
  • A passenger cannot board more than once (i.e., cannot be on multiple buses or board the same bus twice).

You may also return the first error found (e.g., by returning an error description instead of just boolean), but at minimum you must be able to detect whether the log is valid or not.

Describe the state you need to track while scanning the log and how you update it.

Task 4 – Support Priority Passengers

Extend your boarding logic in Task 2 to support priority passes:

  • Passengers with has_priority_pass == true should be boarded before non-priority passengers when a bus arrives at a stop.
  • Within the same priority level (priority vs non-priority), passengers should board in arrival order ( appear_time ascending, breaking ties consistently, e.g., by passenger_id ).
  • Capacity constraints must still be strictly enforced.

Update simulate(...) so that log entries reflect this behavior, and the on-board passenger counts remain correct.

For all tasks, clearly define and use appropriate data structures, and design your solution to run efficiently for the given input sizes.

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Instacart•More Software Engineer•Instacart Software Engineer•Instacart Coding & Algorithms•Software Engineer Coding & Algorithms
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.