Design a satellite propagation simulator
Company: Optiver
Role: Software Engineer
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Take-home Project
##### Question
Implement a `SatelliteNetwork` class that simulates message propagation over an **undirected** satellite graph. The class consumes a stream of instructions online and triggers callbacks as the simulation unfolds. Implement the following methods:
1. **`satellite_connected(satellite_id)`** — Register a new satellite. If `satellite_id` already exists, call `ErrDuplicateSatelliteId(satellite_id)` and do not add it again.
2. **`relationship_established(satellite_id1, satellite_id2)`** — Create a bidirectional (undirected) link between the two satellites. If either ID does not exist, skip the instruction and call `ErrInvalidSatelliteId(offending_id)` for the invalid ID.
3. **`message_received(satellite_ids)`** — Earth sends the same message simultaneously at time `t = 0` to all listed satellites. If any listed ID does not exist, call `ErrInvalidSatelliteId(offending_id)`. Treat each call to `message_received` as an independent run (reset all per-message state between runs).
**Propagation rules**
- When a satellite first receives the message at time `t`, it attempts to forward it to each directly connected neighbor that has not yet received it. Forwarding is done one neighbor at a time, in increasing `SatelliteId` order, and each send takes exactly **10 seconds** (so the first neighbor is reached at `t + 10`, the second at `t + 20`, and so on).
- A satellite never attempts to notify the satellite that notified it.
- Different satellites may concurrently attempt to notify the same neighbor; each attempt still consumes 10 seconds, but as soon as **any** attempt completes, that neighbor is considered notified and no further attempts to it should proceed (the neighbor takes its earliest arrival time).
- A satellite ignores all receptions after its first.
**Reporting back to Earth**
- A satellite becomes eligible to report back once **all** of its direct neighbors have received the message (from any sender). At the earliest time this condition becomes true, the satellite spends **30 seconds** processing and then reports back — i.e. its report is delivered at `(time the last neighbor was notified) + 30`.
- For each report, call `OnSatelliteReportedBack(satellite_id)` in exact chronological order for that message.
- When multiple reports are delivered at the same time, invoke the callbacks in increasing `SatelliteId` order.
- Do not return or print anything else.
**Assumptions / constraints**
- Satellite IDs are integers.
- There may be multiple calls to `message_received`; each is an independent run.
- Handle duplicate and invalid references exactly as specified via the error callbacks.
- Design your data structures and algorithm to handle large `N` satellites and `M` links efficiently, while guaranteeing deterministic output ordering.
Provide code or pseudocode and analyze the time and space complexity.
Quick Answer: Optiver software-engineer take-home: implement a SatelliteNetwork that simulates timed message propagation over an undirected graph and fires OnSatelliteReportedBack callbacks in deterministic order. It tests graph modeling, discrete-event simulation, priority-queue scheduling, and careful handling of 10s-per-hop forwarding and 30s report timing with tie-breaking by SatelliteId.
Solution
## Approach
This is an **event-driven (discrete-event) simulation** over an undirected graph. The key insight is that "each send takes 10 seconds, in increasing neighbor-ID order" means a satellite that first receives the message at time `t` produces a *sequence* of timed forwarding events to its not-yet-notified neighbors, and the first attempt to reach any neighbor wins. We process timed events in nondecreasing time order, breaking ties by `SatelliteId`, with a priority queue.
### Data structures
- `adj: dict[int, list[int]]` — adjacency list. Keep each neighbor list **sorted** (or sort on demand) so forwarding happens in increasing `SatelliteId` order.
- `exists: set[int]` — registered satellite IDs (for duplicate / invalid checks).
- Per-run state (reset on every `message_received` call):
- `notified_at: dict[int, int]` — the time each satellite was first notified (its arrival time).
- `notifier: dict[int, int]` — who first notified each satellite, so it never forwards back to that node.
- `pending_neighbors: dict[int, int]` — count of a satellite's neighbors not yet notified; when it hits 0 the satellite becomes report-eligible.
- `reported: set[int]` — satellites that have already reported.
- A min-heap `pq` of events keyed by `(time, type_priority, satellite_id, ...)` so that ties resolve deterministically by time then ID.
### Algorithm
**Registration / links (online):**
```
satellite_connected(s):
if s in exists: ErrDuplicateSatelliteId(s); return
exists.add(s); adj[s] = []
relationship_established(a, b):
if a not in exists: ErrInvalidSatelliteId(a); return
if b not in exists: ErrInvalidSatelliteId(b); return
insert b into adj[a] (keep sorted); insert a into adj[b] (keep sorted)
```
**Simulation (`message_received(ids)`):**
1. Validate every id; for any missing one call `ErrInvalidSatelliteId(id)`. Reset all per-run state. Initialize `pending_neighbors[s] = len(adj[s])` for every satellite.
2. Push a NOTIFY event `(t=0, s)` for each valid seed id (Earth notifies them simultaneously at t=0, with no notifier).
3. Pop events in `(time, id)` order:
- **NOTIFY `(t, s)` arriving from `from_id`:** if `s` was already notified (`s in notified_at`), ignore it (first reception wins). Otherwise set `notified_at[s] = t`, `notifier[s] = from_id`. Then `s` forwards: iterate its sorted neighbors, skipping `from_id`; for the k-th *eligible (not-yet-notified at scheduling time)* neighbor `nb`, push NOTIFY `(t + 10*k, nb)` from `s` (k = 1, 2, ...). It is fine to schedule sends to neighbors that later get notified by someone else — the duplicate is dropped on arrival because of the "already notified" check.
- **Decrement-pending bookkeeping:** whenever any satellite `s` becomes notified at time `t`, for each neighbor `nb` of `s`, decrement `pending_neighbors[nb]`. If `pending_neighbors[nb]` reaches 0, neighbor `nb` is now report-eligible at time `t`; push a REPORT event `(t + 30, nb)`.
- **REPORT `(rt, s)`:** if not already reported, mark reported and call `OnSatelliteReportedBack(s)`.
4. Because the heap pops in `(time, id)` order, callbacks at equal times fire in increasing `SatelliteId` order automatically.
### Edge cases
- An isolated satellite (degree 0) that receives the message has all (zero) neighbors notified immediately, so it reports at `t + 30`.
- A satellite not reachable from any seed is never notified and never reports.
- Duplicate `satellite_connected` and invalid `relationship_established` / `message_received` references are handled via the error callbacks without mutating state.
- The first attempt to reach a neighbor wins; later scheduled attempts are discarded on arrival, which naturally models "as soon as any attempt completes, no further attempts proceed."
### Reference pseudocode (core loop)
```
import heapq
def message_received(self, ids):
for i in ids:
if i not in self.exists: self.ErrInvalidSatelliteId(i)
seeds = [i for i in ids if i in self.exists]
notified_at, notifier, reported = {}, {}, set()
pending = {s: len(self.adj[s]) for s in self.exists}
pq = []
for s in seeds:
heapq.heappush(pq, (0, s, None)) # (time, target, from_id)
while pq:
t, s, frm = heapq.heappop(pq)
if s in notified_at: # ignore receptions after the first
continue
notified_at[s] = t; notifier[s] = frm
# bookkeeping: s is now notified -> update each neighbor's pending count
for nb in self.adj[s]:
pending[nb] -= 1
if pending[nb] == 0:
heapq.heappush(pq, (t + 30, nb, 'REPORT'))
# forward to not-yet-notified neighbors (skip the notifier), 10s each, in ID order
k = 0
for nb in self.adj[s]: # adj[s] kept sorted
if nb == frm or nb in notified_at: continue
k += 1
heapq.heappush(pq, (t + 10 * k, nb, s))
# REPORT events are interleaved by time; deliver in chronological then ID order
```
*(In a clean implementation, NOTIFY and REPORT events are distinct event types in the same heap, ordered by `(time, kind, id)`; the snippet above inlines the pending-count bookkeeping for clarity.)*
### Complexity
- Let `N` = number of satellites and `M` = number of links. Each satellite is notified at most once and each edge generates O(1) scheduled events, so the heap holds O(N + M) events.
- Time: **O((N + M) log(N + M))** dominated by heap operations; sorting adjacency lists is O(M log(max-degree)) amortized.
- Space: **O(N + M)** for the graph plus the event queue and per-run maps.
Determinism comes entirely from the `(time, id)` ordering in the priority queue and from forwarding in increasing `SatelliteId` order.
Explanation
Model it as a discrete-event simulation on an undirected graph driven by a min-heap of timed NOTIFY and REPORT events. First reception wins (later attempts to an already-notified node are dropped), forwarding is 10s per neighbor in increasing-ID order skipping the notifier, and a node reports 30s after its last neighbor becomes notified. Ordering ties by (time, SatelliteId) makes the callback order deterministic. Overall O((N+M) log(N+M)) time, O(N+M) space.