PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep
|Home/Coding & Algorithms/HubSpot

Find spammer with minimal hasMessaged calls

Last updated: Jun 15, 2026

Quick Overview

Find the "spammer" account that messaged every other account but received from none, using only a hasMessaged(a, b) oracle. This is the classic find-the-celebrity problem inverted, solved with a two-phase elimination-then-verify algorithm in O(n) oracle calls and O(1) extra space, with full correctness proof and edge-case handling.

  • Medium
  • HubSpot
  • Coding & Algorithms
  • Software Engineer

Find spammer with minimal hasMessaged calls

Company: HubSpot

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Onsite

##### Question You have `n` accounts labeled `0..n-1` and an oracle API: ``` boolean hasMessaged(a, b) // true iff account a has sent at least one message to account b ``` Define a **spammer** as an account that has sent a message to *every* other account and has received messages from *no* other account. At most one such account can exist. You may only call `hasMessaged` — you have no direct access to the underlying message graph. 1. **Core task.** Implement `findSpammer(n)` that returns the spammer's index if one exists, otherwise returns `-1`. 2. **Minimize oracle calls.** Achieve **O(n)** calls to `hasMessaged` and **O(1)** extra space. State and justify the worst-case number of oracle calls. 3. **Correctness.** Explain why the algorithm is correct — in particular, why the elimination phase cannot discard the true spammer, and why the verification phase is necessary. 4. **Complexity.** Analyze time complexity (in oracle calls) and extra-space complexity. 5. **Edge cases.** Handle `n` in `{0, 1}`, the possibility of multiple candidates surviving elimination, and self-messages (`hasMessaged(i, i)`).

Quick Answer: Find the "spammer" account that messaged every other account but received from none, using only a hasMessaged(a, b) oracle. This is the classic find-the-celebrity problem inverted, solved with a two-phase elimination-then-verify algorithm in O(n) oracle calls and O(1) extra space, with full correctness proof and edge-case handling.

Solution

### Recognizing the problem This is the classic **celebrity problem** with the relation inverted. A celebrity is known by everyone and knows no one; here a **spammer** *sends to* everyone and *receives from* no one. The roles of "out-edge" and "in-edge" swap, but the two-phase **eliminate-then-verify** technique transfers directly. Think of the accounts as nodes in a directed graph where there is an edge $a \to b$ iff `hasMessaged(a, b)` is `true`. A spammer is a node with an out-edge to *every* other node and *zero* in-edges. We never see the graph; we can only probe one edge per oracle call. --- ### The key elimination invariant The whole approach rests on one observation: **a single oracle call eliminates exactly one candidate.** For any pair $a \neq b$, call `hasMessaged(a, b)`: - **Returns `true`:** $b$ received a message, so $b$ has an in-edge $\Rightarrow$ **$b$ cannot be the spammer** (spammers receive from no one). - **Returns `false`:** $a$ failed to message $b$, so $a$ is missing an out-edge $\Rightarrow$ **$a$ cannot be the spammer** (spammers message everyone). Either way, one call rules out exactly one of the two accounts — and crucially, the account it rules out is *never* the true spammer. The account it keeps may or may not be the spammer; we only know it *survived this particular test*. This is why $n - 1$ calls suffice to narrow $n$ candidates down to a single survivor: each call permanently discards one non-spammer. --- ### Algorithm: two phases **Phase 1 — Elimination (`n − 1` calls).** Keep a single `candidate`, starting at `0`. For each `i = 1 .. n-1`, call `hasMessaged(candidate, i)`: - If `true`: `i` received a message → `i` is out. Keep `candidate`. - If `false`: `candidate` failed to message `i` → `candidate` is out. Set `candidate = i`. After the sweep, exactly one account survives. By the invariant, every account *other than* the final `candidate` was eliminated by some call, and elimination only ever discards non-spammers. So **if a spammer exists, it is the surviving `candidate`.** **Phase 2 — Verification (≤ `2(n − 1)` calls).** Phase 1 always produces a survivor, *even when no spammer exists* — it only proves "no one else can be the spammer," not "this one is." So we confirm the survivor `c` directly against the full definition. For every other `j`: - **Sent to everyone:** `hasMessaged(c, j)` must be `true`. - **Received from no one:** `hasMessaged(j, c)` must be `false`. If any check fails, no spammer exists → return `-1`. If all pass, `c` is the spammer. Self-messages (`hasMessaged(i, i)`) are never called: the definition concerns *other* accounts, so we always skip the index equal to the candidate. --- ### Reference implementation (Python) ```python def findSpammer(n): # --- Edge cases --- if n <= 0: return -1 if n == 1: # Vacuously messaged "every other account" and received from none. # If the convention requires n >= 2, return -1 instead — clarify with # the interviewer. return 0 # --- Phase 1: elimination -> single surviving candidate --- candidate = 0 for i in range(1, n): if hasMessaged(candidate, i): pass # i received a message -> i is out else: candidate = i # candidate missed someone -> candidate is out # --- Phase 2: verify the survivor against everyone else --- for j in range(n): if j == candidate: continue # never probe self-messages if not hasMessaged(candidate, j): return -1 # candidate failed to message someone if hasMessaged(j, candidate): return -1 # candidate received a message return candidate ``` ### Reference implementation (Java) ```java int findSpammer(int n) { if (n <= 0) return -1; if (n == 1) return 0; // vacuous spammer; clarify convention int candidate = 0; for (int i = 1; i < n; i++) { if (!hasMessaged(candidate, i)) { candidate = i; // candidate missed i -> candidate is out } // else: i received a message -> i is out, keep candidate } for (int j = 0; j < n; j++) { if (j == candidate) continue; if (!hasMessaged(candidate, j)) return -1; // missed someone if (hasMessaged(j, candidate)) return -1; // received from someone } return candidate; } ``` --- ### Correctness **1. Elimination is sound — the true spammer survives Phase 1.** Let $s$ be the spammer (if one exists). $s$ can be discarded only in two ways during the sweep, and neither can happen: - $s$ is dropped as `candidate` when `hasMessaged(s, i) == false` for some `i`. Impossible: $s$ messaged *every* other account, so this call is always `true`. - $s$ is dropped as `i` when `hasMessaged(candidate, s) == true`. Impossible: $s$ received from *no one*, so this call is always `false`. Therefore $s$ is never eliminated. Since every account except the final `candidate` *was* eliminated, the survivor must be $s$. (Equivalently: any account that is not the survivor was ruled out by a concrete failing test, so it provably isn't a spammer.) **2. Verification is necessary.** Phase 1 emits a candidate unconditionally, including when the graph contains no spammer at all (e.g. everyone messaged everyone, or no one messaged anyone). Surviving elimination only means "no *other* account can be the spammer." It does **not** prove the survivor satisfies the definition. Phase 2 tests every required edge — all $n-1$ out-edges and all $n-1$ in-edges of the candidate — so passing is *exactly* the definition. This makes the algorithm complete: it returns a real spammer when one exists and `-1` otherwise. **3. Uniqueness — one candidate is enough.** At most one spammer can exist. If $x$ and $y$ were both spammers, $x$ must have messaged $y$ (spammers message everyone), so $y$ *received* a message — contradicting $y$ being a spammer. Hence verifying the single survivor is sufficient; we never need to test more than one candidate. --- ### Complexity | Phase | Oracle calls (worst case) | |-------|---------------------------| | Elimination | $n - 1$ | | Verification | $2(n - 1)$ | | **Total** | $3(n - 1) = 3n - 3 = O(n)$ | - **Oracle calls:** $3n - 3$ worst case — clearly $O(n)$, as required. - **Time:** $O(n)$ (each phase is a single linear pass). - **Extra space:** $O(1)$ — only the integer `candidate` and loop counters; no per-account bookkeeping. **Is $O(n)$ optimal?** It is optimal up to a constant factor. Any correct algorithm must make at least $n - 1$ calls: each account must be touched by at least one call to be ruled in or out, and there is no way to learn about an account without an oracle call involving it. So the lower bound is $\Omega(n)$ and our $3n - 3$ matches it asymptotically. (By contrast, the brute-force "check every ordered pair" approach costs $\Theta(n^2)$ calls.) The exact optimum for this celebrity-style problem is known to be tighter than $3n - 3$ — it is $3n - \lfloor \log_2 n \rfloor - 3$ calls — but closing that logarithmic gap does not change the $O(n)$ result, so the simple bound is the one worth presenting. **Optional micro-optimization — shaving Phase-2 calls.** Phase 1 establishes one fact about the *final* candidate $c$ that Phase 2 can safely reuse: - Every index $i$ probed *after* $c$ became the active candidate was kept by a `hasMessaged(c, i) == true` result — that is precisely how $c$ stayed the candidate through step $i$. Those `true` results are exactly the "sent to everyone" checks Phase 2 would repeat for those same indices, so for each such $i$ the forward probe `hasMessaged(c, j == i)` is redundant and can be skipped. - The other direction is *not* reusable: Phase 1 never tested any `hasMessaged(j, c)` (the "received from no one" checks), and Phase-1 results recorded while a *different* account was the active candidate tell us nothing about $c$. So every in-edge check in Phase 2 still has to run. In practice this only shaves the forward-direction probes for indices that followed $c$ in the scan, and the constant rarely matters for an $O(n)$ result, so the simple two-independent-passes version (`3n − 3`) is the one worth presenting; mention the reuse idea only if the interviewer pushes on tightening the constant. --- ### Edge cases - **`n == 0`:** no accounts exist → return `-1`. (Phase 1's `candidate = 0` would be invalid, so this guard must come first.) - **`n == 1`:** a lone account vacuously "messaged every *other* account" and "received from *no* other account," so it qualifies → return `0`. If the interviewer's convention requires at least one *other* account to message, return `-1` instead. This is a definition question worth surfacing explicitly rather than guessing. - **No spammer exists:** Phase 1 still yields a survivor; Phase 2 fails at least one check and we return `-1`. Covers "everyone messaged everyone," "no one messaged anyone," and every graph in between. - **Multiple survivors:** can't happen. Phase 1 maintains exactly one `candidate` throughout, and uniqueness guarantees at most one real spammer — so a single verification pass is always sufficient. - **Self-messages (`hasMessaged(i, i)`):** never probed. We skip `j == candidate` in Phase 2 and never compare an account to itself in Phase 1. A self-message must neither qualify nor disqualify a candidate, since the definition is about *other* accounts only.

Explanation

Two-phase elimination-then-verify, identical in structure to the celebrity problem with the messaging direction inverted. Phase 1 uses the invariant that any single hasMessaged(a,b) call eliminates exactly one candidate, reducing n accounts to one survivor in n-1 calls. Phase 2 verifies the survivor sent to everyone and received from no one. The rubric rewards: recognizing the celebrity-problem reduction, achieving O(n) calls / O(1) space, proving elimination soundness and verification necessity, and handling n in {0,1}, uniqueness, and self-messages.

Related Interview Questions

  • Validate hiring request under role constraints - HubSpot (medium)
  • Find a special person using knows(a,b) - HubSpot (easy)
  • Design and implement a bank account system - HubSpot (Medium)
  • Design file deduplication at scale - HubSpot (Medium)
  • Design a bank with scheduled payments and merges - HubSpot (Medium)
HubSpot logo
HubSpot
Aug 14, 2025, 12:00 AM
Software Engineer
Onsite
Coding & Algorithms
13
0
Question

You have n accounts labeled 0..n-1 and an oracle API:

boolean hasMessaged(a, b)   // true iff account a has sent at least one message to account b

Define a spammer as an account that has sent a message to every other account and has received messages from no other account. At most one such account can exist.

You may only call hasMessaged — you have no direct access to the underlying message graph.

  1. Core task. Implement findSpammer(n) that returns the spammer's index if one exists, otherwise returns -1 .
  2. Minimize oracle calls. Achieve O(n) calls to hasMessaged and O(1) extra space. State and justify the worst-case number of oracle calls.
  3. Correctness. Explain why the algorithm is correct — in particular, why the elimination phase cannot discard the true spammer, and why the verification phase is necessary.
  4. Complexity. Analyze time complexity (in oracle calls) and extra-space complexity.
  5. Edge cases. Handle n in {0, 1} , the possibility of multiple candidates surviving elimination, and self-messages ( hasMessaged(i, i) ).

Solution

Show

Submit Your Answer to Earn 20XP

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More HubSpot•More Software Engineer•HubSpot Software Engineer•HubSpot Coding & Algorithms•Software Engineer Coding & Algorithms
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.