Design a Real-Time Top-K Ranking System
Company: Uber
Role: Software Engineer
Category: Software Engineering Fundamentals
Difficulty: hard
Interview Round: Onsite
Design an object-oriented, real-time **Top-K ranking system**.
The system continuously receives score updates for a large set of entities — for example users, drivers, restaurants, or products. Each entity has a unique ID and a single numeric score. Your design should expose a clean object-oriented API and support efficient insertion, update, removal, and top-K retrieval as scores change over time.
The system must support the following operations:
1. `update(entity_id, new_score)` — Insert a new entity, or update the score of an existing entity.
2. `top_k(k)` — Return the `k` entities with the highest scores, ordered from highest to lowest. Ties must be broken deterministically — for example, by `entity_id` ascending after sorting by score descending.
3. `remove(entity_id)` — Remove an entity from the ranking.
### Constraints & Assumptions
- Entity IDs are unique; each entity has exactly one current score.
- Scores are numeric and may be updated arbitrarily often; assume an entity's score can go up or down.
- `top_k` is expected to be called frequently and should be fast relative to the total number of entities.
- The number of entities can be large (assume it does not all fit conveniently in a single linear scan per query), but it fits in memory on a single node for the core design.
- Tie-breaking must be deterministic and stable across calls.
- Treat the core design as in-memory and single-node first; concurrency and batch ingestion are addressed in later parts.
### Clarifying Questions to Ask
- What is the approximate scale — number of entities, update rate (writes/sec), and `top_k` query rate (reads/sec)? Is this read-heavy or write-heavy?
- What is the typical and maximum value of `k`? Is `k` small (e.g. a leaderboard top 10) or can it approach the total entity count?
- For `remove` and `top_k(k)` with `k` larger than the population, what is the API contract — error, no-op, or return whatever exists?
- Are scores integers or floating point, can they be negative, and is there a defined behavior for duplicate scores beyond the stated tie-break?
- Must reads reflect the very latest write (strong consistency), or is slightly stale top-K acceptable?
- Is this strictly in-memory for one process, or must rankings survive a restart / scale beyond one machine?
### What a Strong Answer Covers
This is a design and coding signal, not a single-algorithm puzzle. A strong answer demonstrates:
- A clear object-oriented API with the three operations and well-defined contracts for edge cases.
- A justified choice of **two cooperating data structures** — one for direct lookup, one for ordered retrieval — and why neither alone suffices.
- Correct, deterministic tie-breaking baked into the ordering key.
- Time and space complexity for each operation, and the trade-offs between candidate structures (balanced tree vs. heap vs. sorted array).
- How the update path keeps both structures consistent when an entity's score changes.
- A concurrency story that starts simple and scales (see the parts below).
- A batch-ingestion story that isolates bad records and stays observable.
- A concrete, organized test plan covering correctness, edge cases, and concurrency.
---
### Part 1 — Core data structures and operations
Propose the in-memory data structures and define the three operations (in code or precise pseudocode). Discuss the time and space complexity of `update`, `top_k`, and `remove`, and explain how you keep the structures consistent when an existing entity's score changes.
```hint Where to start
No single structure does both jobs well. You need **fast lookup by `entity_id`** ("does it exist, what's its current score?") *and* **fast retrieval in score order**. Think about pairing two complementary structures.
```
```hint Data structures
Consider a **hash map** (`entity_id -> score`) for O(1) lookup alongside an **ordered structure** (balanced BST / `TreeSet` / `std::set`, or a heap) keyed by score. For top-K specifically, weigh a fully ordered set against a size-bounded structure.
```
```hint Tie-breaking & the update path
The ordering key needs to encode **both** the score *and* the tie-breaker so that no two entries compare equal — what does sorting on score alone leave ambiguous? Then walk the update path carefully: when an existing entity's score changes, what does the ordered structure need to know *before* it can locate the stale entry, and where does that information live?
```
### Part 2 — Concurrency
Explain how the design changes if multiple threads call `update`, `remove`, and `top_k` concurrently. Describe how you keep the two structures mutually consistent under concurrent access, and how your approach scales as write throughput grows.
```hint Where to start
The two structures must be updated **atomically together** — a reader must never see the hash map and the ordered set disagree. Start by reasoning about the simplest correct synchronization.
```
```hint Scaling beyond one lock
A single read/write lock is simple and fine when reads dominate, but it serializes writes. To raise write throughput, think about **partitioning** entities across independent units and how you'd reassemble a global top-K from per-partition results.
```
### Part 3 — Large input batches with errors
Explain how to handle very large input batches in which some records may be malformed or fail during processing. The goal is to ingest the good records without letting a few bad ones fail the whole batch.
```hint
Don't make one bad record abort the batch. Think about **per-record validation**, routing failures somewhere (e.g. a dead-letter queue / error log with a reason), **idempotency** for safe retries, and per-batch **metrics** (processed / succeeded / failed). What anchors ordering if stale updates arrive out of order?
```
### Part 4 — Testing
Describe the tests you would write for correctness, edge cases, and concurrency.
```hint
Enumerate edge cases explicitly: empty ranking, `top_k(0)`, `top_k(k)` with `k` larger than the population, raising vs. lowering an existing score, duplicate scores (verify the deterministic tie-break), removing a non-existent entity, and negative/zero/large scores. For concurrency, think about how you'd assert correctness under interleaved readers and writers (e.g. invariant checks after a randomized concurrent workload).
```
---
### Follow-up Questions
- How does the design change if `k` can be as large as the entire entity set, versus a fixed small leaderboard (e.g. top 10)? Which data structure choice wins in each regime?
- What breaks first as write throughput climbs by 100x, and how does sharding change the cost and accuracy of a global `top_k`?
- If rankings must survive a process restart or scale across multiple machines, what would you add (persistence, partitioning, a coordinator), and what new consistency problems appear?
- How would you support a "time-windowed" top-K (e.g. highest scores in the last hour) where old contributions must expire?
Quick Answer: This question evaluates a candidate's ability to design efficient in-memory data structures and APIs for a real-time Top-K ranking service, address concurrency and fault-tolerant batch processing, and outline tests for correctness and edge cases.