
Design Top K ranking system

Last updated: Apr 8, 2026

Quick Overview

This question evaluates understanding of real-time stream processing, distributed systems scalability, and frequency-estimation data structures for continuously computing Top-K under high throughput, sliding windows, grouping keys, and bounded staleness.


Company: LinkedIn

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite

Question

Design a system that can continuously identify the Top K elements from a large or streaming dataset. Discuss data structures, scalability considerations, update handling, and how to support high-throughput queries.

Jul 29, 2025, 8:05 AM

System Design: Real-time Top-K from a Large/Streaming Dataset

Context

You receive a continuous, high-volume stream of events, each referencing an item (e.g., item_id). The system must continuously identify the Top K most frequent items and serve low-latency queries. Assume:

  • Data volume is large (potentially millions of events per second), item cardinality can be high, and K is small (e.g., 10–1,000).
  • Queries may request Top K for different time windows (e.g., last 1 minute, 1 hour, 1 day) and potentially by a grouping key (e.g., per region, per tenant).
  • Results should be near-real-time with bounded staleness.
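To make the window and grouping assumptions concrete, here is a minimal single-process sketch in Python that keeps exact counts in fixed-width time buckets per grouping key and sums the buckets that overlap the requested window. The class name WindowedTopK, the bucket_seconds parameter, and the method names are hypothetical and chosen only for illustration; they are not part of the question. Because whole buckets are included, the window edge is only accurate to one bucket width, which is one way the "bounded staleness" budget could be spent.

```python
import heapq
from collections import Counter, defaultdict


class WindowedTopK:
    """Exact per-group counts kept in fixed-width time buckets (illustrative helper)."""

    def __init__(self, bucket_seconds=60):
        self.bucket_seconds = bucket_seconds
        # group -> bucket start time -> Counter of item_id
        self.buckets = defaultdict(lambda: defaultdict(Counter))

    def add(self, group, item_id, ts):
        # Late or out-of-order events simply land in the bucket their timestamp belongs to.
        bucket = int(ts // self.bucket_seconds) * self.bucket_seconds
        self.buckets[group][bucket][item_id] += 1

    def expire(self, now, max_window_seconds):
        # Drop buckets that end at or before the start of the longest supported window.
        cutoff = now - max_window_seconds
        for per_group in self.buckets.values():
            stale = [b for b in per_group if b + self.bucket_seconds <= cutoff]
            for b in stale:
                del per_group[b]

    def top_k(self, group, k, now, window_seconds):
        # Sum every bucket that overlaps the interval (now - window_seconds, now].
        cutoff = now - window_seconds
        total = Counter()
        for start, counts in self.buckets.get(group, {}).items():
            if start + self.bucket_seconds > cutoff and start <= now:
                total.update(counts)
        return heapq.nlargest(k, total.items(), key=lambda kv: kv[1])
```

A query such as top_k("us", 10, now, 3600) then costs time proportional to the number of buckets in the window times the distinct items per bucket, which is why a real design would likely pre-aggregate or cache results for the common windows.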

Requirements

Design a system that:

  1. Ingests a large/streaming dataset and continuously identifies the Top K elements.
  2. Chooses suitable data structures for exact and approximate solutions.
  3. Scales horizontally across shards/partitions.
  4. Handles updates: insertions, deletions/expirations (e.g., sliding windows), out-of-order/late events.
  5. Supports high-throughput, low-latency queries (read path), including caching/materialization.
  6. Discusses consistency, fault tolerance, and operational considerations.
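For requirement 2, one commonly discussed approximate structure is the Space-Saving summary, which tracks a fixed number of counters regardless of item cardinality. The sketch below is a simplified illustration, not a reference implementation: the class name, the capacity parameter, and the linear scan for the minimum counter are assumptions made for brevity (a production version would typically use a structure that finds the minimum in constant time). With m counters and N events seen, each estimate is never below the true count and exceeds it by at most N/m.

```python
class SpaceSaving:
    """Approximate heavy-hitter summary with a fixed number of counters (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}  # item -> estimated count (never an underestimate)
        self.errors = {}  # item -> maximum possible overestimate for that item

    def add(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
            self.errors[item] = 0
        else:
            # Evict the item with the smallest estimate; the newcomer inherits
            # that count, which becomes its possible overestimate.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[item] = floor + 1
            self.errors[item] = floor

    def top_k(self, k):
        # Returns (item, estimated_count, max_overestimate) tuples, largest first.
        ranked = sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)
        return [(item, count, self.errors[item]) for item, count in ranked[:k]]
```

An exact alternative is a hash map of counts plus a size-K heap, which costs memory proportional to the number of distinct items; the summary above trades that memory for a bounded overestimate, and the reported error term lets a caller tell which ranks are guaranteed.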

Provide the design, trade-offs, and key algorithms/data structures. Include complexity and accuracy considerations.
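As one illustration of the horizontal-scaling requirement, the following Python sketch assumes events are partitioned by item_id, so each item's full count lives on exactly one shard. Under that assumption the global Top K is contained in the union of the per-shard Top K lists, and the merge is exact. The function names and the CRC32-based routing are hypothetical choices for this example. If events were instead partitioned by some other key, per-shard counts would be partial and would need to be summed before ranking, so truncating each shard to K could miss items.

```python
import heapq
import zlib
from collections import Counter


def shard_for(item_id, num_shards):
    # Stable hash so the same item always routes to the same shard
    # (Python's built-in hash() for str is randomized per process).
    return zlib.crc32(item_id.encode("utf-8")) % num_shards


def local_top_k(counts, k):
    # Each shard reports only its k largest (item, count) pairs.
    return heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])


def merge_top_k(shard_results, k):
    # With S shards, the merge looks at no more than S * k candidates.
    candidates = [pair for result in shard_results for pair in result]
    return heapq.nlargest(k, candidates, key=lambda kv: kv[1])


def sharded_top_k(events, num_shards, k):
    # Simulates the ingest path in one process: route, count per shard, then merge.
    shards = [Counter() for _ in range(num_shards)]
    for item_id in events:
        shards[shard_for(item_id, num_shards)][item_id] += 1
    return merge_top_k([local_top_k(c, k) for c in shards], k)
```

The merge step is small enough to run on every query or to be materialized periodically into a cache, which is one way to approach the read-path requirement.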
