Design real-time top-K products service

Last updated: Mar 29, 2026

Quick Overview

This question evaluates proficiency in real-time stream processing, stateful aggregation and ranking, event-time windowing, and distributed systems concerns such as fault tolerance, deduplication, and scalability.

Company: Amazon

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Technical Screen

Design a real-time service that ingests purchase events with fields (customer_id, item_id, timestamp) and continuously outputs the top-K most purchased items over a configurable window (e.g., last 5 minutes and last 24 hours). Specify data model, ingestion pipeline, aggregation strategy (e.g., keyed windows), algorithms/data structures for maintaining top-K at scale (e.g., heaps, count–min sketch + heap), handling late/duplicate events and out-of-order timestamps, state management and TTL, fault tolerance and exactly-once semantics, horizontal scalability, and APIs for querying current top-K.
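The "count–min sketch + heap" option named above can be pictured with the minimal Python sketch below. It is an illustration under stated assumptions, not a reference implementation: the class name `CountMinTopK` and the `width` and `depth` defaults are hypothetical, and a small dict scanned linearly stands in for the heap to keep the example short. Counts returned are estimates that can overstate the true count when hash cells collide.

```python
import hashlib
from typing import Dict, Iterator, List, Tuple


class CountMinTopK:
    """Count-Min Sketch for frequency estimates plus a bounded candidate set.

    Hypothetical helper for illustration; sizes are assumptions.
    """

    def __init__(self, k: int, width: int = 2048, depth: int = 4) -> None:
        self.k = k
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]
        # item_id -> estimated count; holds at most k entries.
        self.candidates: Dict[str, int] = {}

    def _cells(self, item_id: str) -> Iterator[Tuple[int, int]]:
        # One salted hash per row gives `depth` independent-looking columns.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item_id.encode("utf-8"), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item_id: str) -> None:
        estimate = None
        for row, col in self._cells(item_id):
            self.table[row][col] += 1
            value = self.table[row][col]
            # The estimate is the minimum over rows (never below the true count).
            estimate = value if estimate is None else min(estimate, value)

        if item_id in self.candidates or len(self.candidates) < self.k:
            self.candidates[item_id] = estimate
            return
        # Linear scan for the weakest candidate; a min-heap would avoid this.
        weakest = min(self.candidates, key=self.candidates.get)
        if estimate > self.candidates[weakest]:
            del self.candidates[weakest]
            self.candidates[item_id] = estimate

    def top_k(self) -> List[Tuple[str, int]]:
        # Highest estimate first; ties broken by item_id for stable output.
        return sorted(self.candidates.items(), key=lambda kv: (-kv[1], kv[0]))
```

Memory here is fixed at `width * depth` counters plus `k` candidates, regardless of how many distinct items appear, which is the main reason to prefer a sketch over exact per-item counts for very large catalogs.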

Amazon, Sep 6, 2025, 12:00 AM

System Design: Real-Time Top‑K Purchased Items Over Rolling Windows

Design a real-time service that ingests purchase events and continuously outputs the top‑K most purchased items over configurable rolling windows (for example: last 5 minutes and last 24 hours).

Input Event

Each purchase event contains:

  • customer_id
  • item_id
  • timestamp (event time)

Assume K and window sizes are configurable at runtime.
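As a starting point for the data model, the event and result records might look like the Python sketch below. Everything beyond the three field names is an assumption made for illustration: string identifiers, epoch-millisecond timestamps, the `TopKResult` shape, and the `dedup_key` helper. Because the event carries no unique purchase id, the sketch assumes the full (customer_id, item_id, timestamp) tuple identifies a purchase, which would collapse two genuine purchases of the same item by the same customer in the same millisecond.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PurchaseEvent:
    customer_id: str
    item_id: str
    timestamp_ms: int  # event time; epoch milliseconds is an assumed unit

    def dedup_key(self) -> Tuple[str, str, int]:
        # Assumption: no separate purchase id exists, so the whole tuple
        # serves as the identity used for duplicate detection.
        return (self.customer_id, self.item_id, self.timestamp_ms)


@dataclass(frozen=True)
class TopKResult:
    window_seconds: int  # e.g. 300 for 5 minutes, 86400 for 24 hours
    as_of_ms: int        # time up to which events are reflected
    items: Tuple[Tuple[str, int], ...]  # (item_id, count), highest count first
```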

Requirements

Specify and justify:

  1. Data model for events and results.
  2. Ingestion pipeline and partitioning.
  3. Aggregation/windowing strategy (e.g., event-time keyed windows, panes).
  4. Algorithms/data structures to maintain top‑K at scale (e.g., heaps, Space‑Saving, Count–Min Sketch + heap), including complexity and memory trade‑offs.
  5. Handling late, duplicate, and out‑of‑order events (watermarks, allowed lateness, deduplication).
  6. State management, TTL/retention, and backpressure control.
  7. Fault tolerance and exactly‑once processing semantics end‑to‑end.
  8. Horizontal scalability and hot‑key mitigation.
  9. APIs for querying the current top‑K for given windows (and update cadence/latency).
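For the late, duplicate, and out-of-order handling in item 5, one possible shape is sketched below in Python. It assumes a bounded-out-of-orderness watermark (largest event time seen minus a fixed delay), an allowed-lateness grace period after the watermark, and deduplication on the (customer_id, item_id, timestamp) tuple. The class name `LateAndDuplicateFilter` and its parameters are hypothetical, and the linear TTL sweep on every accepted event is for brevity only; a production system would more likely rely on the stream processor's keyed state with TTL.

```python
from typing import Dict, Optional, Tuple

DedupKey = Tuple[str, str, int]


class LateAndDuplicateFilter:
    """Drops events that are too late or already seen (illustrative sketch)."""

    def __init__(self, max_out_of_order_s: int, allowed_lateness_s: int) -> None:
        self.max_out_of_order_s = max_out_of_order_s
        self.allowed_lateness_s = allowed_lateness_s
        self.max_event_ts_s: Optional[int] = None
        # dedup key -> event time, kept only while the event could still recur
        self.seen: Dict[DedupKey, int] = {}

    def accept(self, customer_id: str, item_id: str, event_ts_s: int) -> bool:
        # Advance the high-water mark of event time.
        if self.max_event_ts_s is None or event_ts_s > self.max_event_ts_s:
            self.max_event_ts_s = event_ts_s

        # watermark = max event time - out-of-order bound;
        # events older than watermark - allowed lateness are dropped.
        horizon = (
            self.max_event_ts_s - self.max_out_of_order_s - self.allowed_lateness_s
        )
        if event_ts_s < horizon:
            return False  # too late to be counted

        key = (customer_id, item_id, event_ts_s)
        if key in self.seen:
            return False  # duplicate delivery
        self.seen[key] = event_ts_s

        # TTL sweep: keys older than the horizon can no longer be accepted,
        # so remembering them is unnecessary.
        for old in [k for k, ts in self.seen.items() if ts < horizon]:
            del self.seen[old]
        return True
```

Tying the dedup TTL to the lateness horizon keeps the seen-set bounded: any event older than the horizon is rejected as late, so its key never needs to be checked again.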

Make minimal, explicit assumptions as needed (e.g., target throughput, latency SLOs), and propose a design that works for both a short window (5m) and a long window (24h).
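One way to serve both the 5-minute and the 24-hour window from the same mechanism is to count into small tumbling panes and merge the panes that fall inside the window at query time. The Python sketch below illustrates that idea under assumptions: the `PanedWindowCounter` name, second-granularity timestamps, and the pane size are all hypothetical, and counts are exact per pane rather than sketched. Window boundaries are only accurate to one pane, since a pane is kept until it lies entirely before the cutoff.

```python
import heapq
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


class PanedWindowCounter:
    """Sliding-window counts built from fixed-size tumbling panes (sketch)."""

    def __init__(self, window_s: int, pane_s: int) -> None:
        if pane_s <= 0 or window_s % pane_s != 0:
            raise ValueError("window_s must be a positive multiple of pane_s")
        self.window_s = window_s
        self.pane_s = pane_s
        # pane start time (seconds) -> per-item counts within that pane
        self.panes: Dict[int, Counter] = defaultdict(Counter)

    def add(self, item_id: str, event_ts_s: int) -> None:
        # Bucket by event time, so out-of-order events land in the right pane.
        pane_start = event_ts_s - event_ts_s % self.pane_s
        self.panes[pane_start][item_id] += 1

    def expire(self, now_s: int) -> None:
        # Drop panes that end at or before the window's left edge.
        cutoff = now_s - self.window_s
        for start in [s for s in self.panes if s + self.pane_s <= cutoff]:
            del self.panes[start]

    def top_k(self, k: int, now_s: int) -> List[Tuple[str, int]]:
        self.expire(now_s)
        total: Counter = Counter()
        for counts in self.panes.values():
            total.update(counts)
        # Highest count first; ties broken by item_id for stable output.
        return heapq.nsmallest(k, total.items(), key=lambda kv: (-kv[1], kv[0]))
```

With 60-second panes, a 5-minute window merges about 5 panes and a 24-hour window about 1,440. A coarser pane for the long window would trade boundary precision for less state and cheaper merges.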
