PracHub

Design streaming mention analytics with search and alerts

Last updated: May 18, 2026

Quick Overview

This question evaluates system-design competencies such as real-time stream ingestion, stateful stream processing and time-windowed aggregation, high-throughput search indexing, alerting and subscription mechanisms, storage and retention strategy, and trade-offs around scalability, latency, consistency, and cost.


Company: Bloomberg

Role: Software Engineer

Category: System Design

Difficulty: hard

Interview Round: Onsite


Mar 1, 2026, 12:00 AM

Scenario

You ingest a real-time external stream of social-media posts and news articles. Each item contains raw text and metadata (timestamp, source, author/site, etc.). The product tracks companies/stocks ("entities") and shows:

  1. Mention analytics: how many times each entity was mentioned over time (similar to impressions/mentions).
  2. Charts by time window: users can choose time spans from 30 minutes to multiple days (tumbling or sliding windows are both acceptable).
  3. Latency: charts may be delayed by 10–30 minutes, but data must be aggregated before display.
  4. Subscriptions & notifications: users can follow a set of entities, filter analytics to followed entities, and configure alerts (e.g., spike in mentions).
  5. Search:
    • Users can search across hundreds of thousands of entities (by company/stock name).
    • Users can also search for the underlying documents (posts/articles) that mention entities.
    • Search supports any number of keywords and filtering (e.g., entity, time range, source).
    • Search load can be very high (e.g., ~100k RPS).
  6. Spiky traffic: must handle extreme bursts (breaking news, meme-stock events).
  7. Storage choices: decide how to store both processed/aggregated data and raw documents.
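
To make requirement 2 concrete, here is a minimal sketch of one possible approach, not a prescribed solution: count mentions in small tumbling buckets, then answer any larger chart span by summing whole buckets. The 5-minute bucket size and the names `bucket_start`, `aggregate`, and `window_count` are assumptions made for illustration; the prompt does not specify them.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumed base granularity; the prompt only says spans start at 30 minutes.
BUCKET = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts: datetime) -> datetime:
    """Floor a timezone-aware timestamp to the start of its tumbling bucket."""
    return EPOCH + ((ts - EPOCH) // BUCKET) * BUCKET


def aggregate(mentions):
    """Count mentions per (entity, bucket); `mentions` yields (entity, timestamp)."""
    counts = defaultdict(int)
    for entity, ts in mentions:
        counts[(entity, bucket_start(ts))] += 1
    return counts


def window_count(counts, entity, end: datetime, span: timedelta) -> int:
    """Sum the complete buckets for `entity` in the `span` ending at `end`.

    The bucket that contains `end` is excluded because it may still be filling.
    """
    hi = bucket_start(end)
    lo = hi - span
    return sum(c for (e, b), c in counts.items() if e == entity and lo <= b < hi)


def at(hour: int, minute: int) -> datetime:
    """Hypothetical helper that builds sample timestamps on one day."""
    return datetime(2026, 3, 1, hour, minute, tzinfo=timezone.utc)


sample = [("ACME", at(12, 1)), ("ACME", at(12, 4)), ("ACME", at(12, 7)),
          ("ACME", at(12, 31)), ("XYZ", at(12, 2))]
counts = aggregate(sample)
last_half_hour = window_count(counts, "ACME", at(12, 30), timedelta(minutes=30))
```

Because the small buckets are additive, the same stored counts can serve a 30-minute chart and a multi-day chart. Moving the window end forward one bucket at a time gives a sliding window without reprocessing raw documents.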

Task

Design a high-level system (APIs, data flow, storage, and scaling strategy) that satisfies the above requirements. Clearly explain:

  • How raw streaming data is ingested, processed, and aggregated.
  • How time-windowed analytics are computed and served.
  • How document/entity search works at high QPS.
  • How subscriptions and alerting are implemented.
  • How the system remains reliable and cost-effective under spikes.

State assumptions and key trade-offs (e.g., consistency, latency, storage format, retention).
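
As an illustration of the "spike in mentions" alert in requirement 4, the sketch below flags a spike when the latest bucket count exceeds the recent mean by a multiple of the standard deviation, then looks up who follows the entity. The rule itself, the thresholds `k` and `min_count`, and the names `is_spike` and `users_to_notify` are all assumptions; the prompt leaves the alert definition open.

```python
from statistics import mean, pstdev


def is_spike(history, current, k=3.0, min_count=20):
    """Return True if `current` is far above the recent baseline.

    history   -- mention counts from preceding buckets for one entity
    k         -- assumed sensitivity: how many standard deviations above the mean
    min_count -- assumed floor so that low-volume entities do not alert on noise
    """
    if len(history) < 2 or current < min_count:
        return False
    baseline = mean(history)
    # Clamp the deviation so a perfectly flat history does not alert on +1.
    spread = max(pstdev(history), 1.0)
    return current > baseline + k * spread


def users_to_notify(followers, entity, history, current):
    """Return the sorted followers of `entity` if its latest count is a spike."""
    if not is_spike(history, current):
        return []
    return sorted(followers.get(entity, ()))


# Hypothetical subscription data: entity -> set of user ids.
followers = {"ACME": {"u2", "u1"}, "XYZ": {"u3"}}
recent = [10, 12, 9, 11, 10]
to_alert = users_to_notify(followers, "ACME", recent, 40)
```

Evaluating the rule on already-aggregated bucket counts keeps alerting within the same 10–30 minute delay allowed for charts. A real design would also need to de-duplicate alerts while a spike persists.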
