Implement streaming CTR with deduplication

Last updated: Mar 29, 2026

Quick Overview

This question evaluates streaming data-processing and algorithmic design skills, specifically time-windowed aggregation, click deduplication, late-event handling, state management, and selection of data structures for per-campaign CTR computation.

  • Difficulty: Medium
  • Company: Roblox
  • Category: Coding & Algorithms
  • Role: Data Scientist

Interview Round: Technical Screen

Oct 13, 2025, 9:49 PM

Implement a Python function that computes a streaming, per-campaign click-through rate (CTR) over a sliding 24-hour window, with click deduplication and support for late-arriving events. Requirements:

  • Input: two iterators of dicts, each sorted in non-decreasing order by ts (an ISO 8601 string). Impressions have the form {"imp_id", "campaign_id", "ts"}; clicks have the form {"click_id", "imp_id", "campaign_id", "ts"}.
  • Dedup: count at most one click per imp_id (keep earliest click). Ignore clicks whose imp_id was never seen.
  • Sliding window: maintain CTR for each campaign over the last 24 hours at every new event. Late events may arrive up to 10 minutes late; include them if they fall within the 24-hour window based on their ts.
  • Performance: O(log n) amortized updates per event, memory proportional to events within the last 24 hours. Provide big-O justification.
  • Output: an iterator of tuples (ts, campaign_id, impressions_in_window, dedup_clicks_in_window, ctr_in_window) emitted whenever the metric changes for that campaign.
  • Edge cases: multiple clicks referencing the same imp_id, clock skew between streams, and daylight saving transitions.
  • Explanation: provide a brief explanation of your data structures (e.g., heaps/queues + hash maps) and how you handle late events and expirations.
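
One possible approach to the requirements above is sketched below. It is a minimal illustration under stated assumptions, not a reference solution. The names parse_ts, streaming_ctr, and WINDOW are hypothetical. The sketch assumes that naive timestamps are UTC, that the window is the half-open interval (watermark - 24h, watermark] where the watermark is the largest ts seen so far, that a click is attributed to its impression's campaign_id, and that clicks for impressions that have already left the window are ignored. Dedup keeps the first click encountered for an imp_id, which is the earliest click only when arrival order matches ts order; a fuller solution might hold clicks for the 10-minute lateness bound before committing. Converting every timestamp to UTC sidesteps daylight saving transitions, because the window is measured in elapsed time.

```python
import heapq
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=24)  # hypothetical constant for the sliding window

def parse_ts(ts):
    """Parse an ISO 8601 string into an aware UTC datetime.
    Assumption: a timestamp without an offset is already in UTC."""
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)

def streaming_ctr(impressions, clicks):
    # Tag events (0 = impression, 1 = click) and merge the two streams by
    # (ts, kind) so an impression is processed before a click with the same ts.
    imps_tagged = ((parse_ts(e["ts"]), 0, e) for e in impressions)
    clicks_tagged = ((parse_ts(e["ts"]), 1, e) for e in clicks)
    merged = heapq.merge(imps_tagged, clicks_tagged, key=lambda t: t[:2])

    imp_campaign = {}            # imp_id -> campaign_id, impressions in window
    clicked = set()              # imp_ids that already have a counted click
    n_imps = defaultdict(int)    # campaign_id -> impressions in window
    n_clicks = defaultdict(int)  # campaign_id -> dedup clicks in window
    expiry = []                  # min-heap of (ts, seq, kind, imp_id, campaign)
    watermark, seq = None, 0

    for ts, kind, ev in merged:
        watermark = ts if watermark is None else max(watermark, ts)
        cutoff = watermark - WINDOW
        changed = {}             # insertion-ordered set of affected campaigns

        # Expire everything at or before the cutoff. The heap tolerates
        # out-of-order (late) inserts, unlike a plain FIFO queue.
        while expiry and expiry[0][0] <= cutoff:
            _, _, old_kind, old_imp, camp = heapq.heappop(expiry)
            if old_kind == 0:
                n_imps[camp] -= 1
                imp_campaign.pop(old_imp, None)
                clicked.discard(old_imp)
            else:
                n_clicks[camp] -= 1
            changed[camp] = None

        # Apply the new event only if its own ts is still inside the window.
        if ts > cutoff:
            imp_id, camp = ev["imp_id"], None
            if kind == 0 and imp_id not in imp_campaign:
                camp = ev["campaign_id"]
                imp_campaign[imp_id] = camp
                n_imps[camp] += 1
            elif kind == 1 and imp_id in imp_campaign and imp_id not in clicked:
                camp = imp_campaign[imp_id]
                clicked.add(imp_id)
                n_clicks[camp] += 1
            if camp is not None:
                seq += 1         # unique tie-breaker for equal timestamps
                heapq.heappush(expiry, (ts, seq, kind, imp_id, camp))
                changed[camp] = None

        # Emit one tuple per campaign whose counts changed on this event.
        for camp in changed:
            i, c = n_imps[camp], n_clicks[camp]
            yield (watermark.isoformat(), camp, i, c, c / i if i else 0.0)
```

In this sketch every counted event is pushed onto the heap once and popped once, so the work per event is O(log n) amortized, where n is the number of events currently in the window. The hash maps, the set, and the heap only hold in-window events, apart from the per-campaign counters, which keep one entry per campaign seen. Because clicks are windowed by their own ts, a click can briefly outlive its impression in the counts; attributing clicks to the impression's ts instead would be a reasonable alternative design.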
