Create OHLC Aggregates from Tick Data in Python
Company: Robinhood
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Onsite
price_stream
+-----------+-------+
| timestamp | price |
+-----------+-------+
| 0 | 3 |
| 1 | 2 |
| 2 | 4 |
| 3 | 10 |
| 8 | 11 |
+-----------+-------+
##### Scenario
A streaming trading application receives tick data as "price:timestamp" pairs. You must generate 10-second OHLC aggregates and forward-fill any gaps.
##### Question
Write a Python function that
parses an input string like "3:0,2:1,4:2,10:3,10:4,10:5,10:6,10:7,10:8,10:9,10:10,11:8";
buckets rows into [0, 10), [10, 20), … intervals by timestamp;
for every interval outputs first_price, last_price, min_price, max_price;
if an interval has no rows, copy the previous interval’s last_price into every statistic for the missing bucket.
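The first two requirements (parsing and bucketing) can be sketched on their own. This is a minimal illustration, not a reference solution: the helper names `parse_ticks` and `bucket_bounds` are hypothetical, and it assumes prices and timestamps are non-negative integers, as in the sample input.

```python
def parse_ticks(raw: str) -> list[tuple[int, int]]:
    """Parse "price:timestamp,..." into (timestamp, price) pairs in arrival order."""
    ticks = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input or a trailing comma
        price_text, ts_text = item.split(":")
        # Assumption: both fields are integers, as in the sample string.
        ticks.append((int(ts_text), int(price_text)))
    return ticks


def bucket_bounds(timestamp: int, width: int = 10) -> tuple[int, int]:
    """Return the half-open interval [start, end) that contains the timestamp."""
    start = (timestamp // width) * width
    return start, start + width
```

Note that a timestamp of exactly 10 belongs to [10, 20), not [0, 10), because the intervals are closed on the left and open on the right.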
##### Hints
Use floor(timestamp / 10) to find a bucket, keep a running dict {bucket: [first, last, min, max]}, and track the last seen price for forward-fill.
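One way to put these hints together is sketched below. It is an illustration under stated assumptions rather than the expected answer: the function name `ohlc_aggregates` and the `start`/`end` output keys are hypothetical, prices and timestamps are assumed to be integers, "first" and "last" follow arrival order in the input string (as in a stream), and forward-fill only covers gaps between the earliest and latest observed buckets.

```python
def ohlc_aggregates(raw: str, width: int = 10) -> list[dict]:
    """Aggregate "price:timestamp" ticks into fixed-width OHLC buckets."""
    # bucket index -> [first, last, min, max]
    buckets: dict[int, list[int]] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty fragments
        price_text, ts_text = item.split(":")
        price, ts = int(price_text), int(ts_text)
        b = ts // width  # floor(timestamp / width) for non-negative ints
        if b not in buckets:
            buckets[b] = [price, price, price, price]
        else:
            stats = buckets[b]
            stats[1] = price                 # last seen in arrival order
            stats[2] = min(stats[2], price)
            stats[3] = max(stats[3], price)

    if not buckets:
        return []

    rows = []
    prev_last = None
    for b in range(min(buckets), max(buckets) + 1):
        if b in buckets:
            first, last, low, high = buckets[b]
        else:
            # Empty interval: copy the previous interval's last_price everywhere.
            first = last = low = high = prev_last
        rows.append({
            "start": b * width,
            "end": (b + 1) * width,
            "first_price": first,
            "last_price": last,
            "min_price": low,
            "max_price": high,
        })
        prev_last = last
    return rows
```

With the sample string from the question, the out-of-order tick `11:8` arrives last, so under the arrival-order assumption the [0, 10) bucket ends with a last_price of 11. Sorting the ticks by timestamp before aggregating would give 10 instead, so it is worth asking the interviewer which convention is intended.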
Quick Answer: This question evaluates a candidate's ability to perform time-series bucketing and OHLC aggregation from tick data, testing skills in data parsing, stateful aggregation, and handling missing-interval forward-filling; it falls under Data Manipulation (SQL/Python) and assesses practical application rather than purely conceptual understanding.