You need to design a real-time pipeline that ingests website click events via Kafka, processes them using Apache Flink, and writes queryable aggregates to a data warehouse or lakehouse for downstream analytics.
Assume the business wants near-real-time (<1 minute) aggregate metrics (e.g., page views per URL, unique users, funnels) with correctness guarantees suitable for critical decision-making. Click events are append-only and can arrive out of order.
Describe the end-to-end design, addressing:
Keep the design practical and call out trade-offs and key configuration choices.
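To make the out-of-order requirement concrete, here is a minimal plain-Python sketch of event-time tumbling windows closed by a watermark. It is not Flink code and not a reference answer. The names `WINDOW_MS`, `MAX_OUT_OF_ORDER_MS`, and `aggregate`, the 5-second lateness bound, and the `(event_time_ms, url, user_id)` event shape are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_MS = 60_000           # one-minute tumbling windows (assumed)
MAX_OUT_OF_ORDER_MS = 5_000  # assumed bound on how late an event may arrive


def aggregate(events):
    """Yield (window_start, url, page_views, unique_users) as windows close.

    `events` is an iterable of (event_time_ms, url, user_id) in arrival order,
    which may differ from event-time order.
    """
    # window_start -> url -> [view_count, set_of_user_ids]
    open_windows = defaultdict(lambda: defaultdict(lambda: [0, set()]))
    watermark = float("-inf")

    for ts, url, user in events:
        window_start = ts - ts % WINDOW_MS
        if window_start + WINDOW_MS <= watermark:
            # The window was already emitted; a real job might route this
            # event to a late-data side output instead of dropping it.
            continue

        state = open_windows[window_start][url]
        state[0] += 1
        state[1].add(user)

        # The watermark trails the largest event time seen so far.
        watermark = max(watermark, ts - MAX_OUT_OF_ORDER_MS)

        # Emit every window whose end is at or before the watermark.
        closed = sorted(w for w in open_windows if w + WINDOW_MS <= watermark)
        for start in closed:
            for u, (views, users) in sorted(open_windows.pop(start).items()):
                yield (start, u, views, len(users))

    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        for u, (views, users) in sorted(open_windows[start].items()):
            yield (start, u, views, len(users))
```

The lateness bound is the main trade-off in this sketch: a larger value admits more stragglers into the correct window but delays every result by the same amount, which eats into the sub-minute latency budget.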