Problem
Design a query system that allows internal consumers (e.g., customer support, risk/fraud, data analysts) to retrieve a given user's search activity and order activity.
The query must support filtering by:
- Time: a time range (start/end)
- Geography: a location constraint (e.g., bounding box or radius around a point)
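As a minimal sketch of what the two filters could mean in practice, the Python below models a bounding box, a radius around a point, and a time-range check. The names `BoundingBox`, `Radius`, `haversine_km`, and `in_time_range` are hypothetical, and the half-open time interval (start inclusive, end exclusive) is an assumption; the problem statement does not fix either.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in kilometres between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical geo filter: a lat/lng rectangle."""
    min_lat: float
    min_lng: float
    max_lat: float
    max_lng: float

    def contains(self, lat: float, lng: float) -> bool:
        if not (self.min_lat <= lat <= self.max_lat):
            return False
        if self.min_lng <= self.max_lng:
            return self.min_lng <= lng <= self.max_lng
        # min_lng > max_lng means the box crosses the antimeridian (+/-180).
        return lng >= self.min_lng or lng <= self.max_lng


@dataclass(frozen=True)
class Radius:
    """Hypothetical geo filter: all points within `km` of a center point."""
    lat: float
    lng: float
    km: float

    def contains(self, lat: float, lng: float) -> bool:
        return haversine_km(self.lat, self.lng, lat, lng) <= self.km


def in_time_range(ts: int, start: int, end: int) -> bool:
    """Assumed convention: start inclusive, end exclusive (epoch millis)."""
    return start <= ts < end
```

Both filter types expose the same `contains(lat, lng)` method, so a caller could accept either one without branching on the filter kind.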
Data to Query
Assume two event types:
- Search events: (user_id, timestamp, query_text, location_lat/lng, device/session metadata)
- Order events: (user_id, order_id, timestamp, items/amount, shipping/delivery location, payment metadata)
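The two event shapes above could be written down as plain record types, as in the sketch below. The field names follow the tuples in the problem statement; the concrete types are assumptions made for illustration (string IDs, UTC `datetime` timestamps, the amount as an integer in minor currency units, and free-form dicts for metadata).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List


@dataclass(frozen=True)
class SearchEvent:
    user_id: str
    timestamp: datetime            # assumed to be timezone-aware UTC
    query_text: str
    location_lat: float
    location_lng: float
    # Device/session metadata kept as an opaque dict (assumption).
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class OrderEvent:
    user_id: str
    order_id: str
    timestamp: datetime            # assumed to be timezone-aware UTC
    items: List[str]
    amount_minor: int              # assumption: amount in minor units, e.g. cents
    delivery_lat: float
    delivery_lng: float
    # Payment metadata kept as an opaque dict (assumption).
    payment_metadata: Dict[str, Any] = field(default_factory=dict)
```

Both records carry `user_id`, `timestamp`, and a latitude/longitude pair, which are the only fields the required filters depend on.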
Required Capabilities
- Given (user_id, time_range, geo_filter), return matching search events and order events.
- Sort results by timestamp, and paginate.
- Support querying either event type separately or both together.
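The capabilities above can be illustrated with a small in-memory reference implementation. This is only a sketch of the expected semantics, not a serving design: `Event`, `query_events`, and the cursor format are hypothetical, ascending timestamp order is an assumption, and `event_id` is used as a tie-breaker so that pagination stays stable when two events share a timestamp.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Event:
    kind: str       # "search" or "order"
    event_id: str   # hypothetical unique ID, used as a sort tie-breaker
    user_id: str
    ts: int         # epoch milliseconds
    lat: float
    lng: float


Cursor = Tuple[int, str]  # (ts, event_id) of the last item on the previous page


def query_events(
    events: Iterable[Event],
    user_id: str,
    start_ts: int,
    end_ts: int,
    geo_filter: Callable[[float, float], bool],
    kinds: Sequence[str] = ("search", "order"),
    limit: int = 50,
    cursor: Optional[Cursor] = None,
) -> Tuple[List[Event], Optional[Cursor]]:
    """Return one page of matching events plus a cursor for the next page."""
    matches = [
        e for e in events
        if e.user_id == user_id
        and e.kind in kinds
        and start_ts <= e.ts < end_ts      # assumed half-open time range
        and geo_filter(e.lat, e.lng)
    ]
    # Sort by timestamp, then by event_id so the order is deterministic.
    matches.sort(key=lambda e: (e.ts, e.event_id))
    if cursor is not None:
        # Keyset pagination: resume strictly after the last item already seen.
        matches = [e for e in matches if (e.ts, e.event_id) > cursor]
    page = matches[:limit]
    has_more = len(matches) > limit
    next_cursor = (page[-1].ts, page[-1].event_id) if has_more else None
    return page, next_cursor
```

Passing `kinds=("order",)` or `kinds=("search",)` covers the single-type case, and the default covers the combined case. A cursor keyed on `(ts, event_id)` avoids the skipped or repeated rows that offset-based paging can produce when new events arrive between page requests.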
Non-Functional Requirements (state assumptions)
- Low-latency interactive queries (e.g., p95 < 1–2s for typical ranges like last 7–30 days).
- Handle a large volume of events (search events are high-cardinality and high-write; orders are lower volume but strongly consistent in the source of truth).
- Data is privacy-sensitive; enforce access controls and auditing.
Deliverables
Describe:
- APIs (request/response)
- Data model and indexing strategy for time + geo queries
- Storage choices and overall architecture (ingestion, serving, retention)
- Scalability, consistency, and operational considerations
- Key trade-offs and edge cases