Task: Quickly Understand an Undocumented Events+Purchases CSV and Produce Executive-Ready Visuals
Context
You are handed a single CSV that mixes user events and purchases for a launch week. The schema is undocumented. Columns:
- order_id, user_id, event_ts (UTC), merchant_id, session_id, event_type (view/add_to_cart/purchase), amount_usd, device_type, country
Your goal, in about 30 minutes, is to sanity-check the data, outline the key analyses, and design concise, decision-oriented visuals.
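As an illustration of what a first sanity pass over these columns might look like, here is a minimal Python sketch using only the standard library. The sample rows, the ISO timestamp format, the treatment of empty strings as nulls, and the function name `sanity_checks` are all assumptions made for the example. None of them are given by the task.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample using the column names listed above.
# Real data would be read from the delivered CSV file instead.
SAMPLE = """order_id,user_id,event_ts,merchant_id,session_id,event_type,amount_usd,device_type,country
,u1,2024-03-04T10:00:00,m1,s1,view,,mobile,US
,u1,2024-03-04T10:01:00,m1,s1,add_to_cart,,mobile,US
o1,u1,2024-03-04T10:02:00,m1,s1,purchase,19.99,mobile,US
o1,u1,2024-03-04T10:02:00,m1,s1,purchase,19.99,mobile,US
o2,u2,2024-03-04T11:05:00,m2,s2,purchase,,desktop,DE
,u2,2024-03-04T11:00:00,m2,s2,view,,desktop,DE
"""


def sanity_checks(rows):
    """Return simple data-quality counts for a list of row dicts."""
    report = {}
    # Empty-string counts per column (assumed to be how nulls appear).
    report["nulls"] = {c: sum(1 for r in rows if r[c] == "") for c in rows[0]}
    # Exact duplicate rows beyond the first occurrence.
    seen = Counter(tuple(r.values()) for r in rows)
    report["duplicate_rows"] = sum(n - 1 for n in seen.values())
    # Purchases with no amount, and non-purchases that carry one.
    report["purchase_missing_amount"] = sum(
        1 for r in rows if r["event_type"] == "purchase" and r["amount_usd"] == ""
    )
    report["non_purchase_with_amount"] = sum(
        1 for r in rows if r["event_type"] != "purchase" and r["amount_usd"] != ""
    )
    # Timestamps that go backwards within a session, in file order.
    last_ts, backwards = {}, 0
    for r in rows:
        ts = datetime.fromisoformat(r["event_ts"])
        sid = r["session_id"]
        if sid in last_ts and ts < last_ts[sid]:
            backwards += 1
        last_ts[sid] = ts
    report["out_of_order_events"] = backwards
    # Distinct values per categorical column.
    report["cardinality"] = {
        c: len({r[c] for r in rows}) for c in ("event_type", "device_type", "country")
    }
    return report


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
report = sanity_checks(rows)
for name, value in report.items():
    print(name, value)
```

Each entry in the report maps to one candidate check, so a non-zero count is a prompt to inspect the offending rows before charting anything.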
Requirements
- Data understanding: List the first 10 checks you run (e.g., nulls, duplicates, timestamp monotonicity by session, timezone sanity, categorical cardinality, outliers, unit consistency, referential integrity between event_type = "purchase" and amount_usd, weekend/weekday patterns, country/device coverage).
- Visual plan: Propose 3–5 specific charts (titles, axes, grain) to answer “What is happening?” and “So what?”. Justify each choice and expected insight.
- Granularity: Choose daily vs hourly aggregation for a launch week; defend trade-offs and how you’d switch with a parameter.
- Data quality: Show how one bad clock-skew day would appear in your visuals and how you’d annotate/adjust.
- Deliverable: Describe a one-slide dashboard wireframe (sections, KPIs, filters) and how you’d validate it with a stakeholder in a 5‑minute readout.
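For the granularity and data-quality requirements, one possible approach is sketched below in standard-library Python. It shows a single `grain` parameter that switches between daily and hourly buckets, plus a per-day indicator that could surface a clock-skew day. The event tuples, the function names, and the choice of "purchase timestamped before the session's first view" as the skew symptom are all assumptions for illustration. The task itself does not prescribe a detection rule.

```python
from collections import Counter, defaultdict
from datetime import datetime


def bucket(ts, grain):
    """Truncate a datetime to the start of its day or hour."""
    if grain == "daily":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if grain == "hourly":
        return ts.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unknown grain: {grain!r}")


def count_events(events, grain="daily"):
    """Count events per (bucket, event_type); `grain` is the only switch."""
    return Counter((bucket(ts, grain), etype) for ts, _sid, etype in events)


def inverted_session_share(events):
    """Per day, the share of sessions whose purchase precedes their first view."""
    first = defaultdict(dict)  # session_id -> {event_type: earliest timestamp}
    for ts, sid, etype in events:
        if etype not in first[sid] or ts < first[sid][etype]:
            first[sid][etype] = ts
    total, inverted = Counter(), Counter()
    for seen in first.values():
        # Only sessions with both a view and a purchase can be compared.
        if "purchase" in seen and "view" in seen:
            day = seen["purchase"].date()
            total[day] += 1
            if seen["purchase"] < seen["view"]:
                inverted[day] += 1
    return {day: inverted[day] / total[day] for day in total}


# Hypothetical events as (event_ts, session_id, event_type) tuples.
events = [
    (datetime(2024, 3, 4, 10, 0), "s1", "view"),
    (datetime(2024, 3, 4, 10, 5), "s1", "purchase"),
    (datetime(2024, 3, 4, 11, 0), "s2", "view"),
    (datetime(2024, 3, 5, 9, 0), "s3", "view"),
    (datetime(2024, 3, 5, 8, 0), "s3", "purchase"),  # purchase before view
]

daily = count_events(events, "daily")
hourly = count_events(events, "hourly")
skew = inverted_session_share(events)
print(len(daily), len(hourly), skew)
```

A day whose inverted share stands well above the rest is a candidate for a chart annotation. Whether to shift its timestamps or simply exclude it is a judgment call to raise with the stakeholder.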