Understanding a New Dataset: Profiling, Data Quality, and Visualization Plan
Scenario
You receive a new, unfamiliar dataset and must quickly generate insights and visualizations for business stakeholders on a tight timeline.
Task
Walk through the end-to-end steps you would take to:
- Understand what the dataset looks like (shape, schema, granularity, and contents).
- Assess data quality (completeness, correctness, consistency, and potential biases).
- Decide which visualizations to build to communicate key findings aligned with business goals.
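As a rough illustration of the first two steps, the sketch below profiles a small table and flags missing values and duplicate keys. It is a minimal sketch, not a prescribed method: the function name `profile`, the sample rows, and the column names (`order_id`, `region`, `amount`) are hypothetical, and it uses only the Python standard library so it runs anywhere. In practice a dataframe library would typically do the same job.

```python
from collections import Counter


def profile(rows, key_columns):
    """Summarize row count, columns, per-column missingness, and duplicate keys.

    `rows` is a list of dicts (one dict per record); `key_columns` names the
    columns assumed to identify a record uniquely.
    """
    # Union of all column names seen across rows, sorted for stable output.
    columns = sorted({col for row in rows for col in row})
    n_rows = len(rows)

    # Fraction of rows where each column is absent, None, or an empty string.
    missing_fraction = {}
    if n_rows:
        for col in columns:
            n_missing = sum(1 for row in rows if row.get(col) in (None, ""))
            missing_fraction[col] = n_missing / n_rows

    # Key values that occur more than once suggest duplicates or a wrong
    # assumption about the dataset's granularity.
    key_counts = Counter(tuple(row.get(k) for k in key_columns) for row in rows)
    duplicate_keys = {key: n for key, n in key_counts.items() if n > 1}

    return {
        "n_rows": n_rows,
        "columns": columns,
        "missing_fraction": missing_fraction,
        "duplicate_keys": duplicate_keys,
    }


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"order_id": 1, "region": "EU", "amount": 20.0},
    {"order_id": 2, "region": None, "amount": 35.5},
    {"order_id": 2, "region": "US", "amount": 35.5},
    {"order_id": 3, "region": "US", "amount": None},
]

report = profile(sample_rows, key_columns=["order_id"])
print(report)
```

A repeated key such as `order_id = 2` here is worth resolving before any aggregation, since it changes what one row is assumed to represent.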
Your answer should cover:
- Data profiling and schema understanding.
- Missing-value checks and other quality checks (duplicates, ranges, keys, outliers).
- Univariate and multivariate exploratory data analysis (EDA).
- Choosing chart types based on variable types (numeric, categorical, time) and business questions.
- Prioritizing visuals for a stakeholder readout and noting caveats.
- If applicable, guardrails for experimentation data (randomization checks, sample ratio mismatch, event ordering).
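
For the sample ratio mismatch guardrail in the last item, one common approach is a chi-square goodness-of-fit test on the observed group sizes. The sketch below is a minimal version under stated assumptions: a two-arm experiment, a known intended split, and a hypothetical function name `srm_p_value`. The counts are made up for illustration, and any alert threshold is a judgment call rather than something this prompt specifies.

```python
import math


def srm_p_value(control_n, treatment_n, expected_treatment_share=0.5):
    """Return the chi-square p-value for a two-arm sample ratio check.

    A small p-value means the observed split is unlikely under the intended
    allocation, which points to an assignment or logging problem.
    """
    total = control_n + treatment_n
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment

    # Pearson chi-square statistic with two cells (one degree of freedom).
    chi2 = (
        (control_n - expected_control) ** 2 / expected_control
        + (treatment_n - expected_treatment) ** 2 / expected_treatment
    )

    # For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts, for illustration only.
p_balanced = srm_p_value(5000, 5000)
p_skewed = srm_p_value(5000, 5300)
print(p_balanced, p_skewed)
```

If the p-value falls below the chosen threshold, the safer course is to investigate assignment and event logging before reading any treatment effect from the data.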