Pandas Data Manipulation

What's being tested

Pandas data manipulation here means turning messy local CSVs into trustworthy analytical tables: read, normalize, join, aggregate, pivot, and validate the results. Interviewers probe whether you can prevent silent analytical errors, especially row-duplicating joins, mishandled nulls, date-parsing mistakes, and brittle preprocessing code.

Patterns & templates

Robust CSV ingestion with `pd.read_csv()`: set `dtype`, `parse_dates`, `encoding`, `sep`, and `usecols` explicitly; inspect `shape`, `head()`, and `isna().mean()` before merging.
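Safe joins with `df.merge(..., how=..., on=..., validate=...)`: use `validate='one_to_one'`, `'many_to_one'`, or `'one_to_many'` so pandas raises a `MergeError` when the join keys are not unique on the expected side, instead of silently multiplying rows.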
Metric aggregation via `groupby().agg()`: compute revenue, impressions, clicks, or conversions at the correct grain before calculating ratios like CTR = clicks / impressions.
Pivoted reporting with `pivot_table(index=..., columns=..., values=..., aggfunc='sum', fill_value=0)`: confirm totals match the pre-pivot `groupby` output.
Date handling with `pd.to_datetime(errors='coerce')`, `.dt.to_period()`, `sort_values()`, and `ffill()`; decide explicitly whether a gap in the calendar is a truly missing observation or simply a period with no activity.
Null and coalescing logic using `fillna()`, `combine_first()`, `where()`, and `np.select()`; distinguish missing data from valid zeros in metrics.
Code robustness: factor transformations into pure functions, add input validation, unit-test edge cases with `pytest`, and document expected schema and complexity.
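
As a minimal sketch of the ingestion and validation patterns above, the block below reads a small in-memory CSV with explicit dtypes, coerces bad dates to `NaT`, and runs a few schema checks in a pure function. The column names, the `load_events` and `validate_events` helpers, and the sample data are hypothetical, chosen only for illustration.

```python
import io

import pandas as pd

# Hypothetical schema for illustration; real column names will differ.
EXPECTED_COLUMNS = ["event_date", "campaign_id", "clicks", "impressions"]


def load_events(source) -> pd.DataFrame:
    """Read a CSV with explicit dtypes and coerce unparseable dates to NaT."""
    df = pd.read_csv(
        source,
        usecols=EXPECTED_COLUMNS,
        dtype={
            "event_date": "string",
            "campaign_id": "string",
            "clicks": "Int64",       # nullable integer keeps missing values as <NA>
            "impressions": "Int64",
        },
    )
    df["event_date"] = pd.to_datetime(
        df["event_date"], format="%Y-%m-%d", errors="coerce"
    )
    return df


def validate_events(df: pd.DataFrame) -> list:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if df["event_date"].isna().any():
        problems.append("unparseable or missing event_date")
    # Comparisons against <NA> yield <NA>; treat those rows as not violating the rule.
    if (df["clicks"] > df["impressions"]).fillna(False).any():
        problems.append("clicks exceed impressions")
    return problems


# Made-up sample: one bad date, one missing metric, one impossible row.
sample_csv = io.StringIO(
    "event_date,campaign_id,clicks,impressions\n"
    "2024-01-01,a,5,100\n"
    "not-a-date,a,3,\n"
    "2024-01-03,b,7,4\n"
)

events = load_events(sample_csv)
null_share = events.isna().mean()   # fraction of missing values per column
problems = validate_events(events)
print(events.shape)
print(null_share)
print(problems)
```

Because `validate_events` takes a DataFrame and returns a plain list, it can be called from a `pytest` test with a tiny hand-built frame for each edge case.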

Common pitfalls

Pitfall: Merging raw fact tables before aggregation can create duplicate rows and inflate revenue, clicks, or impressions.
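
The inflation is easy to reproduce. The following sketch uses two made-up fact tables (`orders` and `clicks`, both hypothetical) to compare a raw merge against aggregating each table to the campaign grain first and then merging with `validate='one_to_one'`.

```python
import pandas as pd

# Hypothetical fact tables, each with several rows per campaign.
orders = pd.DataFrame(
    {"campaign_id": ["a", "a", "b"], "revenue": [10.0, 20.0, 5.0]}
)
clicks = pd.DataFrame(
    {"campaign_id": ["a", "a", "a", "b"], "clicks": [1, 2, 3, 4]}
)

# Raw merge: every order row for a campaign pairs with every click row for it.
naive = orders.merge(clicks, on="campaign_id")

# Asking pandas to validate the raw merge surfaces the problem as an exception.
raw_merge_error = None
try:
    orders.merge(clicks, on="campaign_id", validate="one_to_one")
except pd.errors.MergeError as exc:
    raw_merge_error = str(exc)

# Aggregate each table to one row per campaign, then merge at that grain.
revenue_by_campaign = orders.groupby("campaign_id", as_index=False).agg(
    revenue=("revenue", "sum")
)
clicks_by_campaign = clicks.groupby("campaign_id", as_index=False).agg(
    clicks=("clicks", "sum")
)
safe = revenue_by_campaign.merge(
    clicks_by_campaign, on="campaign_id", how="outer", validate="one_to_one"
)

print(len(naive), naive["revenue"].sum())   # more rows and more revenue than the source
print(len(safe), safe["revenue"].sum())     # one row per campaign, revenue preserved
print(raw_merge_error)
```

Comparing `safe["revenue"].sum()` with `orders["revenue"].sum()` after the join is a cheap reconciliation check worth keeping in the pipeline.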

Pitfall: Computing CTR as the mean of row-level rates instead of sum(clicks) / sum(impressions) gives the wrong weighted metric.
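
A small illustrative example, with invented numbers, shows how far the two calculations can diverge when row volumes differ. The `rows` table and its column names are assumptions for this sketch.

```python
import pandas as pd

# Hypothetical row-level data: one tiny placement and one large placement.
rows = pd.DataFrame(
    {
        "campaign_id": ["a", "a"],
        "clicks": [1, 50],
        "impressions": [2, 1000],
    }
)

# Wrong for a campaign-level CTR: averages row rates, so the 2-impression row
# counts as much as the 1000-impression row.
row_rate = rows["clicks"] / rows["impressions"]
mean_of_rates = row_rate.groupby(rows["campaign_id"]).mean()

# Weighted metric: sum numerator and denominator at the target grain, then divide.
totals = rows.groupby("campaign_id").agg(
    clicks=("clicks", "sum"), impressions=("impressions", "sum")
)
totals["ctr"] = totals["clicks"] / totals["impressions"]

print(mean_of_rates["a"])        # mean of 0.5 and 0.05
print(totals.loc["a", "ctr"])    # 51 / 1002
```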

Pitfall: Treating every missing date as needing `ffill()` can invent activity; clarify whether absence means zero, unknown, or carry-forward state.
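
One way to make that decision explicit is to reindex onto a full calendar and then fill each column according to its meaning. The sketch below assumes, purely for illustration, that `orders` is a flow metric where an absent day means zero and that `price` is a state that carries forward; both columns and their values are made up.

```python
import pandas as pd

# Hypothetical daily table with no rows for 2024-01-03 and 2024-01-04.
daily = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"]),
        "orders": [3, 1, 2],            # flow: assumed zero on absent days
        "price": [9.99, 10.49, 10.99],  # state: assumed to carry forward
    }
).set_index("date")

# Build the complete calendar so the gaps become explicit rows of NaN.
calendar = pd.date_range(daily.index.min(), daily.index.max(), freq="D")
full = daily.reindex(calendar)

# Record which rows were really observed before filling anything.
full["was_observed"] = full["orders"].notna()

full["orders"] = full["orders"].fillna(0)   # absence means no activity
full["price"] = full["price"].ffill()       # last known state persists

print(full)
```

Keeping the `was_observed` flag lets downstream code tell filled rows from real ones. If absence actually means "unknown", the safer choice is to leave the `NaN` in place rather than fill it.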

Practice these

The practice cards below cover the canonical variants; solve all of them and time yourself.
