Given a pandas DataFrame df with columns: user_id (int), ts (datetime64[ns]), events (list of dicts), attrs (dict). Example rows (conceptual):
user_id=1, ts=2025-08-08 09:12:00, events=[{"type":"click","ts":"2025-08-08T09:12:00"},{"type":"view","ts":"2025-08-08T09:12:05"}], attrs={"version":"1.2.0","flags":{"beta":true,"dark":false}}
user_id=1, ts=2025-08-09 15:00:00, events=[{"type":"purchase","ts":"2025-08-09T15:00:00","amount":20.0}], attrs={"version":"1.2.0","flags":{"beta":true,"dark":false}}
user_id=2, ts=2025-08-04 12:30:00, events=[{"type":"view","ts":"2025-08-04T12:30:00"}], attrs={"version":"1.1.0","flags":{"beta":false,"dark":true}}
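For concreteness, one way to materialize these rows as a DataFrame (a minimal sketch; the literals simply mirror the conceptual rows above, and the task sketches further down assume this df):

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 1, 2],
    "ts": pd.to_datetime(["2025-08-08 09:12:00",
                          "2025-08-09 15:00:00",
                          "2025-08-04 12:30:00"]),
    "events": [
        [{"type": "click", "ts": "2025-08-08T09:12:00"},
         {"type": "view", "ts": "2025-08-08T09:12:05"}],
        [{"type": "purchase", "ts": "2025-08-09T15:00:00", "amount": 20.0}],
        [{"type": "view", "ts": "2025-08-04T12:30:00"}],
    ],
    "attrs": [
        {"version": "1.2.0", "flags": {"beta": True, "dark": False}},
        {"version": "1.2.0", "flags": {"beta": True, "dark": False}},
        {"version": "1.1.0", "flags": {"beta": False, "dark": True}},
    ],
})
```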
Tasks (write Python/pandas):
- Explode events into a long table with one row per event: columns [user_id, event_type, event_ts (datetime), amount (nullable), attrs_version, attrs_beta_flag, attrs_dark_flag]. Use Series.explode and apply/lambda to parse the dicts; no for-loops over DataFrame rows. (See the first sketch after this list.)
- Aggregate to a per-user features table with columns: click_count, view_count, purchase_count, last_event_ts, total_purchase_amount, version_most_recent, beta_flag_most_recent, dark_flag_most_recent. Use groupby with named aggregations; where a user's events carry multiple versions, take the attrs associated with that user's most recent event. (See the second sketch after this list.)
- From the features table, compute a conversion_rate per user segment, where a segment is defined by (version_most_recent, beta_flag_most_recent, dark_flag_most_recent) and conversion_rate = purchasers / users. Return a compact DataFrame with one row per segment and columns [version, beta_flag, dark_flag, users, purchasers, conversion_rate]. (See the third sketch after this list.)
- Implement a robust helper that safely extracts nested keys from attrs using a lambda and dictionary iteration (handle missing keys and None). Explain in comments why your approach avoids SettingWithCopy pitfalls and preserves vectorization. (See the final sketch after this list.)
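One possible sketch for the explode task, assuming the df built above and that every row carries a non-empty list of event dicts (explode would otherwise emit NA rows); the names long and events_long are illustrative, and ignore_index requires pandas >= 1.1:

```python
long = df[["user_id", "attrs", "events"]].explode("events", ignore_index=True)

# Unpack each event dict column-by-column with apply/lambda (no row loops).
long["event_type"] = long["events"].apply(lambda e: e.get("type"))
long["event_ts"] = pd.to_datetime(long["events"].apply(lambda e: e.get("ts")))
long["amount"] = long["events"].apply(lambda e: e.get("amount")).astype("Float64")

# Flatten the per-row attrs dict; .get with a default tolerates missing keys.
long["attrs_version"] = long["attrs"].apply(lambda a: a.get("version"))
long["attrs_beta_flag"] = long["attrs"].apply(lambda a: a.get("flags", {}).get("beta"))
long["attrs_dark_flag"] = long["attrs"].apply(lambda a: a.get("flags", {}).get("dark"))

events_long = long[["user_id", "event_type", "event_ts", "amount",
                    "attrs_version", "attrs_beta_flag", "attrs_dark_flag"]]
```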
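A sketch for the per-user features task, assuming the events_long table from the previous step and pandas >= 1.0 (which allows multiple lambdas in named aggregations). Sorting by event_ts first makes the "last" aggregations pick the attrs of each user's most recent event:

```python
ordered = events_long.sort_values("event_ts")

features = ordered.groupby("user_id").agg(
    click_count=("event_type", lambda s: (s == "click").sum()),
    view_count=("event_type", lambda s: (s == "view").sum()),
    purchase_count=("event_type", lambda s: (s == "purchase").sum()),
    last_event_ts=("event_ts", "max"),
    total_purchase_amount=("amount", "sum"),  # amount is NA for non-purchases
    version_most_recent=("attrs_version", "last"),
    beta_flag_most_recent=("attrs_beta_flag", "last"),
    dark_flag_most_recent=("attrs_dark_flag", "last"),
).reset_index()
```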
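A sketch for the segment conversion task, assuming the features table above; a user counts as a purchaser when purchase_count > 0, and the name seg is illustrative:

```python
seg = (
    features.assign(is_purchaser=features["purchase_count"] > 0)
    .groupby(["version_most_recent", "beta_flag_most_recent",
              "dark_flag_most_recent"], dropna=False)
    .agg(users=("user_id", "nunique"), purchasers=("is_purchaser", "sum"))
    .reset_index()
    .rename(columns={"version_most_recent": "version",
                     "beta_flag_most_recent": "beta_flag",
                     "dark_flag_most_recent": "dark_flag"})
)
seg["conversion_rate"] = seg["purchasers"] / seg["users"]
```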
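Finally, a sketch for the helper task; get_nested is an illustrative name, and the comments address the SettingWithCopy and vectorization points the task asks for:

```python
def get_nested(d, *keys, default=None):
    """Walk a key path through nested dicts, returning `default` whenever a
    level is missing, None, or not a dict."""
    for key in keys:  # dictionary iteration over the key path
        if not isinstance(d, dict) or key not in d:
            return default
        d = d[key]
    return default if d is None else d

# Usage with a lambda. Assigning to df["..."] writes a whole new column onto a
# frame we explicitly own (df itself, or an explicit .copy() of a slice), so
# pandas never has to guess whether the write goes through a view of another
# frame, which is the situation that triggers SettingWithCopyWarning. The
# apply is a single column-wise pass (per-element Python under the hood, since
# the values are dicts), not a manual loop over DataFrame rows, so the rest of
# the pipeline stays vectorized column operations.
df["attrs_beta_flag"] = df["attrs"].apply(lambda a: get_nested(a, "flags", "beta"))
```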