PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches

Quick Overview

This question evaluates a candidate's ability to perform data manipulation in SQL and pandas, covering event deduplication, time-window aggregations, next-day retention calculations, top-N ranking, joins, and data cleaning/mapping for analytics.

  • Medium
  • Pinterest
  • Data Manipulation (SQL/Python)
  • Data Scientist

Write SQL and pandas for shopping events

Company: Pinterest

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Use the schema and sample data below to answer SQL and pandas tasks. Treat 'today' as 2025-09-01. Schema users(user_id INT, country STRING) pins(pin_id INT, creator_user_id INT, is_shopping_enabled BOOLEAN, created_at DATE) events(event_id INT, user_id INT, pin_id INT, event_type STRING, event_ts TIMESTAMP, stay_time_sec INT, feature STRING, product_category_code STRING) product_categories(code STRING, name STRING) Sample tables (small) users user_id | country 1 | US 2 | US 3 | CA pins pin_id | creator_user_id | is_shopping_enabled | created_at 10 | 3 | true | 2025-08-20 11 | 2 | false | 2025-08-22 12 | 1 | true | 2025-08-25 events event_id | user_id | pin_id | event_type | event_ts | stay_time_sec | feature | product_category_code 101 | 1 | 10 | shopping_click | 2025-08-30 10:00:00 | 35 | shopping_module | A 102 | 1 | 10 | shopping_click | 2025-09-01 09:00:00 | 50 | shopping_module | null 103 | 2 | 12 | shopping_click | 2025-08-31 12:00:00 | null | shopping_module | H 104 | 2 | 11 | view | 2025-09-01 08:00:00 | null | feed | null 105 | 3 | 10 | shopping_click | 2025-08-28 07:00:00 | 15 | shopping_module | A 106 | 3 | 10 | shopping_click | 2025-09-01 11:59:00 | -5 | shopping_module | ? product_categories code | name A | Apparel H | Home ? | Unknown Tasks SQL 1) Daily shopping engagement last 7 days: Write a single SQL query that returns, for each date d in [2025-08-26, 2025-09-01], the columns (d, dau_shopping, clicks, avg_stay_time_pos_sec, rolling_7d_uniq_users). Count only events where event_type='shopping_click' and feature='shopping_module'. Deduplicate rapid repeat clicks per (user_id, pin_id) if they occur within 5 minutes (treat as one); implement dedup with window functions. Exclude stay_time_sec <= 0 or NULL from the average, but still count those clicks in 'clicks'. Ensure dates with no activity appear with zeros using a generated dates CTE. 2) Next-day retention: Among users with at least one shopping_click on 2025-08-31, compute the percentage that also have a shopping_click on 2025-09-01. 3) Top pins per user: For the 7 days ending 2025-09-01, return for each user their top 2 pins by number of deduplicated shopping_clicks; break ties by greater total positive stay_time_sec, then by smallest pin_id. Python (pandas) Given a DataFrame events_df with the same columns as events: (a) Map product_category_code using dict = {'A':'Apparel','H':'Home','?':'Unknown'} so missing/unknown codes become 'Unknown'. (b) Replace negative stay_time_sec with NaN; fill remaining NaN stay_time_sec with 0 for aggregation but exclude zeros from averages where appropriate. (c) Sort events_df by ['user_id' asc, 'event_ts' desc, 'stay_time_sec' desc]. (d) Compute, for the last 7 days ending 2025-09-01, each user's top category by total positive stay_time_sec and return a Series user_id -> top_category (tie-breaker: alphabetical).

Quick Answer: This question evaluates a candidate's ability to perform data manipulation in SQL and pandas, covering event deduplication, time-window aggregations, next-day retention calculations, top-N ranking, joins, and data cleaning/mapping for analytics.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Write Queries for Pinterest Engagement Tasks - Pinterest (medium)
  • Write SQL for top categories and highly active users - Pinterest (easy)
  • Compute percent of first-cancelled users who never rebook - Pinterest (medium)
  • Compute percent of first-cancel users who never return - Pinterest (easy)
  • Find top category by video time spent - Pinterest (Medium)