SQL Window Functions And Temporal Joins
Asked of: Data Scientist
Last updated

What's being tested
Ability to transform raw event-level data into user/session-level signals using SQL or pandas: ordering events, deduplicating rows, joining facts across time, and computing metrics. PayPal-style prompts often test whether you can detect fraud, page sequences, overlapping sessions, or contact-sync adoption without procedural row-by-row logic.
Patterns & templates
-
Last/first event per entity —
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_ts DESC, event_id DESC); always add deterministic tie-breakers. -
Adjacent-event reasoning — use
LAG,LEAD, or ordered self-joins to identify sequences like page A → page B → page C. -
Temporal joins — join on entity plus time predicates:
a.user_id = b.user_id AND b.ts BETWEEN a.ts AND a.ts + INTERVAL '24 hours'. -
Interval overlap counting — two sessions overlap when
s1.start_ts < s2.end_ts AND s2.start_ts < s1.end_ts; avoid double-counting self-pairs. -
Dedup before aggregation — apply
ROW_NUMBERorCOUNT(DISTINCT ...)before user-level metrics; raw event tables often contain retries or repeated actions. -
Conditional aggregation — use
SUM(CASE WHEN condition THEN 1 ELSE 0 END)andAVG(CASE WHEN flag THEN 1.0 ELSE 0 END)for rates. -
Python equivalent —
sort_values,groupby,shift,merge,rolling, and boolean masks mirror SQL windows and temporal filters.
Common pitfalls
Pitfall: Filtering on window-function aliases in the same
SELECT; wrap the window calculation in a CTE or subquery.
Pitfall: Treating timestamps as unordered strings or ignoring timezone consistency when comparing login, transaction, and session events.
Pitfall: Using
INNER JOINwhen the metric denominator requires all users; default toLEFT JOINfor adoption, sync, or exposure-rate calculations.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Featured in interview prep guides
Practice questions
- Write SQL for top drivers and cancellation ratesPayPal · Data Scientist · Onsite · easy
- Compute top orders and cancellation ratePayPal · Data Scientist · Onsite · easy
- Write SQL to flag Venmo ATOPayPal · Data Scientist · Technical Screen · Medium
- Calculate and Find Average Contacts and Sync PercentagePayPal · Data Scientist · Technical Screen · Medium
- Identify Users with Specific Page Navigation PatternsPayPal · Data Scientist · Onsite · Medium
- Explain Window Functions and Joins in SQL and PythonPayPal · Data Scientist · Onsite · Medium
- Identify Session with Maximum Overlapping Sessions CountPayPal · Data Scientist · Onsite · Medium
- Identify Users with Specific Page Visit SequencePayPal · Data Scientist · Onsite · Medium
Related concepts
- SQL Window Functions And AnalyticsData Manipulation (SQL/Python)
- SQL Analytics Joins, Aggregations, And Windows
- SQL Analytical QueryingData Manipulation (SQL/Python)
- SQL Window Functions And Analytical QueryingData Manipulation (SQL/Python)
- SQL AnalyticsData Manipulation (SQL/Python)
- SQL Analytical Querying And Data ModelingData Manipulation (SQL/Python)