PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches
|Home/Data Manipulation (SQL/Python)/Upstart

Write monthly touches and last-touch SQL

Last updated: Mar 29, 2026

Quick Overview

This question evaluates proficiency in SQL-based data manipulation and time-based attribution, covering aggregations, joins, window functions, date truncation to calendar months, tie-breaking logic, and handling edge cases like post-conversion touches and duplicate timestamps.

  • Medium
  • Upstart
  • Data Manipulation (SQL/Python)
  • Data Scientist

Write monthly touches and last-touch SQL

Company: Upstart

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

You have two tables tracking marketing touches and downstream conversions. Write SQL to answer the three prompts below. Assume a warehouse like Postgres/BigQuery/Snowflake; months are calendar months in UTC; when multiple touches tie on timestamp, break ties by the highest touch_id; ignore touches strictly after a company's conversion. Schema: - marketing_touch( touch_id BIGINT PRIMARY KEY, company_id INT NOT NULL, touch_timestamp TIMESTAMP NOT NULL, channel VARCHAR, campaign VARCHAR ) - conversion( company_id INT NOT NULL, conversion_timestamp TIMESTAMP NOT NULL ) Sample data (minimal, for reasoning/testing): marketing_touch | touch_id | company_id | touch_timestamp | channel | campaign | | 1 | 100 | 2025-01-05 10:00:00 | Email | E1 | | 2 | 100 | 2025-01-20 12:00:00 | Paid | P1 | | 3 | 100 | 2025-02-01 09:00:00 | Direct | D1 | | 4 | 101 | 2025-02-10 08:00:00 | Paid | P2 | | 5 | 101 | 2025-02-10 08:00:00 | Email | E2 | | 6 | 102 | 2025-02-28 23:59:59 | Referral | R1 | conversion | company_id | conversion_timestamp | | 100 | 2025-02-10 00:00:00 | | 101 | 2025-02-10 08:00:00 | | 103 | 2025-03-01 12:00:00 | Prompts: 1) For each month (YYYY-MM), output month and avg_touches_per_company = total touches in that month divided by the number of distinct companies that had at least one touch in that same month. Include months present in data only. Be explicit about handling companies with zero touches in a month (exclude them from the denominator). 2) For each company that converted (exists in conversion), return the last touch at or before its conversion_timestamp: company_id, conversion_timestamp, last_touch_timestamp, channel, campaign, and whether the last touch occurred in the same calendar month as the conversion. If a company has no touch at or before conversion, exclude it. 3) Count distinct companies where the last touch (as defined in #2) and the conversion occur in the same calendar month. Bonus: add a second output where you also require DATEDIFF in days between conversion_timestamp and last_touch_timestamp <= 45. Edge cases to handle in your SQL: (a) multiple touches at the exact same timestamp for a company (pick the one with the highest touch_id); (b) touches after conversion (ignore for #2/#3); (c) companies present in conversion with no prior touches (exclude in #2/#3). Provide performant SQL (CTEs are fine) and briefly explain your tie-break and month-extraction logic.

Quick Answer: This question evaluates proficiency in SQL-based data manipulation and time-based attribution, covering aggregations, joins, window functions, date truncation to calendar months, tie-breaking logic, and handling edge cases like post-conversion touches and duplicate timestamps.

Related Interview Questions

  • Manipulate data in R with dplyr joins and windows - Upstart (Medium)
  • Solve SQL CTR and Python analytics tasks - Upstart (Medium)
  • Calculate Average Event Value by User ID - Upstart (Medium)
  • Calculate User Revenue and Session Duration in Python - Upstart (Medium)
Upstart logo
Upstart
Oct 13, 2025, 9:49 PM
Data Scientist
Technical Screen
Data Manipulation (SQL/Python)
5
0

You have two tables tracking marketing touches and downstream conversions. Write SQL to answer the three prompts below. Assume a warehouse like Postgres/BigQuery/Snowflake; months are calendar months in UTC; when multiple touches tie on timestamp, break ties by the highest touch_id; ignore touches strictly after a company's conversion.

Schema:

  • marketing_touch( touch_id BIGINT PRIMARY KEY, company_id INT NOT NULL, touch_timestamp TIMESTAMP NOT NULL, channel VARCHAR, campaign VARCHAR )
  • conversion( company_id INT NOT NULL, conversion_timestamp TIMESTAMP NOT NULL )

Sample data (minimal, for reasoning/testing): marketing_touch | touch_id | company_id | touch_timestamp | channel | campaign | | 1 | 100 | 2025-01-05 10:00:00 | Email | E1 | | 2 | 100 | 2025-01-20 12:00:00 | Paid | P1 | | 3 | 100 | 2025-02-01 09:00:00 | Direct | D1 | | 4 | 101 | 2025-02-10 08:00:00 | Paid | P2 | | 5 | 101 | 2025-02-10 08:00:00 | Email | E2 | | 6 | 102 | 2025-02-28 23:59:59 | Referral | R1 | conversion | company_id | conversion_timestamp | | 100 | 2025-02-10 00:00:00 | | 101 | 2025-02-10 08:00:00 | | 103 | 2025-03-01 12:00:00 |

Prompts:

  1. For each month (YYYY-MM), output month and avg_touches_per_company = total touches in that month divided by the number of distinct companies that had at least one touch in that same month. Include months present in data only. Be explicit about handling companies with zero touches in a month (exclude them from the denominator).
  2. For each company that converted (exists in conversion), return the last touch at or before its conversion_timestamp: company_id, conversion_timestamp, last_touch_timestamp, channel, campaign, and whether the last touch occurred in the same calendar month as the conversion. If a company has no touch at or before conversion, exclude it.
  3. Count distinct companies where the last touch (as defined in #2) and the conversion occur in the same calendar month. Bonus: add a second output where you also require DATEDIFF in days between conversion_timestamp and last_touch_timestamp <= 45.

Edge cases to handle in your SQL: (a) multiple touches at the exact same timestamp for a company (pick the one with the highest touch_id); (b) touches after conversion (ignore for #2/#3); (c) companies present in conversion with no prior touches (exclude in #2/#3). Provide performant SQL (CTEs are fine) and briefly explain your tie-break and month-extraction logic.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More Upstart•More Data Scientist•Upstart Data Scientist•Upstart Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.