Write complex SQL for cohorts and retention
Company: TikTok
Role: Software Engineer
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Given tables users(uid, created_at, country), orders(order_id, uid, amount, created_at, status), and events(uid, ts, event_type, campaign), write one SQL query that outputs, for each month and country:
1) new users,
2) conversion rate within 7 days of signup,
3) 7-day rolling retention (active on day D and again on D+7),
4) GMV excluding refunded/canceled orders, and
5) top campaign by last-touch attribution from events.
Handle late-arriving events, deduplicate by (uid, ts, event_type), ensure timezone consistency, and document window-function choices.
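As one possible starting point for the deduplication and last-touch requirements, the sketch below runs the SQL against an in-memory SQLite database from Python. The sample rows, the CTE names (dedup, clean, touches), and the tie-break columns are illustrative assumptions, not part of the question. It also assumes all timestamps are already stored as UTC text in 'YYYY-MM-DD HH:MM:SS' form, so string comparison matches chronological order. ROW_NUMBER is used rather than RANK in both places because exactly one row per partition must survive, even when rows tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders(order_id INTEGER, uid INTEGER, amount REAL,
                    created_at TEXT, status TEXT);
CREATE TABLE events(uid INTEGER, ts TEXT, event_type TEXT, campaign TEXT);

-- Hypothetical sample data; timestamps assumed to be UTC.
INSERT INTO orders VALUES
  (10, 1, 50.0, '2024-01-07 12:00:00', 'completed'),
  (11, 2, 30.0, '2024-02-15 12:00:00', 'completed'),
  (12, 1, 20.0, '2024-01-08 12:00:00', 'refunded');
INSERT INTO events VALUES
  (1, '2024-01-06 08:00:00', 'click', 'spring'),
  (1, '2024-01-06 08:00:00', 'click', 'spring'),  -- exact duplicate
  (1, '2024-01-07 11:00:00', 'click', 'brand'),
  (2, '2024-02-14 08:00:00', 'click', 'spring');
""")

LAST_TOUCH_SQL = """
WITH dedup AS (
  -- Keep one row per (uid, ts, event_type); campaign is an arbitrary
  -- but deterministic tie-break.
  SELECT uid, ts, event_type, campaign,
         ROW_NUMBER() OVER (
           PARTITION BY uid, ts, event_type ORDER BY campaign
         ) AS rn
  FROM events
),
clean AS (
  SELECT uid, ts, event_type, campaign FROM dedup WHERE rn = 1
),
touches AS (
  -- Rank each order's preceding events, most recent first.
  SELECT o.order_id, c.campaign,
         ROW_NUMBER() OVER (
           PARTITION BY o.order_id ORDER BY c.ts DESC, c.campaign
         ) AS rn
  FROM orders o
  JOIN clean c ON c.uid = o.uid AND c.ts <= o.created_at
  WHERE o.status NOT IN ('refunded', 'canceled')
)
SELECT order_id, campaign FROM touches WHERE rn = 1 ORDER BY order_id
"""

last_touch = conn.execute(LAST_TOUCH_SQL).fetchall()
print(last_touch)  # [(10, 'brand'), (11, 'spring')]
```

Because the attribution join is evaluated at query time against the events table, a late-arriving event is picked up on the next run as long as its ts precedes the order; a production pipeline would likely also need a reprocessing window, which this sketch does not model.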
Quick Answer: This question evaluates advanced SQL and data-engineering competencies such as cohort and retention analysis, deduplication, timezone normalization, last-touch attribution, aggregate revenue calculations, and effective use of window functions in the Data Manipulation (SQL/Python) domain.
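A minimal sketch of the cohort side (new users and 7-day conversion per signup month and country) might look like the following, again using SQLite from Python. The sample rows and the CTE name converted are hypothetical. The sketch assumes UTC timestamps stored as 'YYYY-MM-DD HH:MM:SS' text, and it assumes that refunded or canceled orders do not count as conversions, which the question leaves open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users(uid INTEGER, created_at TEXT, country TEXT);
CREATE TABLE orders(order_id INTEGER, uid INTEGER, amount REAL,
                    created_at TEXT, status TEXT);

-- Hypothetical sample data; timestamps assumed to be UTC.
INSERT INTO users VALUES
  (1, '2024-01-05 10:00:00', 'US'),
  (2, '2024-01-20 09:00:00', 'US'),
  (3, '2024-01-10 00:00:00', 'DE');
INSERT INTO orders VALUES
  (10, 1, 50.0, '2024-01-07 12:00:00', 'completed'),  -- within 7 days
  (11, 2, 30.0, '2024-02-15 12:00:00', 'completed'),  -- too late
  (12, 3, 20.0, '2024-01-11 12:00:00', 'refunded');   -- excluded
""")

COHORT_SQL = """
WITH converted AS (
  -- Users with at least one valid order in [signup, signup + 7 days).
  SELECT DISTINCT u.uid
  FROM users u
  JOIN orders o
    ON o.uid = u.uid
   AND o.status NOT IN ('refunded', 'canceled')
   AND o.created_at >= u.created_at
   AND o.created_at < datetime(u.created_at, '+7 days')
)
SELECT strftime('%Y-%m', u.created_at) AS cohort_month,
       u.country,
       COUNT(*) AS new_users,
       -- COUNT(c.uid) skips NULLs, so it counts only converted users.
       ROUND(1.0 * COUNT(c.uid) / COUNT(*), 4) AS conv_rate_7d
FROM users u
LEFT JOIN converted c ON c.uid = u.uid
GROUP BY cohort_month, u.country
ORDER BY cohort_month, u.country
"""

cohorts = conn.execute(COHORT_SQL).fetchall()
print(cohorts)  # [('2024-01', 'DE', 1, 0.0), ('2024-01', 'US', 2, 0.5)]
```

Pre-aggregating converted users to one row per uid before the LEFT JOIN keeps the new-user count from being inflated by users with several orders; the same pattern would apply when retention and GMV are joined in as further CTEs.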