SQL Product Analytics
Asked of: Data Scientist
Last updated

What's being tested
These problems test SQL product analytics: turning raw event tables into trustworthy metrics such as conversion, retention, CTR, frequency, revenue, and cohort activity. Interviewers are probing whether you can reason about event attribution, deduplication, time windows, and safe aggregation without double-counting users, impressions, clicks, or sessions.
Patterns & templates
-
Time-based attribution joins — join events where
conversion_ts >= assignment_tsand within a defined window; prefer earliest valid assignment usingROW_NUMBER. -
Deduplication before aggregation — use
ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY event_ts)orCOUNT(DISTINCT ...)when raw logs contain repeats. -
Cohort retention template — define cohort date with
MIN(activity_date), join future activity, then computeD1,D7, orD30retention by cohort. -
Session-level aggregation — group impressions, ads, clicks, or calls by
user_id, session_id; compute per-session metrics before rolling up to users. -
Null-safe metric formulas — use
COALESCE,NULLIF, andCASE WHEN;CTR = clicks / NULLIF(impressions, 0)avoids divide-by-zero failures. -
Window functions for ranking and percentiles — use
ROW_NUMBER,LAG,LEAD,PERCENTILE_CONT, orNTILEfor recency, sequencing, and distribution questions. -
Geo/date dimensional joins — join user or event dimensions carefully; decide whether geography and date are attributed at event time, user profile time, or UTC date.
Common pitfalls
Pitfall: Joining impressions to clicks only on
user_idcreates many-to-many inflation; includeimpression_id,ad_id, or a strict time window.
Pitfall: Computing retention from all active users instead of the original cohort changes the denominator and overstates product health.
Pitfall: Filtering soft-deleted or invalid records after aggregation can leave deleted events inside counts; filter eligible rows in the base CTE.
Practice these
The practice cards below cover the canonical variants — solve all of them and time yourself.
Practice questions
- Compute ad revenue metrics by geography in SQLMeta · Data Scientist · Technical Screen · medium
- Compute cohort GMV and payer rate with edge casesMeta · Data Scientist · Technical Screen · Medium
- Build DiD dataset with SQLMeta · Data Scientist · Technical Screen · Medium
- Write SQL for visibility, calls, and cohort activityMeta · Data Scientist · HR Screen · Medium
- Write SQL for retention, conversion, and churnMeta · Data Scientist · Onsite · Medium
- Produce dating profile funnel report by cohortMeta · Data Scientist · Onsite · Medium
- Write SQL to analyze Group Calls adoptionMeta · Data Scientist · Technical Screen · Medium
- Write SQL for comment analyticsMeta · Data Scientist · Onsite · Medium
- Compute feed ad frequency and retention in SQLMeta · Data Scientist · Onsite · Medium
- Join datasets and compute conversion by assignmentMeta · Data Scientist · Technical Screen · Medium
Related concepts
- SQL Analytical QueryingData Manipulation (SQL/Python)
- SQL AnalyticsData Manipulation (SQL/Python)
- SQL Analytics Joins, Aggregations, And Windows
- SQL Window Functions And Analytical QueryingData Manipulation (SQL/Python)
- SQL And Python Data ManipulationData Manipulation (SQL/Python)
- SQL Analytics And Event Data ManipulationData Manipulation (SQL/Python)