Exclude free subscribers via anti-join
Company: Intuit
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: Medium
Interview Round: Technical Screen
Extend your previous monthly new-subscriber + YoY query to exclude companies that ever received a free subscription, using an anti-join against Free_Subs. Ensure excluded companies are removed from all months and all derived YoY metrics. Your solution should be resilient to duplicates in Free_Subs and should not accidentally re-include rows due to joins. Also, briefly explain in comments the trade-offs between using NOT EXISTS versus LEFT JOIN ... IS NULL for this anti-join under typical OLAP columnar engines.
Additional table and sample data:
Free_Subs
+------------+
| company_id |
+------------+
| 2 |
| 5 |
+------------+
Requirements:
- Do not change the output column names/types from the first question.
- The anti-join must occur before aggregation so excluded IDs do not affect zero-filled months or YoY baselines.
- Guard against duplicate company_id values in Free_Subs (e.g., use DISTINCT or NOT EXISTS).
Quick Answer: This question evaluates competency with SQL anti-joins, join semantics, deduplication, and correct pre-aggregation filtering to prevent excluded entities from affecting aggregated monthly and year-over-year metrics; it is categorized as Data Manipulation (SQL/Python) for a Data Scientist role.