Write conditional aggregates with CASE WHEN
Write a query that produces conditional aggregates using CASE WHEN (e.g., counts of approved vs declined transactions per merchant and the sum of amounts flagged for review). Explain why CASE WHEN is the portable approach across SQL dialects compared with dialect-specific boolean-to-integer coercion (e.g., SUM(column = 'x')). Discuss readability and maintainability trade-offs.
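One possible shape for such a query is sketched below, run through Python's built-in sqlite3 module so it can be executed as-is. The table name `transactions` and the columns `merchant_id`, `status`, `amount`, and `flagged_for_review` are assumptions made for illustration; the prompt does not specify a schema.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions, not from the prompt.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    " merchant_id TEXT, status TEXT, amount REAL, flagged_for_review INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        ("m1", "approved", 10.0, 0),
        ("m1", "approved", 25.0, 1),
        ("m1", "declined", 40.0, 1),
        ("m2", "declined", 5.0, 0),
        ("m2", "approved", 7.5, 0),
    ],
)

# Each CASE WHEN maps a row to 1/0 (for counts) or amount/0 (for sums),
# so one pass over the table yields every conditional aggregate.
QUERY = """
SELECT
    merchant_id,
    SUM(CASE WHEN status = 'approved' THEN 1 ELSE 0 END) AS approved_count,
    SUM(CASE WHEN status = 'declined' THEN 1 ELSE 0 END) AS declined_count,
    SUM(CASE WHEN flagged_for_review = 1 THEN amount ELSE 0 END) AS flagged_amount
FROM transactions
GROUP BY merchant_id
ORDER BY merchant_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The shorthand `SUM(status = 'approved')` depends on the engine treating a comparison as the integer 0 or 1, which MySQL and SQLite do; PostgreSQL rejects it without an explicit cast and SQL Server does not accept a comparison as a value in the select list. The CASE WHEN form is longer, but it states the ELSE branch explicitly and reads the same in every dialect.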
Constraints & Assumptions
- Preserve the scope, facts, inputs, and requested outputs from the prompt above.
- If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
- Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.
Clarifying Questions to Ask
- Clarify SQL dialect or Python library versions, date/time semantics, duplicate handling, and null handling.
- Define the grain of each intermediate result before aggregating.
- State expected output columns and ordering explicitly.
What a Strong Answer Covers
- A query or pandas plan that matches the requested output grain.
- Correct joins, filters, grouping, window functions, and treatment of NULLs or duplicates.
- A brief explanation of why the result is correct and how it handles edge cases.
- Performance notes, indexes/partitioning, and validation queries when relevant.
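A validation query for the NULL edge case might look like the sketch below: it reconciles the conditional counts against `COUNT(*)` per merchant, with an explicit bucket for rows that match neither status. The schema is again a hypothetical one chosen for illustration.

```python
import sqlite3

# Hypothetical schema and data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (merchant_id TEXT, status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("m1", "approved", 10.0),
        ("m1", "declined", 40.0),
        ("m1", None, 15.0),  # NULL status: matches neither WHEN branch
        ("m2", "approved", 7.5),
    ],
)

# A NULL status makes every comparison evaluate to NULL, so the row falls
# through to ELSE. The other_count column makes those rows visible.
VALIDATION = """
SELECT
    merchant_id,
    COUNT(*) AS total_rows,
    SUM(CASE WHEN status = 'approved' THEN 1 ELSE 0 END) AS approved_count,
    SUM(CASE WHEN status = 'declined' THEN 1 ELSE 0 END) AS declined_count,
    SUM(CASE WHEN status IN ('approved', 'declined') THEN 0 ELSE 1 END) AS other_count
FROM transactions
GROUP BY merchant_id
ORDER BY merchant_id
"""

checks = conn.execute(VALIDATION).fetchall()
for merchant_id, total, approved, declined, other in checks:
    # The three buckets should account for every row at the merchant grain.
    assert total == approved + declined + other, merchant_id
    print(merchant_id, total, approved, declined, other)
```

If `other_count` is unexpectedly non-zero, that is a prompt to ask how NULL or unknown statuses should be reported rather than silently dropping them from both counts.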
Follow-up Questions
- How would you test the query on a tiny hand-built dataset?
- What changes if duplicate events or late-arriving data are present?
- Which indexes, clustering, or partitions would help at production scale?
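For the first two follow-ups, a minimal sketch is a hand-built table small enough to verify by eye, containing one deliberately duplicated event. It assumes a hypothetical `transaction_id` column and that duplicates are exact row copies, so `SELECT DISTINCT` is enough to remove them; other duplicate patterns would need a different rule.

```python
import sqlite3

# Hypothetical schema: transaction_id and the other names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    " transaction_id TEXT, merchant_id TEXT, status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        ("t1", "m1", "approved", 10.0),
        ("t2", "m1", "declined", 40.0),
        ("t2", "m1", "declined", 40.0),  # exact duplicate of the row above
        ("t3", "m2", "approved", 7.5),
    ],
)

# Same conditional aggregate, parameterised only by its row source.
AGG = """
SELECT
    merchant_id,
    SUM(CASE WHEN status = 'approved' THEN 1 ELSE 0 END) AS approved_count,
    SUM(CASE WHEN status = 'declined' THEN 1 ELSE 0 END) AS declined_count
FROM {source}
GROUP BY merchant_id
ORDER BY merchant_id
"""

# Subquery that restores one row per transaction before aggregating.
DEDUPED_SOURCE = (
    "(SELECT DISTINCT transaction_id, merchant_id, status, amount"
    " FROM transactions) AS t"
)

naive = conn.execute(AGG.format(source="transactions")).fetchall()
deduped = conn.execute(AGG.format(source=DEDUPED_SOURCE)).fetchall()

print("naive:  ", naive)
print("deduped:", deduped)
```

Comparing the two results on a dataset this small shows directly that the duplicate inflates the declined count for one merchant, which is the kind of evidence an interviewer usually wants before discussing production-scale fixes.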