PracHub
QuestionsCoachesLearningGuidesInterview Prep
|Home/Data Manipulation (SQL/Python)/PayPal

Compare WHERE vs HAVING with aggregates

Last updated: Mar 29, 2026

Quick Overview

Compare WHERE vs HAVING with aggregates evaluates SQL or pandas logic, joins, grouping, window functions, null handling, edge cases, and validation in a realistic interview setting. A strong answer states assumptions, handles edge cases, explains trade-offs, and shows how to validate the result clearly.

  • Medium
  • PayPal
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compare WHERE vs HAVING with aggregates

Company: PayPal

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Onsite

Filter groups based on an aggregate and explain WHERE vs HAVING. Provide a query that returns merchants with chargeback_rate > 0.5% in the last 30 days using HAVING. Clarify why WHERE cannot reference aggregates, how HAVING works after GROUP BY, dialect differences (e.g., MySQL allowing expressions in HAVING without GROUP BY), and performance considerations.

Quick Answer: Compare WHERE vs HAVING with aggregates evaluates SQL or pandas logic, joins, grouping, window functions, null handling, edge cases, and validation in a realistic interview setting. A strong answer states assumptions, handles edge cases, explains trade-offs, and shows how to validate the result clearly.

Solution

# Solution Alignment `WHERE` filters rows before grouping; `HAVING` filters groups after aggregate values are computed. ```sql SELECT merchant_id, COUNT(*) AS total_transactions, SUM(CASE WHEN is_chargeback THEN 1 ELSE 0 END) AS chargebacks, SUM(CASE WHEN is_chargeback THEN 1 ELSE 0 END)::numeric / NULLIF(COUNT(*), 0) AS chargeback_rate FROM transactions WHERE transaction_time >= CURRENT_DATE - INTERVAL '30 day' GROUP BY merchant_id HAVING SUM(CASE WHEN is_chargeback THEN 1 ELSE 0 END)::numeric / NULLIF(COUNT(*), 0) > 0.005 ORDER BY chargeback_rate DESC; ``` The time filter belongs in `WHERE` because it applies to raw rows. The chargeback-rate filter belongs in `HAVING` because it depends on grouped aggregates. Some dialects allow aliases in `HAVING`; repeating the expression is more portable.

Related Interview Questions

  • Write SQL using HAVING and window functions - PayPal (easy)
  • Write SQL for top drivers and cancellation rates - PayPal (easy)
  • Write SQL to flag Venmo ATO - PayPal (Medium)
  • Write conditional aggregation SQL queries - PayPal (Medium)
  • Calculate and Find Average Contacts and Sync Percentage - PayPal (Medium)
|Home/Data Manipulation (SQL/Python)/PayPal

Compare WHERE vs HAVING with aggregates

PayPal logo
PayPal
Jul 31, 2025, 12:00 AM
MediumData ScientistOnsiteData Manipulation (SQL/Python)
2
0

Compare WHERE vs HAVING with aggregates

Filter groups based on an aggregate and explain WHERE vs HAVING. Provide a query that returns merchants with chargeback_rate > 0.5% in the last 30 days using HAVING. Clarify why WHERE cannot reference aggregates, how HAVING works after GROUP BY, dialect differences (e.g., MySQL allowing expressions in HAVING without GROUP BY), and performance considerations.

Constraints & Assumptions

  • Preserve the scope, facts, inputs, and requested outputs from the prompt above.
  • If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
  • Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.

Clarifying Questions to Ask

  • Clarify SQL dialect or Python library versions, date/time semantics, duplicate handling, and null handling.
  • Define the grain of each intermediate result before aggregating.
  • State expected output columns and ordering explicitly.

What a Strong Answer Covers

  • A query or pandas plan that matches the requested output grain.
  • Correct joins, filters, grouping, window functions, and treatment of NULLs or duplicates.
  • A brief explanation of why the result is correct and how it handles edge cases.
  • Performance notes, indexes/partitioning, and validation queries when relevant.

Follow-up Questions

  • How would you test the query on a tiny hand-built dataset?
  • What changes if duplicate events or late-arriving data are present?
  • Which indexes, clustering, or partitions would help at production scale?
Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More PayPal•More Data Scientist•PayPal Data Scientist•PayPal Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)

Write your answer

Your first approved answer each day earns 20 XP.

Sign in to write your answer.
PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • AI Coding Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.