Differentiate pandas objects and SQL filters

Q: Differentiate pandas objects and SQL filters

This question evaluates understanding of pandas data structures (Series vs DataFrame), SQL filtering and aggregation semantics (WHERE vs HAVING), and the ability to construct SQL queries to identify duplicate records.

Q: How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions require understanding of core concepts and practice. PracHub provides solutions with explanations to help you master data manipulation (sql/python) interviews.

Question

Python (pandas)

What is the difference between a pandas Series and a pandas DataFrame ?
- Discuss structure (1D vs 2D), indexing, column labels, and common use cases.

SQL

What is the difference between WHERE and HAVING ?
- When do you use each in queries that include GROUP BY and aggregates?
Write SQL to find duplicates.

Table

Assume a table of users:

users (
- user_id BIGINT PRIMARY KEY,
- email VARCHAR,
- created_at TIMESTAMP )

Tasks

A. Return emails that appear more than once, with their duplicate count.

Output columns: email , dup_count

B. Return the full rows for users whose email is duplicated.

Output columns: user_id , email , created_at
If there are multiple users with the same email, return all of them.

Assumptions

Treat NULL emails as non-duplicates unless specified otherwise.
SQL dialect can be ANSI SQL (window functions allowed).

Differentiate pandas objects and SQL filters

Quick Overview

Python (pandas)

SQL

Table

Tasks

Assumptions

Comments (0)