Explain Pandas and SQL Basics

Q: Explain Pandas and SQL Basics

This question tests pandas and SQL fundamentals: the distinction between a pandas Series and a DataFrame, the semantics of SQL WHERE versus HAVING, and practical duplicate detection in transactional data. It targets data-manipulation competency for Data Engineer roles.

Q: How do I approach Data Manipulation (SQL/Python) interview questions?

Data Manipulation (SQL/Python) questions reward a firm grasp of core concepts and regular practice. PracHub provides solutions with explanations to help you prepare for Data Manipulation (SQL/Python) interviews.

Question

You are interviewing for a Data Engineer internship. Answer the following short data-manipulation questions:

In pandas, what is the difference between a Series and a DataFrame? Compare their dimensionality, indexing behavior, and common use cases.
In SQL, what is the difference between WHERE and HAVING? Explain when each filter is applied and whether aggregate expressions can be used.
Consider a table transactions_raw with the following schema:
- ingest_id BIGINT
- account_id BIGINT
- transaction_id BIGINT
- amount DECIMAL(12,2)
- transaction_ts TIMESTAMP
Assume transaction_ts is stored in UTC, and there are no foreign-key relationships relevant to this task. Define a duplicate as multiple rows with the same (account_id, transaction_id, amount, transaction_ts). Write a SQL query that returns all duplicate groups with the output columns: account_id, transaction_id, amount, transaction_ts, duplicate_count.
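
One possible approach to the duplicate-detection task is sketched below. It is not an official solution. It groups by the four columns that define a duplicate and keeps only groups with more than one row. The group filter has to go in HAVING because COUNT(*) is an aggregate, and WHERE is evaluated before grouping. The sketch runs the query against an in-memory SQLite database through Python's standard sqlite3 module so it can be tried without a database server. The sample rows, the helper name find_duplicate_groups, and the choice to store amount and transaction_ts as TEXT in SQLite are all assumptions made for illustration. The query itself uses only standard GROUP BY and HAVING, so it should carry over to other engines with little or no change.

```python
import sqlite3

# Hypothetical sample rows, invented for illustration only:
# (ingest_id, account_id, transaction_id, amount, transaction_ts)
ROWS = [
    (1, 100, 9001, "25.00", "2024-01-01 10:00:00"),
    (2, 100, 9001, "25.00", "2024-01-01 10:00:00"),  # duplicate of ingest_id 1
    (3, 100, 9002, "40.00", "2024-01-01 11:00:00"),
    (4, 200, 9003, "15.50", "2024-01-02 09:30:00"),
    (5, 200, 9003, "15.50", "2024-01-02 09:30:00"),  # duplicate of ingest_id 4
    (6, 200, 9003, "15.50", "2024-01-02 09:30:00"),  # duplicate of ingest_id 4
    (7, 200, 9003, "15.75", "2024-01-02 09:30:00"),  # amount differs, not a duplicate
]

# Group by the columns that define a duplicate, then keep only groups
# containing more than one row. ingest_id is deliberately excluded from
# the grouping key because it is not part of the duplicate definition.
DUPLICATES_SQL = """
SELECT account_id,
       transaction_id,
       amount,
       transaction_ts,
       COUNT(*) AS duplicate_count
FROM transactions_raw
GROUP BY account_id, transaction_id, amount, transaction_ts
HAVING COUNT(*) > 1
ORDER BY account_id, transaction_id
"""


def find_duplicate_groups(rows):
    """Load rows into an in-memory table and return the duplicate groups."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumption: amount and transaction_ts are stored as TEXT here so
        # SQLite compares them exactly as written. A production table would
        # use DECIMAL(12,2) and TIMESTAMP as in the stated schema.
        conn.execute(
            """
            CREATE TABLE transactions_raw (
                ingest_id      INTEGER,
                account_id     INTEGER,
                transaction_id INTEGER,
                amount         TEXT,
                transaction_ts TEXT
            )
            """
        )
        conn.executemany(
            "INSERT INTO transactions_raw VALUES (?, ?, ?, ?, ?)", rows
        )
        return conn.execute(DUPLICATES_SQL).fetchall()
    finally:
        conn.close()


duplicate_groups = find_duplicate_groups(ROWS)
for group in duplicate_groups:
    print(group)
```

With the sample data above, two groups are returned: one with a duplicate_count of 2 and one with a duplicate_count of 3. The ORDER BY is not required by the task and is included only to make the output deterministic.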
