Software Engineer Data Manipulation (SQL/Python) Interview Questions
Implement and debug event filtering in Python
You are given a list of event dictionaries with keys: id (str), type (str), ts (int, seconds since epoch), payload (dict). Implement filter_events(eve...
Find returning users from access logs
Given a large user access log, parse it and identify which user_ids are returning customers—i.e., they have at least one visit on two or more distinct...
Print the K-th non-empty line
Given a large UTF-8 text file, write a program that prints the K-th non-empty line. Do not load the whole file into memory. Specify how you handle fil...
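One possible approach to the prompt above, offered as a minimal sketch rather than a reference answer: iterate over the file lazily so that only one line is held in memory at a time. The function name kth_non_empty_line is hypothetical, and the sketch assumes that K is 1-based and that a whitespace-only line counts as empty. It does not cover the requirements that are cut off in the teaser.

```python
import io
from typing import Optional, TextIO


def kth_non_empty_line(stream: TextIO, k: int) -> Optional[str]:
    """Return the k-th (1-based) non-empty line of a text stream, or None.

    Assumption: a line containing only whitespace is treated as empty.
    The stream is consumed lazily, one line at a time.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    seen = 0
    for line in stream:
        if line.strip():  # skip empty and whitespace-only lines
            seen += 1
            if seen == k:
                return line.rstrip("\r\n")  # drop only the line terminator
    return None  # the stream has fewer than k non-empty lines


if __name__ == "__main__":
    # In-memory demo; for a real file, open(path, encoding="utf-8") could be
    # passed instead, since file objects are also iterated line by line.
    sample = io.StringIO("alpha\n\n   \nbeta\ngamma\n")
    print(kth_non_empty_line(sample, 2))
```

Returning None when the file runs out of lines is one design choice; raising an exception or printing nothing are equally defensible and worth stating aloud in an interview.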
Set up a Python interview environment
You can use AI coding tools. Prepare a clean laptop for a Python-based onsite and explain your steps: (1) Install pyenv and set up a project-specific...
Implement Spring MVC to find top-enrolled course
Implement a Spring MVC service that returns the course with the highest number of enrolled students from a relational database pre-populated by provid...
Evaluate SQL expressions for zero
Which of the following SQL expressions evaluate to 0? Evaluate each independently, justify the result, and note any dialect-specific behavior (e.g., M...
Compute unique visitors per department from clicks
Given tables Products(product_id, department, category, subcategory) where department > category > subcategory form a hierarchy, and ClickLog(user_id,...
Implement filters and cursor pagination
Design and implement a transaction query module over a dataset or database where each transaction has startDate, endDate, userId, and amount. Requirem...
Pivot transactions by date without date libs
Given a stream of transaction rows (shopper_id, date_str, amount) where date_str is ISO format 'YYYY-MM-DD', produce a pivoted report for a specified ...
Generate user notifications from schedules
Given a schedule template format and user profile data (timezone, locale, delivery preferences), implement a program that generates a user notificatio...
Compute most popular location with weights
You are given a dataset of voting records for concert locations. Each record includes voter_id, location_text, and an optional numeric weight (default...
Calculate cost from orders with SQL
You have two tables: orders(order_id INT, user_id INT, order_date DATE, quantity INT, unit_price DECIMAL, coupon_code VARCHAR) and coupons(code VARCHA...
Use Excel formulas to compute haircuts
You are given an Excel worksheet with columns such as Asset, Market Value, and Haircut %. Using only cell references (no copy-paste of numeric values)...
Compute dasher pay from deliveries
Given a list of delivery events for dashers (e.g., dasherId, pickupTime, dropoffTime, distance, tip, and optional bonuses) and a set of pay rules (e.g...
Identify false MySQL foreign key statement
MySQL: Which of the following statements is false? A) A column might have a foreign key reference to itself. B) MySQL supports foreign key references ...
Implement scalable word count locally
Write a function that reads a very large text file and outputs the frequency of each word. Define your tokenization and normalization rules (case fold...
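A minimal sketch of one way to approach the prompt above, under stated assumptions: the file is streamed line by line, words are runs of Unicode letters with optional internal apostrophes, and normalization is limited to case folding. The names WORD_RE and count_words are hypothetical. The sketch also assumes that the set of distinct words fits in memory, which may not hold for every input and is not something the truncated prompt confirms.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed tokenization rule: one or more Unicode letters, optionally joined
# by apostrophes (so "cat's" stays a single token). Digits and underscores
# are treated as separators.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def count_words(lines: Iterable[str]) -> Counter:
    """Count word frequencies over an iterable of lines.

    Only the running counts are kept in memory; the input itself is
    consumed one line at a time.
    """
    counts: Counter = Counter()
    for line in lines:
        counts.update(WORD_RE.findall(line.casefold()))
    return counts


if __name__ == "__main__":
    # A file object opened with open(path, encoding="utf-8") could be passed
    # directly, since it yields lines lazily.
    demo = ["The cat's hat.", "the CAT"]
    for word, n in count_words(demo).most_common():
        print(word, n)
```

If the vocabulary itself were too large for memory, a different technique (for example, partitioning words into on-disk buckets by hash and counting each bucket separately) would be needed; that is outside this sketch.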
Implement a nested object validator
Implement a helper function validate(object, required_style) that checks whether a possibly nested object matches a provided schema. The object may co...
Parse and build binary data in Python
Using provided interfaces ByteReader(read(n), read_uint32_le, read_string) and ByteWriter(write(b), write_uint32_le, write_string), implement function...
Review a geospatial Python module
You receive a Python module that processes geospatial datasets (CSV/GeoJSON) to compute distances, cluster nearby points, and write summaries. Perform...
Compute costs with validation and sorting in Python
Implement a three-part Python task to compute costs for purchase line items. Part 1: Write compute_cost(line_items, price_db) where line_items is a li...