Data Engineer Interview Questions
Practice the exact questions companies are asking right now.
Solve SQL and Python coding tasks
You are given a small library system with the following relational schema and several Python data-processing tasks. Answer the SQL questions and imple...
Evaluate impact of short videos in feed
Scenario You work on a social app’s main News Feed. The team wants to introduce a short-form video module ("Reels") into the feed. Prompt 1. How would...
Write SQL for car rental utilization by city
SQL / Data Query Prompt (Car Rental) You are given four tables: user - user_id location - location_id - city car - car_id - car_size (e.g., compact, m...
Model entities for feed content and shares
Scenario You are designing the data model for a social app’s News Feed that shows multiple content types (text, image, short video). Users can interac...
Write SQL for session analytics
You are given two tables for an e-commerce product. Tables 1) shops A shop dimension table that contains duplicate rows for the same shop_id. Columns:...
Design batch and streaming ETL architecture
System Design: End-to-End Data Platform for Product Analytics (Batch + Near-Real-Time) Context Design a scalable data platform for a large consumer pr...
Implement most_frequent_key without using max()
Problem (Python OOP) You are given two classes. Parent precomputes frequency counts of items (as strings) from an input list. `python class Parent: ...
Compute capacities after site closures
You are given a nested dictionary redistribution where redistribution[closed_site][dest_site] equals the additional capacity required at dest_site if ...
Tackle Python tasks under time pressure
In a 15-minute coding round, implement a small Python function or class to solve a well-scoped problem within about 5 minutes of coding. 1) State 1–2 ...
Solve four algorithmic library problems
Solve the following coding tasks: 1) Maximum Points from Different Categories: Given an array of items (category, points) and an integer k, choose exa...
Recommend two-hop follows in Python
Given a directed "follows" graph as a Python dict[str, list[str]], implement recommend_two_hop(graph, user) that returns the set (or a sorted list) of...
Compute missing letters to form original string
Implement a function that, given two strings original and typed (typed is a misspelled/partial version of original), returns the number of additional ...
Check carpool trip feasibility
You are given a list of trips where each trip i is (passengers_i, start_i, end_i) with start_i < end_i on a one-dimensional route. A single vehicle wi...
Validate carpool capacity
Question LeetCode 1094. Car Pooling – Given trips[i] = [numPassengers, start, end] and an integer capacity, return true if the vehicle can fulfill all...
Solve library coding tasks in Python
Implement the following Python tasks: 1) Given a list of (category, points) for books, choose up to 3 books with all different categories to maximize ...
Analyze private-account product metrics
Private Accounts Feature: Product Understanding, Data Model, Metrics, and Root-Cause Analysis Context You are working on a social-network feature that...
Define and analyze product metrics
Product Analytics Case: Short‑Form Video Feed Context: You are evaluating a short‑form video feed feature inside a large social app where users swipe ...
Validate alternating checkout/return logs
Given a chronological list of events logs of the form (timestamp, book_id, is_checkout) where is_checkout is True for a checkout and False for a retur...
Visualize Netflix metric trends
Visualizing a Streaming Metric for Netflix Prompt Choose one streaming metric (for example, Daily Active Viewers or Average Watch Duration) and descri...
Define success metrics for a social feed
Define Success Metrics for a Social Feed Feature You are evaluating a change to the main social feed in a large-scale consumer app. Assume events are ...