Meta Data Engineer Data Manipulation (SQL/Python) Interview Questions
Master your tech interview with our curated database of real questions from top companies.
Optimize SQL to minimize scans
Given a large analytics query, refactor it to minimize table scans. 1) Replace unnecessary CTEs that cause multiple scans with inline aggregations or ...
Solve library SQL and Python tasks
You are given a library domain. Assume these tables: - books(book_id, author_id, title) - authors(author_id, name) - copies(copy_id, book_id, conditio...
Find customer with max rentals in consecutive weeks
You are given a table purchases(customer_id INT, purchase_date DATE, rented_copies INT). Consider only dates in calendar year 2024. Define a full week...
Solve Python and SQL data tasks
Complete both tasks: 1) Python: Implement a function flatten(nested) that takes a list whose elements are integers or arbitrarily nested lists of inte...
Compute reservation diff for largest member
Given copies(copy_id, reserved_by_member_id) and members(member_id, referred_by_member_id), find the member with the largest member_id. Return a singl...
Return count and renewal percentage of unreturned good copies
Tables: copies(copy_id, condition), checkouts(copy_id, checkout_date, return_date, renewal_count). Write a single SQL query that returns one row with ...
Tackle Python tasks under time pressure
In a 15-minute coding round, implement a small Python function or class to solve a well-scoped problem within about 5 minutes of coding. 1) State 1–2 ...
Find top 3 books by total borrowed time
Using copies(copy_id, book_id) and checkouts(copy_id, checkout_date, return_date), compute for each book_id the total borrowed duration as the sum ove...
Write SQL for library analytics
Given a library database, write SQL to answer: 1) How many books are in good condition and not returned? Among these books, what is the percentage tha...
Aggregate Netflix metrics in SQL
Question Netflix video-streaming analytics SQL: Write a simple aggregation (e.g., total watch-time per day). Build a cumulative metric: today’s metric...
Query carpool ride metrics
Question For a ride-sharing product with carpool capability, answer a series of SQL questions (e.g., daily completed pooled rides, average seats utili...
Count active follow connections
Question Write SQL to return the current number of active follow connections. Events table columns: user_id, target_id, event_type ('request_follow', ...
Write SQL for active follow connections
Table: follow_events(requester_id INT, target_id INT, event STRING CHECK (event IN ('request_follow','follow_success','follow_reject','unfollow')), ev...
Recommend two-hop follows in Python
Given a directed "follows" graph as a Python dict[str, list[str]], implement recommend_two_hop(graph, user) that returns the set (or a sorted list) of...
Return top-3 content per category
Given a collection of items with fields (content_id, category, rating), implement top_k_by_category(items, k= 3) that returns, for each category, the ...
Compute cumulative metrics with full joins
Tables: - daily_metrics(date DATE, content_id STRING, daily_value BIGINT) - cumulative_metrics(date DATE, content_id STRING, cumulative_value BIGINT) ...
Write SQL for library analytics
Given a library database, write SQL to answer the following: 1) Count the number of books that are currently not returned (i.e., still checked out) an...
Write SQL and Python for data prep
Given clickstream events (user_id, event_type, ts, properties) and a users table (user_id, signup_ts, plan), write SQL to compute DAU/WAU/MAU, D1/W1 r...