Meta Data Engineer Interview Questions
Optimize SQL to minimize scans
Given a large analytics query, refactor it to minimize table scans. 1) Replace unnecessary CTEs that cause multiple scans with inline aggregations or ...
Solve SQL and Python coding tasks
You are given a small library system with the following relational schema and several Python data-processing tasks. Answer the SQL questions and imple...
Solve library SQL and Python tasks
You are given a library domain. Assume these tables: - books(book_id, author_id, title) - authors(author_id, name) - copies(copy_id, book_id, conditio...
Find top 3 books by total borrowed time
Using copies(copy_id, book_id) and checkouts(copy_id, checkout_date, return_date), compute for each book_id the total borrowed duration as the sum ove...
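The prompt above is cut off, so the following is only a sketch of one plausible reading. It runs the query through Python's sqlite3 module and makes several assumptions that are not in the original: dates are stored as ISO `YYYY-MM-DD` text, duration is measured in days, checkouts with no return_date are ignored, and ties are broken by book_id. The helper name `top_books` is hypothetical.

```python
import sqlite3

# Assumption: durations are in days and unreturned checkouts are skipped.
# julianday() is SQLite-specific; other warehouses use their own date-diff function.
QUERY = """
SELECT c.book_id,
       SUM(julianday(k.return_date) - julianday(k.checkout_date)) AS total_days
FROM checkouts AS k
JOIN copies AS c ON c.copy_id = k.copy_id
WHERE k.return_date IS NOT NULL
GROUP BY c.book_id
ORDER BY total_days DESC, c.book_id
LIMIT 3
"""


def top_books(copies, checkouts):
    """Load rows into an in-memory database and return the top 3 (book_id, total_days)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE copies (copy_id INTEGER, book_id INTEGER)")
    conn.execute(
        "CREATE TABLE checkouts (copy_id INTEGER, checkout_date TEXT, return_date TEXT)"
    )
    conn.executemany("INSERT INTO copies VALUES (?, ?)", copies)
    conn.executemany("INSERT INTO checkouts VALUES (?, ?, ?)", checkouts)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

Joining checkouts to copies before grouping means durations from every copy of a book roll up into a single row per book_id, which is the step candidates most often miss.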
Define and analyze product metrics
Product Analytics Case: Short‑Form Video Feed. Context: You are evaluating a short‑form video feed feature inside a large social app where users swipe ...
Compute capacities after site closures
You are given a nested dictionary redistribution where redistribution[closed_site][dest_site] equals the additional capacity required at dest_site if ...
Design a scalable dimensional model
Design a Dimensional Model for Transactional Analytics (Concrete Example Included). You are building a star schema in a cloud data warehouse for near r...
Validate alternating checkout/return logs
Given a chronological list of events, logs, where each entry has the form (timestamp, book_id, is_checkout) and is_checkout is True for a checkout and False for a retur...
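Because the prompt is truncated, the expected return value is not known. A minimal sketch, assuming the goal is a boolean that is True only when each book's events alternate and start with a checkout, might look like this. The function name `is_valid_log` is hypothetical, and a book still checked out at the end of the log is treated as valid, which is also an assumption.

```python
def is_valid_log(logs):
    """Return True if every book alternates checkout, return, checkout, ...

    Assumptions (not in the original prompt): logs is already sorted by
    timestamp, and a book may remain checked out when the log ends.
    """
    checked_out = set()  # book_ids that are currently out
    for _timestamp, book_id, is_checkout in logs:
        if is_checkout:
            if book_id in checked_out:
                return False  # checked out twice without a return
            checked_out.add(book_id)
        else:
            if book_id not in checked_out:
                return False  # returned without a matching checkout
            checked_out.remove(book_id)
    return True
```

Tracking only the set of books that are currently out keeps the check at one pass over the log with O(1) work per event.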
Define success metrics for a social feed
Define Success Metrics for a Social Feed Feature. You are evaluating a change to the main social feed in a large-scale consumer app. Assume events are ...
Find customer with max rentals in consecutive weeks
You are given a table purchases(customer_id INT, purchase_date DATE, rented_copies INT). Consider only dates in calendar year 2024. Define a full week...
Solve Python and SQL data tasks
Complete both tasks: 1) Python: Implement a function flatten(nested) that takes a list whose elements are integers or arbitrarily nested lists of inte...
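For the Python half of this prompt, one possible approach is an iterative flatten that keeps a stack of iterators instead of recursing, so very deep nesting does not hit Python's recursion limit. This is a sketch, not a reference answer; it assumes nested containers are always of type list and that output order should match left-to-right reading order.

```python
def flatten(nested):
    """Flatten arbitrarily nested lists of integers into one flat list."""
    flat = []
    stack = [iter(nested)]  # iterators remember their position when we resume them
    while stack:
        for item in stack[-1]:
            if isinstance(item, list):
                # Descend into the sublist; the outer iterator resumes later.
                stack.append(iter(item))
                break
            flat.append(item)
        else:
            # The innermost iterator is exhausted, so go back up one level.
            stack.pop()
    return flat
```

A recursive version is shorter, but interviewers often follow up by asking what happens with thousands of nesting levels, which is where the explicit stack helps.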
Compute missing letters to form original string
Implement a function that, given two strings original and typed (typed is a misspelled/partial version of original), returns the number of additional ...
Compute reservation diff for largest member
Given copies(copy_id, reserved_by_member_id) and members(member_id, referred_by_member_id), find the member with the largest member_id. Return a singl...
Return count and renewal percentage of unreturned good copies
Tables: copies(copy_id, condition), checkouts(copy_id, checkout_date, return_date, renewal_count). Write a single SQL query that returns one row with ...
Tackle Python tasks under time pressure
In a 15-minute coding round, implement a small Python function or class to solve a well-scoped problem within about 5 minutes of coding. 1) State 1–2 ...
Design tables for event-driven metrics
Design a Relational Schema for Consumer-App Event Analytics. Context and Assumptions: You are designing the event store for a high-volume consumer app. ...
Write SQL for library analytics
Given a library database, write SQL to answer: 1) How many books are in good condition and not returned? Among these books, what is the percentage tha...
Solve four algorithmic library problems
Solve the following coding tasks: 1) Maximum Points from Different Categories: Given an array of items (category, points) and an integer k, choose exa...
Recommend friends-of-friends
Given a dictionary such as {A:[B,C], B:[C,D], C:[E]}, return for a user U all people followed by U’s followees but not already followed by U....
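The two-hop lookup described above can be sketched with plain sets. The function name `recommend` is hypothetical, and excluding the user from their own recommendations is an assumption the prompt does not state.

```python
def recommend(follows, user):
    """Return people followed by user's followees but not yet followed by user."""
    already = set(follows.get(user, []))
    recommendations = set()
    for followee in already:
        for candidate in follows.get(followee, []):
            # Assumption: never recommend the user to themselves.
            if candidate != user and candidate not in already:
                recommendations.add(candidate)
    return recommendations


graph = {"A": ["B", "C"], "B": ["C", "D"], "C": ["E"]}
print(recommend(graph, "A"))  # contains "D" and "E"; set order is not guaranteed
```

Using `follows.get(..., [])` covers users who appear only as followees and have no entry of their own, such as D and E in the example.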
Aggregate Netflix metrics in SQL
Netflix video-streaming analytics SQL: Write a simple aggregation (e.g., total watch-time per day). Build a cumulative metric: today’s metric...
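The definition of the cumulative metric is cut off in the prompt, so the sketch below assumes the simplest reading: a running sum of the daily totals in date order. It mirrors in Python what a GROUP BY followed by a windowed SUM would do in SQL. The function name `daily_and_cumulative` and the (day, minutes) event shape are hypothetical.

```python
from collections import defaultdict
from itertools import accumulate


def daily_and_cumulative(events):
    """Return (day, daily_total, running_total) tuples sorted by day.

    Assumptions: each event is (day, minutes_watched), and day is an ISO
    'YYYY-MM-DD' string so that string order equals date order.
    """
    per_day = defaultdict(float)
    for day, minutes in events:
        per_day[day] += minutes  # the simple aggregation: watch time per day

    days = sorted(per_day)
    totals = [per_day[day] for day in days]
    # accumulate() yields the running sum, i.e. the cumulative metric.
    return list(zip(days, totals, accumulate(totals)))
```

Days with no events do not appear in the output; if the real question expects a row for every calendar day, a date spine would need to be joined in first.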