Google Data Manipulation (SQL/Python) Interview Questions
Calculate User Deviation from Team Average Messages
usage_stats
+---------+---------+---------------+------------+
| user_id | team_id | messages_sent | date       |
+---------+---------+---------------...
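The preview above is cut off, so the exact definition of "deviation" is not visible. As a rough sketch only, the snippet below assumes it means each user's total messages minus the average of the per-user totals on the same team. The function name `deviation_from_team_average` and the sample rows are hypothetical and not taken from the question.

```python
from collections import defaultdict


def deviation_from_team_average(rows):
    """Return {(team_id, user_id): user_total - team_average}.

    rows: iterable of (user_id, team_id, messages_sent) tuples.
    Assumption: the team average is the mean of per-user totals.
    """
    # Sum messages per user within each team.
    totals = defaultdict(int)
    for user_id, team_id, messages_sent in rows:
        totals[(team_id, user_id)] += messages_sent

    # Collect the per-user totals for each team and average them.
    per_team = defaultdict(list)
    for (team_id, _user_id), total in totals.items():
        per_team[team_id].append(total)
    team_avg = {team: sum(vals) / len(vals) for team, vals in per_team.items()}

    # Deviation is the user's total minus the team's average total.
    return {
        (team_id, user_id): total - team_avg[team_id]
        for (team_id, user_id), total in totals.items()
    }


# Hypothetical sample data: (user_id, team_id, messages_sent)
sample_rows = [(1, "A", 10), (1, "A", 20), (2, "A", 10), (3, "B", 5)]
deviations = deviation_from_team_average(sample_rows)
```

In SQL the same idea would typically be written with `AVG(...) OVER (PARTITION BY team_id)`; the dictionary version here only illustrates the arithmetic.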
Analyze User Flags and Review Outcomes for Moderation Prioritization
UserFlags
+---------------+--------------+----------+---------+
| User_FirstName| User_LastName| Video_ID | Flag_ID |
+---------------+--------------+...
Sample and Simulate Price Adjustments in R with dplyr
Products
+----+-----------+-------+
| id | product   | price |
| 1  | phone     | 500   |
| 2  | tablet    | 300   |
| 3  | laptop    | 1000  |
| 4 |...
Design Scalable Database and Analyze E-commerce Data
transactions
+-----------+----------+------------+------------+
| user_id   | order_id | product_id | order_time |
+-----------+----------+-----------...
Generate binomial matrix and column-normalize
Using Python with NumPy, generate a 100×100 matrix of Binomial(n = 10, p = 0.3) draws with a fixed random seed, then normalize each column so it sums ...
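The prompt above is truncated before it states the normalization target. A minimal sketch follows, assuming each column should sum to 1; the seed value 42 and the zero-column guard are illustrative choices and not part of the question.

```python
import numpy as np

# Fixed seed so the draws are reproducible (42 is an arbitrary choice).
rng = np.random.default_rng(42)

# 100x100 matrix of Binomial(n=10, p=0.3) draws.
matrix = rng.binomial(n=10, p=0.3, size=(100, 100))

# Column sums, kept as shape (1, 100) so they broadcast across rows.
col_sums = matrix.sum(axis=0, keepdims=True)

# Guard against an all-zero column (very unlikely here) to avoid 0/0.
safe_sums = np.where(col_sums == 0, 1, col_sums)

# Assumption: "normalize" means dividing each column by its sum.
normalized = matrix / safe_sums
```

Using `keepdims=True` avoids relying on implicit broadcasting of a 1-D vector, which makes it easier to switch to row normalization (`axis=1`) without transposing.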
Analyze video flags and reviews with SQL
You are designing SQL queries for YouTube Trust & Safety. Use the schema and sample data below. Unless stated otherwise, treat a flag as reviewed if t...
Write SQL/Python for messy event data
Using the schema and sample data below, write: (1) a single SQL query to compute daily metrics for the local date 2025-09-01 in America/Los_Angeles, a...
Find most co‑purchased product pairs in SQL
Given the schema and sample data below, write ANSI-SQL to return the top 5 unordered product pairs most frequently purchased together across distinct ...
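The schema is not visible in the preview, so the following is only a sketch of the counting logic, written in Python and not in the ANSI-SQL the question asks for. It assumes the input is a list of hypothetical `(order_id, product_id)` rows and that pairs are counted once per distinct order.

```python
from collections import Counter
from itertools import combinations


def top_product_pairs(order_items, k=5):
    """Return the k most frequent unordered product pairs.

    order_items: iterable of (order_id, product_id) rows (assumed shape).
    Each pair is counted at most once per order.
    """
    # Collapse duplicate line items: one set of products per order.
    products_by_order = {}
    for order_id, product_id in order_items:
        products_by_order.setdefault(order_id, set()).add(product_id)

    # Sorting before combinations() makes (a, b) and (b, a) the same pair.
    pair_counts = Counter()
    for products in products_by_order.values():
        for pair in combinations(sorted(products), 2):
            pair_counts[pair] += 1

    # Highest count first; ties broken by the pair itself for determinism.
    ranked = sorted(pair_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Hypothetical sample rows: (order_id, product_id)
sample_items = [
    (1, "a"), (1, "b"), (1, "b"),
    (2, "a"), (2, "b"), (2, "c"),
    (3, "a"), (3, "c"),
]
top_pairs = top_product_pairs(sample_items)
```

The SQL equivalent is usually a self-join on the order key with a `product_id < product_id` condition, which plays the same role as sorting each pair here.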
Design a scalable video platform database
Design the relational database for a YouTube-like video company. Deliverables: 1) list the core tables with key columns, types, and constraints (users...
Deduplicate events and rank products with SQL
You are given two tables. Schema: - events(event_id INT PRIMARY KEY, user_id INT, product_id INT, event_time TIMESTAMP, idempotency_key TEXT, amount_c...
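The deduplication rule is cut off in the preview. As an illustrative sketch only, the snippet below assumes that events sharing an `idempotency_key` are duplicates, that the earliest `event_time` wins (ties broken by the lowest `event_id`), and that events with no key are kept as they are. The function name and sample rows are hypothetical.

```python
def dedupe_events(events):
    """Keep one event per idempotency_key (assumed rule: earliest wins).

    events: iterable of dicts with event_id, event_time, idempotency_key.
    event_time is assumed to be an ISO-8601 string, so plain string
    comparison matches chronological order.
    """
    best = {}      # idempotency_key -> chosen event
    unkeyed = []   # events without a key cannot be matched to duplicates

    for event in events:
        key = event["idempotency_key"]
        if key is None:
            unkeyed.append(event)
            continue
        current = best.get(key)
        rank = (event["event_time"], event["event_id"])
        if current is None or rank < (current["event_time"], current["event_id"]):
            best[key] = event

    # Return in event_id order so the output is deterministic.
    return sorted(list(best.values()) + unkeyed, key=lambda e: e["event_id"])


# Hypothetical sample rows
sample_events = [
    {"event_id": 1, "idempotency_key": "k1", "event_time": "2025-01-01T10:00:00"},
    {"event_id": 2, "idempotency_key": "k1", "event_time": "2025-01-01T09:00:00"},
    {"event_id": 3, "idempotency_key": None, "event_time": "2025-01-01T11:00:00"},
    {"event_id": 4, "idempotency_key": "k2", "event_time": "2025-01-01T12:00:00"},
]
deduped = dedupe_events(sample_events)
```

In SQL this pattern is commonly expressed with `ROW_NUMBER() OVER (PARTITION BY idempotency_key ORDER BY event_time, event_id)` and a filter on the first row.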
Implement R dplyr simulation and left join
Using R and dplyr, run a simulation and a join. Data:
prices
item_id | price_usd
1 | 10.00
2 | 20.00
3 | 30.00
4 | 40.00
catalog
item_id | category
1 ...
Compute violation rate and flag precision in SQL
You are analyzing a Trust & Safety product in BigQuery. Assume 'today' is 2025-09-01 (UTC). Define precise metrics and write SQL to compute them, bein...
Add a conditional column in Python
Using pandas, add a derived column to a table based on multiple conditions with strict precedence and missing-value handling. Given the sample DataFra...
Compute monthly CRR with merges and gaps
You are given PostgreSQL tables user_profile(user_id, signup_ts, country, is_employee, is_test), user_events(user_id, event_ts, event_type, revenue, p...
Implement a robust Python generator
Given a list of integers, write a Python generator that yields the integers from the list while handling edge cases such as None values, empty input, ...
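The list of edge cases is truncated in the preview. One possible sketch is shown below, assuming that `None` entries and non-integer entries (including booleans) should be skipped silently and that a `None` or empty input yields nothing. The name `iter_ints` is hypothetical.

```python
def iter_ints(values):
    """Lazily yield the integers in values, skipping unusable entries.

    Assumptions (not stated in the visible prompt): None items and
    non-int items are skipped; a None input behaves like an empty one.
    """
    if values is None:
        return  # nothing to yield for a missing input

    for value in values:
        if value is None:
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            continue
        yield value


# Usage: consume the generator into a list.
cleaned = list(iter_ints([1, None, 2, "3", True, 4]))
```

Whether bad entries should be skipped or should raise an error is a design choice worth confirming with the interviewer; skipping keeps the generator usable in streaming pipelines.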
Calculate Top Countries' Gmail Usage and MoM Change
emails
+----+---------+-----------+-----------+------------+
| id | user_id | country   | provider  | send_date  |
+----+---------+-----------+-------...