Google Data Scientist Coding & Algorithms Interview Questions
Implement Sampling and Minimize Loss in Numerical Coding
Scenario: Numerical coding challenges on sampling and loss minimization. Question: a) Implement functions to sample from truncated normal distributions ...
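One simple way to approach the sampling part is rejection sampling: draw from the untruncated normal and keep only the draws that land inside the bounds. The sketch below assumes that approach; the function name `sample_truncated_normal` and its parameters are illustrative and do not come from the question text.

```python
import random


def sample_truncated_normal(mu, sigma, low, high, size, seed=0):
    """Rejection sampler for a normal(mu, sigma) restricted to [low, high].

    Hypothetical helper: simple but slow when [low, high] lies far in a tail,
    because most proposals are rejected.
    """
    if low >= high:
        raise ValueError("low must be strictly less than high")
    rng = random.Random(seed)
    samples = []
    while len(samples) < size:
        x = rng.gauss(mu, sigma)
        if low <= x <= high:  # keep only draws inside the truncation bounds
            samples.append(x)
    return samples
```

When the interval sits far from the mean, an inverse-CDF method would likely be a better fit than rejection.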
Remove Duplicates While Preserving Order in List
Scenario: A data pipeline receives an unordered list of IDs containing duplicates; downstream components require a duplicate-free list while preserving...
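A minimal sketch of one common answer, assuming the IDs are hashable and that "preserving" refers to first-occurrence order. The function name `dedupe_preserve_order` is illustrative.

```python
def dedupe_preserve_order(ids):
    """Return ids without duplicates, keeping the first occurrence of each.

    Relies on dict keys keeping insertion order (guaranteed since Python 3.7).
    Runs in O(n) time with O(n) extra space.
    """
    return list(dict.fromkeys(ids))
```

For example, `dedupe_preserve_order([3, 1, 3, 2, 1])` returns `[3, 1, 2]`.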
Normalize Columns in Binomial Matrix Efficiently
Scenario: Write code that creates a 100×100 matrix of Binomial(1, 0.5) samples and normalizes each column so it sums to 1. Question: Provide an efficie...
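A possible vectorized sketch using NumPy broadcasting. The function name, the seed, and the choice to leave an all-zero column as zeros are assumptions made for illustration; the question text does not say how to treat a column whose sum is 0.

```python
import numpy as np


def normalized_binomial_matrix(n=100, p=0.5, seed=0):
    """Draw an n x n Binomial(1, p) matrix and scale each column to sum to 1."""
    rng = np.random.default_rng(seed)
    m = rng.binomial(1, p, size=(n, n)).astype(float)
    col_sums = m.sum(axis=0, keepdims=True)  # shape (1, n), broadcasts over rows
    # Assumption: an all-zero column stays all zeros instead of dividing by 0.
    return np.divide(m, col_sums, out=np.zeros_like(m), where=col_sums != 0)
```

The division is a single broadcast operation, so no Python-level loop over columns is needed.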
Determine If Two Strings Are Anagrams Efficiently
Scenario: Backend service needs to verify whether two user-provided strings are anagrams for text-matching features. Question: Implement a Python functi...
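A minimal sketch based on character counts. It assumes exact, case-sensitive matching with no normalization of whitespace or Unicode; the name `are_anagrams` is illustrative, and a real service might need different rules.

```python
from collections import Counter


def are_anagrams(a: str, b: str) -> bool:
    """True if a and b contain the same characters with the same multiplicities.

    O(len(a) + len(b)) time; the length check is a cheap early exit.
    """
    if len(a) != len(b):
        return False
    return Counter(a) == Counter(b)
```

Counting avoids the O(n log n) cost of sorting both strings.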
Implement percentage RMSE and bootstrap its CI
Given a CSV with columns [country, actual_revenue, predicted_revenue], define percentage RMSE as pRMSE = sqrt(mean_i((pred_i/actual_i − 1)^2)). a) Imp...
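The pRMSE definition above, together with a percentile bootstrap for its confidence interval, might be sketched as follows. The percentile method, the function names, and the defaults `n_boot=2000` and `alpha=0.05` are assumptions for illustration. The code also assumes every actual value is non-zero.

```python
import math
import random


def prmse(actual, predicted):
    """sqrt(mean_i((pred_i / actual_i - 1)^2)); assumes no actual value is 0."""
    sq = [(p / a - 1.0) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(sq) / len(sq))


def bootstrap_ci(actual, predicted, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample rows with replacement, recompute pRMSE."""
    rng = random.Random(seed)
    n = len(actual)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(prmse([actual[i] for i in idx], [predicted[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

Rows are resampled as (actual, predicted) pairs so that each prediction stays matched to its own actual value.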
Compute precision–recall curve on imbalanced data
You receive a CSV with columns: actual_label ∈ {0,1} and predicted_prob ∈ [0,1]; the positive class rate is ≈5%. a) Which evaluation metrics would you...
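As a rough sketch, a precision-recall curve can be built by sorting rows by predicted_prob in descending order and sweeping a threshold across the distinct scores. The function name and the (threshold, precision, recall) tuple layout are illustrative choices and are not taken from the question.

```python
def precision_recall_points(labels, probs):
    """Return (threshold, precision, recall) for each distinct predicted score.

    A row is predicted positive when its score is >= threshold.
    """
    total_pos = sum(labels)
    pairs = sorted(zip(probs, labels), key=lambda t: -t[0])
    points = []
    tp = fp = 0
    i = 0
    while i < len(pairs):
        thr = pairs[i][0]
        # Consume every row tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == thr:
            tp += pairs[i][1]
            fp += 1 - pairs[i][1]
            i += 1
        recall = tp / total_pos if total_pos else 0.0
        points.append((thr, tp / (tp + fp), recall))
    return points
```

Grouping tied scores keeps the curve independent of the arbitrary order among rows that share a score.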
Minimize L2, L1, and quantile losses
Given an array X of n real numbers, derive the value θ that minimizes the sum of squared deviations Σ(xi−θ)² (mean) and the sum of absolute deviations...
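The standard results (the mean minimizes squared loss, the median minimizes absolute loss, and the q-quantile minimizes pinball loss) can be checked numerically. The sketch below does a brute-force grid search; the helper names and the pinball-loss form used for the quantile case are assumptions, since the question text is cut off.

```python
def l2_loss(xs, theta):
    return sum((x - theta) ** 2 for x in xs)


def l1_loss(xs, theta):
    return sum(abs(x - theta) for x in xs)


def pinball_loss(xs, theta, q):
    # Under-predictions are weighted by q, over-predictions by (1 - q).
    return sum(q * (x - theta) if x >= theta else (1 - q) * (theta - x) for x in xs)


def grid_argmin(loss, lo=0.0, hi=12.0, step=0.01):
    """Brute-force minimizer over an evenly spaced grid (illustration only)."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return min(grid, key=loss)
```

For `xs = [1, 2, 4, 7, 11]`, the grid minimizers should land near the mean (5), the median (4), and, for `q = 0.75`, the sample point 7.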
Implement longest subarray summing to k
Given an integer array nums (length ≤ 200,000; values may be negative) and integer k, return the maximum length and the [l, r] indices of a contiguous...
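Because values may be negative, a sliding window does not work here. A common alternative is a hash map from each prefix sum to the earliest index where it occurs. The sketch below assumes inclusive 0-based indices and returns `(0, None)` when no subarray qualifies. The function name and the tie-breaking rule (the earliest-ending longest subarray wins) are assumptions, because the full specification is truncated.

```python
def longest_subarray_sum_k(nums, k):
    """Return (max_length, (l, r)) of the longest contiguous run summing to k."""
    first_seen = {0: -1}  # prefix sum -> earliest index where it occurs
    prefix = 0
    best_len, best_range = 0, None
    for r, x in enumerate(nums):
        prefix += x
        l = first_seen.get(prefix - k)
        if l is not None and r - l > best_len:
            best_len, best_range = r - l, (l + 1, r)
        # Keep only the earliest index so later matches are as long as possible.
        first_seen.setdefault(prefix, r)
    return best_len, best_range
```

This is a single pass, O(n) time and O(n) space, which fits comfortably within the 200,000-element bound.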
Simulate Coin Flips to Determine Fairness via Empirical Distribution
Scenario: You must test whether a coin is fair by simulation. Question: Write code that repeatedly simulates n coin flips, records the number of heads, ...
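A sketch of the simulation step, followed by one possible way to use the resulting empirical distribution: a two-sided empirical p-value for an observed head count. The p-value step and both function names are assumptions, since the question text is cut off after the recording step.

```python
import random
from collections import Counter


def simulate_head_counts(n, trials, p=0.5, seed=0):
    """Empirical distribution of the number of heads in n flips, over many trials."""
    rng = random.Random(seed)
    return Counter(sum(rng.random() < p for _ in range(n)) for _ in range(trials))


def empirical_p_value(observed_heads, n, trials=10_000, seed=0):
    """Share of fair-coin simulations at least as far from n/2 as the observation."""
    counts = simulate_head_counts(n, trials, seed=seed)
    deviation = abs(observed_heads - n / 2)
    extreme = sum(c for heads, c in counts.items() if abs(heads - n / 2) >= deviation)
    return extreme / trials
```

A small p-value suggests the observed count would be unusual for a fair coin. The cutoff for "small" is a judgment call that the question, as shown, does not specify.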
Implement anagram check and stable deduplication
Part A — Anagram checker: Write a function is_anagram(a: str, b: str, locale: str = 'en') -> bool that returns True iff a and b are anagrams under the...
Implement piecewise linear interpolation for time-to-empty
Time-to-Empty from a Discharge Curve (Piecewise Linear Interpolation): Implement a function time_to_empty(checkpoints, current_soc) that returns the nu...