Machine Learning Engineer Data Manipulation (SQL/Python) Interview Questions
Master your tech interview with our curated database of real questions from top companies.
Debug ML pipeline and build text parser
- Given raw text files with noisy formatting, implement a robust parser that outputs structured examples; handle delimiters, quoting/escaping, encodin...
Implement a robust Python generator
Given a list of integers, write a Python generator that yields the integers from the list while handling edge cases such as None values, empty input, ...
Implement and vectorize NumPy Conv2D
Implement a 2D convolution operation from scratch using NumPy only (no TensorFlow or PyTorch). Assume NCHW input shape (N, C_in, H_in, W_in) and weigh...
Load and prepare JSON for modeling
Using Python in a Jupyter notebook, load a JSON dataset with fields: ( 1) hours spent reading A posts (float), ( 2) hours spent reading B posts (float...
Transform flat keys into nested dictionary
You are given a flat collection of parameter keys like ['layer1.attention.q_proj.weight', 'layer1.attention.k_proj.weight', 'layer1.mlp.fc1.weight', ....
Compute nearest index within threshold after walking distances
You are given: ( 1) points: a list of N 2D coordinates in miles, points[i] = [x_i, y_i], ordered; ( 2) distances: a list of M nonnegative floats (mile...
Implement vectorized NumPy ops and explain broadcasting
Implement vectorized NumPy code for: (a) computing pairwise cosine similarity between two real-valued matrices X (shape n×d) and Y (shape m×d) without...
Train and analyze a classifier
Given a labeled dataset for binary classification, implement an end-to-end Python solution to train and analyze a classifier. Tasks: ( 1) perform EDA ...
Implement simulation-based portfolio optimizer in Python
Given a pandas DataFrame 'returns' of daily asset returns (index: dates; columns: tickers) and an annualized risk‑free rate r_f, implement a simulatio...
Process CSV for portfolio returns and metrics
Given one or more CSV files containing daily asset prices or returns and optional portfolio weights, write Python (pandas) code to: a) load, clean, an...