Debug ML pipeline and build text parser | Scale AI
Scale AI
Sep 6, 2025, 12:00 AM
Machine Learning Engineer
Technical Screen
Data Manipulation (SQL/Python)
Given raw text files with noisy formatting, implement a robust parser that outputs structured examples; handle delimiters, quoting/escaping, encodings/Unicode, missing fields, and malformed lines, and describe how you would test it.
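One way to approach the parser is sketched below. It is a minimal sketch, not a reference solution. It assumes a hypothetical three-field schema (`id`, `text`, `label`), a single-character delimiter, double-quote quoting with backslash escaping, and UTF-8 input with a Latin-1 fallback. The names `Example`, `decode_bytes`, and `parse_text` are illustrative and do not come from the original question.

```python
import csv
import io
import unicodedata
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical schema: every record should carry these three fields.
FIELDS = ("id", "text", "label")


@dataclass
class Example:
    id: str
    text: str
    label: Optional[str]  # None when the label is missing or empty


def decode_bytes(raw: bytes) -> str:
    """Decode as UTF-8 (dropping a BOM if present); fall back to Latin-1."""
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode("latin-1")  # maps every byte, so it cannot fail


def parse_text(raw: bytes, delimiter: str = ",") -> Tuple[List[Example], List[Tuple[int, str]]]:
    """Return (examples, rejects); rejects are (line_number, reason) pairs."""
    text = unicodedata.normalize("NFC", decode_bytes(raw))
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter,
                        quotechar='"', escapechar="\\")
    examples: List[Example] = []
    rejects: List[Tuple[int, str]] = []
    try:
        for row in reader:
            lineno = reader.line_num  # physical line where this record ended
            if not row or all(not cell.strip() for cell in row):
                continue  # skip blank lines
            if len(row) > len(FIELDS):
                rejects.append((lineno, f"expected at most {len(FIELDS)} fields, got {len(row)}"))
                continue
            # Pad short rows so that missing trailing fields become empty strings.
            cells = [cell.strip() for cell in row] + [""] * (len(FIELDS) - len(row))
            rec_id, body, label = cells
            if not rec_id or not body:
                rejects.append((lineno, "missing required field"))
                continue
            examples.append(Example(rec_id, body, label or None))
    except csv.Error as exc:
        # The csv module could not continue; record the failure and stop here.
        rejects.append((reader.line_num, f"csv error: {exc}"))
    return examples, rejects
```

Returning rejected lines alongside parsed examples, rather than raising on the first bad line, keeps one malformed record from hiding the rest and gives tests something concrete to assert on. Test inputs worth covering include a BOM, doubled quotes inside a quoted field, a non-UTF-8 byte, a blank line, a short row, and a row with too many fields.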
In a provided ML project (data loading, preprocessing, training, evaluation), identify and fix three defects (e.g., an off-by-one index error in tokenization, train/test leakage, incorrect loss reduction, nondeterministic seeding, or shape mismatches). Explain your rapid debugging approach (reading stack traces, adding assertions, binary-search logging, building minimal repros).
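Several of the defect classes listed above can be caught with small, fail-fast checks. The sketch below uses only the standard library so it runs anywhere; the helper names (`split_indices`, `assert_no_leakage`, `mean_loss`, `windows`) are hypothetical and stand in for whatever the provided project actually contains.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(n: int, test_fraction: float, seed: int) -> Tuple[List[int], List[int]]:
    """Deterministic shuffle-and-split using a local RNG, not the global one."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]  # (train, test)


def assert_no_leakage(train_keys: Sequence[str], test_keys: Sequence[str]) -> None:
    """Fail fast if any example key appears in both splits."""
    overlap = set(train_keys) & set(test_keys)
    assert not overlap, f"{len(overlap)} examples appear in both train and test"


def mean_loss(per_example_losses: Sequence[float]) -> float:
    """Mean reduction: the value should not grow with batch size."""
    assert len(per_example_losses) > 0, "empty batch"
    return sum(per_example_losses) / len(per_example_losses)


def windows(tokens: Sequence[str], size: int) -> List[Tuple[str, ...]]:
    """All contiguous windows of `size` tokens.

    The `+ 1` in the range bound is the classic off-by-one spot:
    without it the last window is silently dropped.
    """
    assert size > 0, "window size must be positive"
    return [tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]
```

A quick way to use these during debugging is to call `assert_no_leakage` right after the split, compare `mean_loss` on a batch and on the same batch duplicated (the values should match), and check `windows` against a three-token input whose expected output can be written by hand.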
Describe how you would validate the fixes under a 60-minute time limit (unit tests, end-to-end run, metrics sanity checks, and regression guards).
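Under a tight time limit, a handful of cheap checks tends to give the most signal per minute. The sketch below shows one possible shape for them, assuming a classification task scored by accuracy. The functions `accuracy`, `majority_baseline`, and `sanity_check_metric` and the `RegressionGuards` test case are illustrative names, not part of the original project.

```python
import unittest
from collections import Counter
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of matching labels; rejects empty or mismatched inputs."""
    assert len(y_true) == len(y_pred) and len(y_true) > 0, "length mismatch or empty input"
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train: Sequence[int], n: int) -> List[int]:
    """Predict the most frequent training label for every test example."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * n


def sanity_check_metric(y_train: Sequence[int], y_test: Sequence[int],
                        y_pred: Sequence[int]) -> float:
    """Return accuracy after checking its range and comparing it to the baseline."""
    acc = accuracy(y_test, y_pred)
    base = accuracy(y_test, majority_baseline(y_train, len(y_test)))
    assert 0.0 <= acc <= 1.0, "accuracy out of range"
    assert acc >= base, f"accuracy {acc:.3f} is below majority baseline {base:.3f}"
    return acc


class RegressionGuards(unittest.TestCase):
    """Pin down behaviour that a fix must not break again."""

    def test_accuracy_known_value(self):
        self.assertAlmostEqual(accuracy([1, 0, 1, 1], [1, 1, 1, 0]), 0.5)

    def test_length_mismatch_is_rejected(self):
        with self.assertRaises(AssertionError):
            accuracy([1, 0], [1])

    def test_prediction_at_baseline_passes(self):
        acc = sanity_check_metric([0, 0, 1], [0, 1, 0, 0], [0, 1, 0, 1])
        self.assertAlmostEqual(acc, 0.75)
```

If this were saved in a test file, `python -m unittest` would run the guards. In a 60-minute session a reasonable order is: unit tests for each fix first, then one short end-to-end run on a small data subset, then the metric sanity check on its output.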