Train and analyze a classifier

This is a Data Manipulation (SQL/Python) interview question from OpenAI for Machine Learning Engineer roles.

Question

Given a labeled dataset for binary classification, implement an end-to-end Python solution to train and analyze a classifier. Tasks:

(1) perform EDA (missingness, outliers, leakage checks, target/feature drift over time),
(2) create time-aware, stratified train/validation/test splits with proper cross-validation,
(3) build a strong baseline and at least one improved model,
(4) handle class imbalance (cost-sensitive loss, resampling, thresholds),
(5) tune hyperparameters without leakage,
(6) compute and compare metrics (ROC-AUC, PR-AUC, F1, calibration/Brier, confusion matrix at chosen threshold),
(7) conduct error analysis by slice and feature,
(8) produce a reproducible training script with CLI, config, and seed control,
(9) explain feature importance/SHAP and validate with ablations, and
(10) document risks, fairness checks, and monitoring hooks for production.

Provide code snippets and explain your design choices.
