This question evaluates a candidate's competency in building end-to-end tabular classification pipelines, including data loading and splitting, missing-value handling, categorical encoding, feature scaling, model training and comparison, hyperparameter tuning, metric-based evaluation, model persistence, and batch inference.
You are given a tabular dataset in a CSV file and asked to build an end-to-end machine learning pipeline for a classification problem. Assume the dataset contains a column named target (binary classification by default). You may extend to multiclass if desired.
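One possible shape for the core of such a pipeline is sketched below, using pandas and scikit-learn. The helper names `build_pipeline` and `train_and_evaluate`, the 80/20 stratified split, the median and most-frequent imputation strategies, and the logistic-regression baseline are all assumptions made for illustration, and the sketch assumes `target` is encoded as 0/1.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_pipeline(X: pd.DataFrame) -> Pipeline:
    """Preprocessing plus a baseline classifier (illustrative choices only)."""
    numeric_cols = X.select_dtypes(include="number").columns.tolist()
    categorical_cols = [c for c in X.columns if c not in numeric_cols]

    # Numeric features: fill gaps with the median, then standardize.
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical features: fill gaps with the mode, then one-hot encode.
    # handle_unknown="ignore" keeps inference from failing on unseen categories.
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", LogisticRegression(max_iter=1000)),
    ])


def train_and_evaluate(csv_path: str, seed: int = 42):
    """Load the CSV, split it, fit the pipeline, and report held-out metrics."""
    df = pd.read_csv(csv_path)
    X, y = df.drop(columns=["target"]), df["target"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    # Fitting the preprocessing inside the pipeline, on the training split
    # only, avoids leaking test-set statistics into imputation and scaling.
    pipeline = build_pipeline(X_train)
    pipeline.fit(X_train, y_train)
    scores = pipeline.predict_proba(X_test)[:, 1]
    metrics = {
        "roc_auc": roc_auc_score(y_test, scores),
        "f1": f1_score(y_test, pipeline.predict(X_test)),
    }
    return pipeline, metrics
```

Because preprocessing and model live in one `Pipeline` object, the same object can be passed to a search utility such as `GridSearchCV` for tuning, or swapped for another estimator when comparing models.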
Provide a predict(input_csv_path, output_csv_path) function or CLI for batch inference.
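A minimal sketch of such a `predict` entry point follows, assuming the fitted pipeline was saved earlier with `joblib.dump`. The `model.joblib` default path, the extra `model_path` parameter, the `prediction` and `probability` output column names, and the `main` wrapper are hypothetical choices, not part of the task statement. The probability column assumes a binary target.

```python
import argparse

import joblib
import pandas as pd

MODEL_PATH = "model.joblib"  # hypothetical location of the persisted pipeline


def predict(input_csv_path: str, output_csv_path: str,
            model_path: str = MODEL_PATH) -> None:
    """Score every row of the input CSV and write predictions to the output CSV."""
    pipeline = joblib.load(model_path)
    df = pd.read_csv(input_csv_path)
    # Drop the label if the inference file happens to include it.
    features = df.drop(columns=["target"], errors="ignore")

    out = pd.DataFrame({"prediction": pipeline.predict(features)})
    if hasattr(pipeline, "predict_proba"):
        # Column 1 is the positive class in the binary case.
        out["probability"] = pipeline.predict_proba(features)[:, 1]
    out.to_csv(output_csv_path, index=False)


def main(argv=None) -> None:
    """Thin CLI wrapper around predict()."""
    parser = argparse.ArgumentParser(description="Batch inference from a CSV file.")
    parser.add_argument("input_csv_path")
    parser.add_argument("output_csv_path")
    parser.add_argument("--model", default=MODEL_PATH)
    args = parser.parse_args(argv)
    predict(args.input_csv_path, args.output_csv_path, model_path=args.model)
```

Calling `main()` from a script's entry point exposes the same logic on the command line. Persisting the whole pipeline rather than the bare model means imputation, encoding, and scaling are replayed at inference time exactly as they were fitted.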
If you choose a neural network, include a correct training loop with optimizer initialization, forward pass, loss computation, backward pass, and optimizer step.
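The loop described above might look like the following PyTorch sketch. The function name `train_binary_classifier`, the two-layer architecture, the Adam optimizer, and the hyperparameter defaults are illustrative assumptions. Inputs are assumed to be already preprocessed float32 tensors, with `y` holding 0/1 labels.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_binary_classifier(X, y, epochs=20, lr=1e-2, batch_size=32, seed=0):
    """Train a small MLP on float32 tensors X (n, d) and y (n,) with 0/1 labels."""
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(X.shape[1], 16), nn.ReLU(), nn.Linear(16, 1))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # optimizer initialization
    loss_fn = nn.BCEWithLogitsLoss()  # expects raw logits, not probabilities

    dataset = TensorDataset(X, y)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

    history = []  # mean training loss per epoch
    for _ in range(epochs):
        model.train()
        running = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()          # clear gradients from the previous step
            logits = model(xb).squeeze(1)  # forward pass
            loss = loss_fn(logits, yb)     # loss computation
            loss.backward()                # backward pass
            optimizer.step()               # parameter update
            running += loss.item() * xb.size(0)
        history.append(running / len(dataset))
    return model, history
```

Gradients accumulate across `backward()` calls in PyTorch, so the `zero_grad()` call at the start of each step is what keeps one batch's gradients from contaminating the next.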