Debug and Fix a Transformer Text Classifier, Then Train and Evaluate It
Context
You inherit a small codebase for a transformer-based text classifier. There are four failing unit tests: two correspond to previously documented ("known") issues; two are unexpected ("novel"). Your task is to make the model train and evaluate correctly, and to demonstrate a robust training/evaluation pipeline on a labeled dataset.
Assumptions (to make the task self-contained):
- Language: Python 3.10+
- Libraries: PyTorch, Hugging Face Transformers, scikit-learn, pandas, numpy
- Dataset: a CSV file with columns `text` (string) and `label` (int); single-label classification with K classes
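The assumed CSV schema above can be validated up front before any training code runs. A minimal sketch, assuming pandas is available; the helper name `load_dataset` is illustrative, not part of the inherited codebase:

```python
import pandas as pd

def load_dataset(path_or_buf) -> pd.DataFrame:
    """Load the classification CSV and check the assumed schema."""
    df = pd.read_csv(path_or_buf)
    missing = {"text", "label"} - set(df.columns)
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    if not pd.api.types.is_integer_dtype(df["label"]):
        raise TypeError("`label` must be an integer dtype")
    return df
```

Failing fast on a malformed file is cheaper than debugging a shape or dtype error deep inside the training loop.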
Tasks
- Identify the root cause of each failing test (two known bugs, two novel bugs), and fix the model/training code so all tests pass.
- Provide a clean, minimal reference implementation of the model and training loop that avoids these bugs.
- Given a labeled dataset, analyze class balance and basic feature distributions (e.g., text length, token frequency), then train the classifier.
- Report key performance metrics (accuracy, precision, recall, F1; ROC-AUC when binary), and include guardrails for class imbalance and reproducibility.
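For the reference-implementation task, a minimal sketch of the intended shape in PyTorch follows. The class name, hyperparameters, and `train_step` helper are illustrative assumptions, not the inherited codebase's API; the comments flag two classic training-loop bugs of the kind this exercise targets:

```python
import torch
import torch.nn as nn

class TinyTransformerClassifier(nn.Module):
    def __init__(self, vocab_size: int, num_classes: int, d_model: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        # Emit raw logits: CrossEntropyLoss applies log-softmax itself, so a
        # softmax here would be a bug.
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        pad_mask = input_ids == 0  # True at padding positions
        h = self.encoder(self.embed(input_ids), src_key_padding_mask=pad_mask)
        # Mean-pool over non-padding positions only; pooling over padding is
        # another common silent bug.
        keep = (~pad_mask).unsqueeze(-1).float()
        pooled = (h * keep).sum(1) / keep.sum(1).clamp(min=1.0)
        return self.head(pooled)

def train_step(model, optimizer, input_ids, labels):
    model.train()
    optimizer.zero_grad()  # forgetting this accumulates stale gradients
    loss = nn.functional.cross_entropy(model(input_ids), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The same structure extends naturally to a Hugging Face backbone by swapping the embedding/encoder for a pretrained model.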
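For the data-analysis task, a sketch of the class-balance and text-length summary, assuming the `text`/`label` schema above; the function name and the specific statistics reported are illustrative choices:

```python
import pandas as pd

def analyze(df: pd.DataFrame) -> dict:
    """Summarize class balance and text-length distribution."""
    counts = df["label"].value_counts().sort_index()
    lengths = df["text"].str.split().str.len()  # whitespace token counts
    return {
        "class_counts": counts.to_dict(),
        "imbalance_ratio": counts.max() / counts.min(),  # 1.0 = balanced
        "length_mean": float(lengths.mean()),
        "length_p95": float(lengths.quantile(0.95)),
    }
```

The imbalance ratio informs whether class weights or stratified splits are needed, and the 95th-percentile length is a reasonable basis for choosing a truncation limit.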
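For the metrics task, a sketch using scikit-learn; macro averaging is the imbalance guardrail here because it weights every class equally instead of letting the majority class dominate. The helper name and output keys are illustrative:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             roc_auc_score)

def report_metrics(y_true, y_pred, y_score=None) -> dict:
    """Macro-averaged metrics; ROC-AUC only when the task is binary."""
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    out = {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision_macro": prec,
        "recall_macro": rec,
        "f1_macro": f1,
    }
    if y_score is not None and len(np.unique(y_true)) == 2:
        out["roc_auc"] = roc_auc_score(y_true, y_score)
    return out
```

For reproducibility, pair this with fixed seeds (`random`, `numpy`, `torch`) and a held-out split created with a fixed `random_state`.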