You are given a text classification dataset for spam detection (binary labels: spam vs not_spam) in a Jupyter notebook environment.
Task
- Preprocess the text (basic cleaning/tokenization is sufficient).
- Convert the text to features suitable for Naive Bayes (e.g., bag-of-words or TF-IDF).
- Train a Naive Bayes classifier.
- Evaluate the model using the F1 score (clearly state whether it is the F1 for the positive class or a specific averaging scheme).
- Run the trained model on a few test examples and show the predicted labels (and optionally probabilities).
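The steps above could be sketched roughly as follows, assuming scikit-learn is available; the toy texts, labels, and split ratio here are illustrative placeholders, not part of the actual dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import f1_score

# Placeholder data; replace with the real dataset.
texts = [
    "win a free prize now", "limited offer click here",
    "meeting at 10am tomorrow", "lunch with the team today",
    "free cash claim now", "project update attached",
]
labels = [1, 1, 0, 0, 1, 0]  # 1 = spam (positive class), 0 = not_spam

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, random_state=0, stratify=labels
)

# Fit the vectorizer on training data only (no leakage into the test split).
vec = TfidfVectorizer(lowercase=True)
Xtr = vec.fit_transform(X_train)
Xte = vec.transform(X_test)

clf = MultinomialNB()
clf.fit(Xtr, y_train)

preds = clf.predict(Xte)
probs = clf.predict_proba(Xte)  # optional class probabilities

# F1 reported here is for the positive class (spam, label 1).
print("F1 (positive class = spam):", f1_score(y_test, preds))
print("Predictions:", list(zip(X_test, preds)))
```

`f1_score` defaults to `average="binary"` with `pos_label=1`, i.e., the F1 of the spam class; pass `average="macro"` or `"weighted"` for an averaging scheme instead.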
Constraints / Notes
- The dataset may be class-imbalanced.
- Avoid data leakage: fit the text vectorizer on training data only.
- If only one labeled set is provided, you may choose a reasonable train/validation split.
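Since the dataset may be imbalanced, a stratified split keeps the spam/not_spam ratio the same in both halves. A minimal sketch, assuming scikit-learn; the data below is a placeholder:

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Placeholder imbalanced data: 2 spam vs 8 not_spam.
texts = ["free prize now"] * 2 + ["see you at the meeting"] * 8
labels = [1] * 2 + [0] * 8

# stratify=labels preserves the class ratio in both splits,
# so the rare spam class appears in train and validation alike.
X_train, X_val, y_train, y_val = train_test_split(
    texts, labels, test_size=0.3, random_state=0, stratify=labels
)
print("train:", Counter(y_train), "val:", Counter(y_val))
```

Without `stratify`, a random split of a small imbalanced set can leave one side with no spam examples at all, making the F1 score undefined or misleading.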