Explain imbalance, metrics, bias-variance, Transformers vs. CNNs
Company: Amazon
Role: Machine Learning Engineer
Category: Machine Learning
Difficulty: hard
Interview Round: Technical Screen
You are given a highly imbalanced binary classification dataset (about 1% positives) in a fraud-detection setting.
1) Describe strategies to address class imbalance at the data level (e.g., resampling), algorithm level (e.g., class weights, focal loss), and decision level (e.g., threshold tuning, cost-sensitive decisions).
2) Explain the bias–variance tradeoff and how you would diagnose and mitigate high bias versus high variance in this scenario.
3) Select and justify appropriate evaluation metrics (e.g., precision–recall AUC vs. ROC AUC, precision@k, recall@k, F1, calibration/Brier score) and discuss when each is preferable.
4) Compare Transformers and CNNs: their inductive biases, typical inputs, computational tradeoffs, and when you would choose one over the other for text, sequences, images, or tabular fraud features.
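A minimal sketch covering the data/algorithm/decision levels from part 1 (and the threshold tuning from part 3). The synthetic dataset, the choice of `LogisticRegression`, and the threshold grid are all illustrative assumptions, not part of the question:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Synthetic stand-in for the fraud setting: ~1% positives
X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.99, 0.01], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Algorithm level: class weights upweight errors on the rare class
# (data-level resampling such as random over/undersampling or SMOTE
# would instead rebalance X_tr/y_tr before fitting)
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_tr, y_tr)

# Decision level: tune the decision threshold on held-out probabilities
# instead of using the default 0.5
probs = clf.predict_proba(X_te)[:, 1]
thresholds = np.linspace(0.1, 0.9, 17)
best_t = max(thresholds,
             key=lambda t: f1_score(y_te, (probs >= t).astype(int)))
print(f"best threshold by F1: {best_t:.2f}")
```

In a true cost-sensitive setting, the threshold would be chosen to minimize expected dollar cost (missed-fraud loss vs. review cost) rather than to maximize F1.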
Quick Answer: This question evaluates understanding of class imbalance strategies at the data, algorithm, and decision levels; the bias–variance tradeoff and how to diagnose it; the selection and interpretation of evaluation metrics for highly imbalanced binary classification; and the architectural tradeoffs between Transformers and CNNs.
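To make the metric discussion in part 3 concrete, here is a hedged illustration on synthetic data (the dataset, model, and k are assumptions) of why PR AUC, precision@k, and calibration are reported alongside ROC AUC at ~1% prevalence:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import (roc_auc_score, average_precision_score,
                             brier_score_loss)

# Same ~1% positive rate as the fraud scenario in the question
X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.99, 0.01], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)
probs = (LogisticRegression(max_iter=1000)
         .fit(X_tr, y_tr)
         .predict_proba(X_te)[:, 1])

roc = roc_auc_score(y_te, probs)          # rank quality; insensitive to base rate
ap = average_precision_score(y_te, probs) # PR AUC; reflects the 1% prevalence
brier = brier_score_loss(y_te, probs)     # calibration of predicted probabilities
print(f"ROC AUC={roc:.3f}  PR AUC={ap:.3f}  Brier={brier:.4f}")

# precision@k: precision among the k highest-scoring cases, matching a
# fixed-capacity fraud review queue
k = 100
top_k = np.argsort(probs)[-k:]
print(f"precision@{k}: {y_te[top_k].mean():.3f}")
```

On data this imbalanced, ROC AUC tends to look optimistic relative to PR AUC, which is the usual argument for preferring PR-based metrics when positives are rare.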