This question evaluates a candidate's skills in image classification, data-quality assessment (label-noise detection, corrupted-sample filtering, class imbalance handling), robustness strategy comparison, and low-level neural network mechanics including vectorized forward/backward computation and numerical stability for NumPy-only implementations.
Context: You have a CIFAR-like dataset of 32×32 RGB images with 10–20 classes. You suspect 8–15% label noise, some corrupted images, and class imbalance. You must deliver both a data-quality plan and a minimal from-scratch learning core.
Build a baseline classifier and a practical plan to improve data quality and robustness. Clearly describe:
Assume you can train a small CNN/ResNet for the baseline. Keep a held-out test set untouched.
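As a rough illustration of the data-quality side, the sketch below shows one possible way to flag corrupted images, compute class weights for imbalance, and surface suspected label noise. All function names, thresholds, and the `margin` parameter are hypothetical choices for illustration, not part of the question. It assumes images are float arrays scaled to [0, 1] and that `probs` holds out-of-fold predicted probabilities from the baseline model, computed on training folds only so the held-out test set stays untouched.

```python
import numpy as np

def find_corrupted(images, min_std=1e-3):
    """Flag samples with non-finite values or near-constant pixels.

    images: float array of shape (N, H, W, C), assumed scaled to [0, 1].
    min_std is an illustrative threshold, not a recommended value.
    """
    flat = images.reshape(len(images), -1)
    finite_mask = np.isfinite(flat)
    non_finite = ~finite_mask.all(axis=1)
    # Replace non-finite entries so the std computation itself stays finite.
    safe = np.where(finite_mask, flat, 0.0)
    near_constant = safe.std(axis=1) < min_std
    return non_finite | near_constant

def inverse_frequency_weights(labels, num_classes):
    """Per-class loss weights proportional to 1 / class frequency."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    # Guard against empty classes to avoid division by zero.
    return counts.sum() / (num_classes * np.maximum(counts, 1.0))

def flag_suspect_labels(probs, labels, margin=0.5):
    """Flag samples where the model strongly prefers another class.

    probs: (N, C) out-of-fold predicted probabilities (an assumption).
    A sample is flagged when the top probability exceeds the probability
    of the given label by more than `margin`.
    """
    given = probs[np.arange(len(labels)), labels]
    top = probs.max(axis=1)
    return (top - given) > margin
```

Flagged samples are candidates for manual review or down-weighting rather than automatic deletion, since a confident model can also be confidently wrong on hard but correctly labeled examples.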
Implement forward and backward passes for a two-layer neural network (Linear → ReLU → Linear → Softmax Cross-Entropy) using only NumPy-like linear algebra:
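One possible shape for such an implementation is sketched below. The function name `forward_backward`, the parameter layout (`W1` of shape `(D, H)`, `W2` of shape `(H, C)`), and the mean-over-batch loss convention are assumptions made for illustration. Numerical stability comes from subtracting the row-wise maximum logit before exponentiating (the log-sum-exp trick), and every step is vectorized over the batch.

```python
import numpy as np

def forward_backward(X, y, W1, b1, W2, b2):
    """Linear -> ReLU -> Linear -> softmax cross-entropy, with gradients.

    X: (N, D) inputs, y: (N,) integer class labels.
    Returns the mean loss and a dict of gradients w.r.t. each parameter.
    """
    N = X.shape[0]
    rows = np.arange(N)

    # Forward pass.
    z1 = X @ W1 + b1                 # (N, H) pre-activations
    h = np.maximum(z1, 0.0)          # ReLU
    logits = h @ W2 + b2             # (N, C)

    # Stable log-softmax: shift by the row max so exp never overflows.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    loss = -log_probs[rows, y].mean()

    # Backward pass: d(loss)/d(logits) = (softmax - one_hot) / N.
    dlogits = np.exp(log_probs)
    dlogits[rows, y] -= 1.0
    dlogits /= N

    dW2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dh = dlogits @ W2.T
    dz1 = dh * (z1 > 0)              # ReLU passes gradient only where z1 > 0
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    return loss, {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}
```

A common way to validate the backward pass is a finite-difference gradient check: perturb a single parameter entry by a small epsilon in both directions and compare the resulting loss slope with the analytic gradient.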