This question evaluates understanding of bias–variance trade-offs, generalization, and model diagnostics for logistic regression on high‑dimensional, sparse, imbalanced datasets, with attention to regularization.
You are building a large‑scale binary classifier (e.g., click/conversion prediction for Google Display ads) with hundreds to thousands of mostly sparse, high‑cardinality features (one‑hot categoricals, text/IDs, and some numerics). The dataset is large and exhibits class imbalance.
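
To make the setup concrete, here is a minimal sketch of one way such a model could look: L2‑regularized logistic regression trained by SGD over hashed sparse features, with the positive class up‑weighted to counter imbalance. Everything in it is an illustrative assumption rather than part of the question: the synthetic `ad_id`/`site` fields, the click rates, the bucket count `N_BUCKETS`, the hyperparameters, and all helper names are hypothetical. The L2 penalty is applied only to the features active in each example, which is a common simplification for sparse data, not exact weight decay.

```python
import math
import random
import zlib

N_BUCKETS = 2 ** 18  # hypothetical size of the hashed feature space


def hash_features(raw, n_buckets=N_BUCKETS):
    """Hashing trick: map {field: value} pairs to sorted active bucket indices."""
    # zlib.crc32 is stable across runs, unlike the salted built-in hash() for str.
    return sorted({zlib.crc32(f"{k}={v}".encode("utf-8")) % n_buckets
                   for k, v in raw.items()})


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, idx):
    """Score one example given the indices of its active (value 1) features."""
    return sigmoid(b + sum(w[j] for j in idx))


def train(rows, labels, n_buckets=N_BUCKETS, epochs=5, lr=0.02, l2=1e-4,
          pos_weight=1.0, seed=0):
    """SGD on class-weighted log loss; L2 touches only the active weights."""
    w, b = [0.0] * n_buckets, 0.0
    rng = random.Random(seed)
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            idx, y = rows[i], labels[i]
            # Gradient of the weighted log loss with respect to the logit.
            g = (pos_weight if y == 1 else 1.0) * (predict(w, b, idx) - y)
            for j in idx:
                w[j] -= lr * (g + l2 * w[j])
            b -= lr * g
    return w, b


def make_data(n=5000, seed=0):
    """Synthetic imbalanced clicks: ads 0-4 click at 30%, all others at 2%."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        ad, site = rng.randrange(50), rng.randrange(20)
        rate = 0.30 if ad < 5 else 0.02
        rows.append(hash_features({"ad_id": ad, "site": site}))
        labels.append(1 if rng.random() < rate else 0)
    return rows, labels


rows, labels = make_data()
n_pos = sum(labels)
# Up-weight positives by the negative/positive ratio (one common heuristic).
w, b = train(rows, labels, pos_weight=(len(labels) - n_pos) / n_pos)


def mean_score(ad_ids, site=0):
    """Average predicted score over a set of ad ids on a fixed site."""
    scores = [predict(w, b, hash_features({"ad_id": a, "site": site}))
              for a in ad_ids]
    return sum(scores) / len(scores)


hot_mean, cold_mean = mean_score(range(5)), mean_score(range(5, 50))
print(f"positive rate={n_pos / len(labels):.3f} "
      f"hot={hot_mean:.3f} cold={cold_mean:.3f}")
```

Note that up‑weighting positives shifts the predicted scores upward, so they no longer estimate the true click rate; if calibrated probabilities are needed, the scores would have to be corrected or recalibrated afterwards. The strength of `l2` is the main bias–variance knob in this sketch: larger values shrink the weights of rarely seen features more aggressively.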