This question evaluates understanding of feature preprocessing (z‑score normalization), linear logistic regression fitting and coefficient-based feature ranking, along with practical considerations such as handling zero-variance features, intercept treatment, and regularization; it is commonly asked to test applied skills in producing comparable feature scales and interpretable model weights. Category/domain: Machine Learning; abstraction level: applied, implementation-focused data science task appropriate for Data Scientist interviews.
You are given a binary classification training dataset:
X
: a 2D array of shape (n_samples, n_features) containing numeric features.
feature_names
: a list of length n_features with the feature names.
y
: a 1D binary array (0/1) of length n_samples.
Task:
X
using z-score standardization based on the training set:
(where and are the mean and standard deviation of feature over the training data.)
Clarify any modeling choices needed to make this well-defined (e.g., intercept, regularization, handling zero-variance features).