Given a labeled dataset for binary classification, implement an end-to-end Python solution to train and analyze a classifier. Tasks:
- perform EDA (missingness, outliers, leakage checks, target/feature drift over time),
- create time-aware, stratified train/validation/test splits with proper cross-validation,
- build a strong baseline and at least one improved model,
- handle class imbalance (cost-sensitive loss, resampling, thresholds),
- tune hyperparameters without leakage,
- compute and compare metrics (ROC-AUC, PR-AUC, F1, calibration/Brier, confusion matrix at chosen threshold),
- conduct error analysis by slice and feature,
- produce a reproducible training script with CLI, config, and seed control,
- explain feature importance/SHAP and validate with ablations, and
- document risks, fairness checks, and monitoring hooks for production.

Provide code snippets and explain your design choices.
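
As one possible starting point for the reproducible-script and time-aware split tasks, the skeleton below is a minimal sketch using only the Python standard library. The `Config` fields, the `time_split` helper, the `--config`/`--seed` flags, and the `timestamp` key are hypothetical names chosen for illustration; they are not prescribed by the task. It does not cover stratification or cross-validation.

```python
import argparse
import json
import random
from dataclasses import dataclass, asdict


@dataclass
class Config:
    """Hypothetical run configuration; field names are illustrative."""
    seed: int = 42
    val_frac: float = 0.15
    test_frac: float = 0.15


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Training script skeleton")
    parser.add_argument("--config", help="optional JSON file overriding defaults")
    parser.add_argument("--seed", type=int, help="overrides the seed in the config")
    return parser.parse_args(argv)


def load_config(args):
    """Start from defaults, apply the JSON file, then apply CLI overrides."""
    cfg = Config()
    if args.config:
        with open(args.config) as fh:
            for key, value in json.load(fh).items():
                if not hasattr(cfg, key):
                    raise KeyError(f"unknown config key: {key}")
                setattr(cfg, key, value)
    if args.seed is not None:
        cfg.seed = args.seed
    return cfg


def time_split(rows, cfg, time_key="timestamp"):
    """Chronological split: oldest rows go to train, newest rows to test."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    n = len(ordered)
    n_test = int(n * cfg.test_frac)
    n_val = int(n * cfg.val_frac)
    train_end = n - n_val - n_test
    return ordered[:train_end], ordered[train_end:n - n_test], ordered[n - n_test:]


def main(argv=None):
    cfg = load_config(parse_args(argv))
    # Seed the stdlib RNG; any numeric or modeling library in use
    # would need to be seeded separately.
    random.seed(cfg.seed)
    # Log the resolved configuration so the run can be reproduced.
    print(json.dumps(asdict(cfg)))
    return cfg


if __name__ == "__main__":
    main()
```

Splitting by time rather than at random keeps every validation and test row later than every training row, which is one way to avoid leaking future information into training. Logging the resolved configuration alongside the seed makes it possible to rerun the same experiment later.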