You have extreme class imbalance (positive rate ~1%). You score 12 examples as follows (id, true_label, score):

A,1,0.92; B,0,0.90; C,0,0.88; D,0,0.70; E,1,0.62; F,0,0.58; G,0,0.55; H,0,0.54; I,1,0.53; J,0,0.50; K,0,0.20; L,0,0.10.

Tasks:
1) Compute precision, recall, and F1 at thresholds of 0.90, 0.60, and 0.50.
2) Which threshold maximizes F1 here, and why might business costs still argue for a different threshold?
3) Explain when PR curves are more informative than ROC curves and what AUPRC vs AUROC would indicate in this setting.
4) If you must deliver exactly top-k alerts (k=2), compute precision@k and recall@k and discuss how calibration affects thresholding.
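The arithmetic behind tasks 1, 3, and 4 can be checked with a short script. This is a minimal sketch, not a reference solution. It assumes an example is predicted positive when score >= threshold, which the prompt does not state; the ties at 0.90 (B) and 0.50 (J) mean a strict > convention would give different counts. All function names (`confusion_at`, `metrics_at`, `top_k_metrics`, `auroc`, `average_precision`) are hypothetical helpers written for this illustration, and average precision is used here as one common estimate of AUPRC.

```python
# Scored examples from the prompt: (id, true_label, score).
DATA = [
    ("A", 1, 0.92), ("B", 0, 0.90), ("C", 0, 0.88), ("D", 0, 0.70),
    ("E", 1, 0.62), ("F", 0, 0.58), ("G", 0, 0.55), ("H", 0, 0.54),
    ("I", 1, 0.53), ("J", 0, 0.50), ("K", 0, 0.20), ("L", 0, 0.10),
]


def confusion_at(threshold, data=DATA):
    """Count TP, FP, FN. Assumption: positive means score >= threshold."""
    tp = sum(1 for _, y, s in data if s >= threshold and y == 1)
    fp = sum(1 for _, y, s in data if s >= threshold and y == 0)
    fn = sum(1 for _, y, s in data if s < threshold and y == 1)
    return tp, fp, fn


def metrics_at(threshold, data=DATA):
    """Return (precision, recall, F1) at the given threshold."""
    tp, fp, fn = confusion_at(threshold, data)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 written directly in terms of counts: 2TP / (2TP + FP + FN).
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return precision, recall, f1


def top_k_metrics(k, data=DATA):
    """Return (precision@k, recall@k) for the k highest-scored examples."""
    ranked = sorted(data, key=lambda row: row[2], reverse=True)
    tp = sum(row[1] for row in ranked[:k])
    total_pos = sum(row[1] for row in data)
    return tp / k, tp / total_pos


def auroc(data=DATA):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for _, y, s in data if y == 1]
    neg = [s for _, y, s in data if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(data=DATA):
    """Mean of precision at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(data, key=lambda row: row[2], reverse=True)
    hits, total = 0, 0.0
    for rank, row in enumerate(ranked, start=1):
        if row[1] == 1:
            hits += 1
            total += hits / rank
    return total / hits


if __name__ == "__main__":
    for t in (0.90, 0.60, 0.50):
        p, r, f1 = metrics_at(t)
        print(f"threshold={t:.2f}  precision={p:.3f}  recall={r:.3f}  F1={f1:.3f}")
    p_at_k, r_at_k = top_k_metrics(2)
    print(f"precision@2={p_at_k:.3f}  recall@2={r_at_k:.3f}")
    print(f"AUROC={auroc():.3f}  AP={average_precision():.3f}")
```

Working from raw counts keeps the threshold convention explicit in one place (`confusion_at`), so switching to a strict > comparison is a one-line change if that is what the exercise intends.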