This question evaluates predictive modeling and applied data science skills for click-through rate (CTR) prediction. It covers extreme class imbalance, delayed feedback, encoding of sparse, high-cardinality features, time-aware validation, evaluation and calibration of probabilistic scores, and online A/B validation. It falls in the Machine Learning domain and tests both conceptual understanding and practical application. It is commonly asked because it probes reasoning about real-world production challenges (metric selection between ROC and PR curves, thresholding under business costs, calibration methods, drift detection, and avoiding feedback loops) without requiring specific implementation details.
You are building a model to predict the probability that an ad impression results in a click within 24 hours. The base positive rate is approximately 0.7%.
Available features:
Labels are delayed: some clicks arrive up to 24 hours after the impression.
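One consequence of the 24-hour label delay is that an impression with no click yet is not necessarily a negative. The sketch below shows one possible way to handle this: label only impressions whose 24-hour window has fully closed, then split train and validation by impression time rather than at random. The function names (`build_labels`, `time_split`), the tuple layout, and the sample data are hypothetical and chosen for illustration; only the 24-hour window comes from the problem statement.

```python
from datetime import datetime, timedelta

# From the problem statement: a click counts if it arrives within 24 hours.
ATTRIBUTION_WINDOW = timedelta(hours=24)


def build_labels(impressions, clicks, as_of):
    """Label only impressions whose click window has closed by `as_of`.

    impressions: list of (impression_id, impression_time)  [hypothetical layout]
    clicks: dict mapping impression_id -> click_time       [hypothetical layout]
    Returns a list of (impression_id, impression_time, label).
    """
    rows = []
    for imp_id, imp_time in impressions:
        window_end = imp_time + ATTRIBUTION_WINDOW
        if window_end > as_of:
            # Window still open: a missing click is not yet a true negative.
            continue
        click_time = clicks.get(imp_id)
        clicked = click_time is not None and imp_time <= click_time <= window_end
        rows.append((imp_id, imp_time, int(clicked)))
    return rows


def time_split(rows, cutoff):
    """Time-aware split: train on impressions before `cutoff`, validate on the rest."""
    train = [r for r in rows if r[1] < cutoff]
    valid = [r for r in rows if r[1] >= cutoff]
    return train, valid


# Illustrative data only, not taken from the question.
now = datetime(2024, 1, 3, 12, 0)
impressions = [
    ("a", datetime(2024, 1, 1, 9, 0)),
    ("b", datetime(2024, 1, 2, 10, 0)),
    ("c", datetime(2024, 1, 3, 8, 0)),  # shown 4 hours ago, window still open
]
clicks = {"a": datetime(2024, 1, 1, 20, 0)}

rows = build_labels(impressions, clicks, as_of=now)
train, valid = time_split(rows, cutoff=datetime(2024, 1, 2, 0, 0))
```

In this toy run, impression `c` is excluded because its window is still open, `a` is a positive, and `b` is a matured negative. Dropping immature impressions trades data freshness for label correctness; other approaches to delayed feedback exist and may suit a production system better.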