Detect clickbait without labels, then supervise

This is a Machine Learning interview question from Other for Data Scientist roles. View the full question and solution on PracHub.

Question

Detecting Clickbait Ads Without Labeled Data

Context

You are asked to detect clickbait ad creatives when there is no labeled training data. You have impression/click logs, post-click signals (e.g., dwell time, bounce), and metadata (creative text/image, destination/publisher). The goal is to bootstrap weak labels from behavior, then train a supervised model that avoids over-blocking legitimate, high-quality novelty and is robust to adversarial gaming.

Tasks

Feature engineering: Propose features that capture short-term CTR and its decay among users who previously clicked a creative (e.g., 14-day within-viewer CTR delta), as well as post-click signals (dwell time on landing page, bounce).
Weak labeling via clustering: Cluster creatives in the CTR × decay space. Specify how you select the "likely clickbait" cluster while minimizing over-blocking of high-quality novelty.
Supervised model: Convert weak labels into a supervised model. Describe model features (e.g., text embeddings, historical user–ad interaction, publisher quality), training regimen, evaluation metrics (e.g., precision@k, uplift on long-term engagement), and guardrails.
Abuse prevention and monitoring: Outline how to prevent gaming (e.g., rotating creatives) and how to monitor for regression to the mean over time.
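
The first two tasks can be illustrated with a minimal sketch. It is not the site's solution, only one possible reading of the prompt under stated assumptions: per-creative aggregates are already computed from the logs, "decay" is taken to be the relative CTR drop among prior clickers re-exposed within 14 days, and the clustering step is a hand-rolled 2-means kept dependency-free (a library implementation would be the usual choice). The names `CreativeStats`, `two_means`, `weak_labels` and the thresholds `min_bounce` and `max_dwell_s` are hypothetical, and the sample numbers are invented for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class CreativeStats:  # hypothetical per-creative aggregate built from the logs
    creative_id: str
    first_ctr: float       # CTR on viewers' first exposure
    repeat_ctr: float      # CTR among prior clickers re-exposed within 14 days
    median_dwell_s: float  # median landing-page dwell time, in seconds
    bounce_rate: float     # share of clicks that bounce

    @property
    def decay(self) -> float:
        """Relative CTR drop among viewers who already clicked once (assumed definition)."""
        if self.first_ctr <= 0:
            return 0.0
        return (self.first_ctr - self.repeat_ctr) / self.first_ctr


def zscores(values):
    """Standardize one feature so CTR and decay are on comparable scales."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def two_means(points, iters=50):
    """Lloyd's algorithm with k=2, seeded at the lowest- and highest-scoring points."""
    centroids = [min(points, key=sum), max(points, key=sum)]
    assign = []
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        assign = [
            min((0, 1), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        updated = []
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            updated.append(tuple(mean(dim) for dim in zip(*members)) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign


def weak_labels(stats, min_bounce=0.7, max_dwell_s=10.0):
    """Flag creatives in the high-CTR, high-decay cluster that also show poor post-click signals."""
    points = list(zip(zscores([s.first_ctr for s in stats]),
                      zscores([s.decay for s in stats])))
    centroids, assign = two_means(points)
    suspect = max((0, 1), key=lambda c: sum(centroids[c]))  # cluster with highest CTR + decay
    return {
        s.creative_id
        for s, a in zip(stats, assign)
        # Post-click guardrail (assumed thresholds): spares high-quality novelty.
        if a == suspect and s.bounce_rate >= min_bounce and s.median_dwell_s <= max_dwell_s
    }


# Invented sample data: three ordinary ads, two bait-like ads, one novel but satisfying ad.
creatives = [
    CreativeStats("a", 0.020, 0.018, 45, 0.30),
    CreativeStats("b", 0.025, 0.022, 50, 0.35),
    CreativeStats("c", 0.018, 0.017, 40, 0.32),
    CreativeStats("bait1", 0.080, 0.010, 4, 0.90),
    CreativeStats("bait2", 0.090, 0.015, 5, 0.85),
    CreativeStats("novel", 0.085, 0.020, 60, 0.25),
]
labels = weak_labels(creatives)
print(sorted(labels))  # ['bait1', 'bait2']
```

In this toy data the "novel" creative lands in the same high-CTR, high-decay cluster as the bait-like ones, but its long dwell time and low bounce rate keep it out of the weak-label set. That is one way to limit over-blocking of high-quality novelty before the labels are handed to a supervised model.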
