Design a lead scoring model for marketing
Company: IBM
Role: Data Scientist
Category: Machine Learning
Difficulty: Easy
Interview Round: Technical Screen
You are on a marketing data science team building a **lead scoring** system.
Context:
- Each “lead” (user or account) arrives through marketing channels (ads, email, organic).
- Sales has limited capacity, so the score will be used to **prioritize outreach**.
Design an end-to-end approach:
1) Define the prediction target (label) and the prediction time (when the score is computed).
2) Propose feature sets and data sources. Call out potential leakage.
3) Choose a modeling approach (baseline + more advanced) and explain tradeoffs.
4) Define evaluation metrics aligned to business constraints (top-K, lift, calibration, revenue).
5) Describe how you would pick an operating threshold / routing policy.
6) Discuss key risks (class imbalance, selection bias because sales touches are not random, fairness, and drift) and how you’d monitor and iterate.
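To make steps 1 and 2 concrete, here is a minimal sketch of how a training row might be built "as of" a prediction time. It assumes a hypothetical schema (`lead_id`, `created_at`, `channel`, event `ts` and `type`), a score computed one day after lead creation, and a 30-day conversion window. None of these choices come from the question; they are illustrative defaults. The point is that features only see events before the prediction time and the label only looks forward from it, which is one way to guard against leakage.

```python
from datetime import datetime, timedelta


def build_example(lead, events, conversions,
                  delay=timedelta(days=1), window=timedelta(days=30)):
    """Build one (prediction_time, features, label) row for a lead.

    All field names and the delay/window defaults are assumptions
    made for illustration, not part of the original question.
    """
    prediction_time = lead["created_at"] + delay

    # Features: only events observed strictly before the prediction time.
    # Anything later (e.g. a sales call) would leak the outcome.
    past = [
        e for e in events
        if e["lead_id"] == lead["lead_id"] and e["ts"] < prediction_time
    ]
    features = {
        "channel": lead["channel"],
        "n_events": len(past),
        "n_pricing_views": sum(e["type"] == "pricing_view" for e in past),
    }

    # Label: did the lead convert within the window after the prediction time?
    label = int(any(
        c["lead_id"] == lead["lead_id"]
        and prediction_time < c["ts"] <= prediction_time + window
        for c in conversions
    ))
    return prediction_time, features, label
```

In practice the delay and window would be chosen with sales and marketing, since they determine both how early the score is available and how long labels take to mature.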
Quick Answer: This question tests whether a candidate can design an end-to-end machine learning lead scoring system: defining the prediction target and prediction time, engineering features while avoiding leakage, selecting a model, evaluating it against business goals, and handling operational concerns such as selection bias, fairness, and concept drift.
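Because sales capacity is limited, evaluation at a fixed budget of K leads is often more informative than a global metric. The sketch below computes precision, recall, and lift at K from model scores and binary labels. The function name `top_k_metrics` and the returned keys are hypothetical, and ties in scores are broken by input order, which a real evaluation might want to handle more carefully.

```python
def top_k_metrics(scores, labels, k):
    """Precision, recall, and lift among the k highest-scored leads.

    scores: model scores, higher means more likely to convert.
    labels: 0/1 outcomes aligned with scores.
    k: number of leads sales can contact (an assumed capacity).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of leads")

    # Rank leads by score, highest first (stable sort keeps input order on ties).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:k])

    total_positives = sum(labels)
    base_rate = total_positives / len(labels)
    precision = hits / k

    return {
        "precision_at_k": precision,
        # Share of all converters captured within the top k.
        "recall_at_k": hits / total_positives if total_positives else 0.0,
        # How much better than contacting k leads at random.
        "lift_at_k": precision / base_rate if base_rate else float("nan"),
    }
```

A lift well above 1 at the K that matches sales capacity suggests the ranking is useful for prioritization. It does not say the scores are calibrated, so calibration would need a separate check.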