Context
You are building an ML system to rank/promote shop ads in an e-commerce feed/search page. At serving time, the system may score candidate shop ads for a given user and context.
Assume you have access to:
- User events (impressions, clicks, purchases, shop follows)
- Shop metadata (category, price bands, inventory signals)
- Query/context (search query, time, device)
- Ad/auction signals (bid, budget pacing)
Questions
- If you were to build the shop-ads ranking model, what feature families would you use? (Give examples.)
- You have “a ton” of candidate features. How would you identify which ones are useful?
  - Include at least one offline approach and one online/production-safe approach.
- If you were not allowed to use a model-based importance method (e.g., no SHAP/GBDT gain/permutation importance), how would you still find the key useful features?
- Call out common pitfalls: leakage, feedback loops, cold start, and feature drift.
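
As a starting point for the model-free and drift questions, here is a minimal sketch of two checks that need no trained model: a per-feature AUC computed from ranks (the Mann-Whitney U statistic) to screen a single feature against a binary label such as click, and a Population Stability Index (PSI) to compare a feature's distribution between two time windows. The function names `rank_auc` and `psi`, the bin count, and the synthetic data are assumptions made for illustration; they are not part of the prompt, and a real answer would also need leakage-safe time splits and position-bias handling.

```python
import numpy as np

def rank_auc(feature, label):
    """Univariate AUC of one raw feature vs. a binary label (Mann-Whitney U)."""
    feature = np.asarray(feature, dtype=float)
    label = np.asarray(label, dtype=bool)
    # Average 1-based ranks, so tied values share the same rank.
    _, inverse, counts = np.unique(feature, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    n_pos = int(label.sum())
    n_neg = label.size - n_pos
    u_stat = ranks[label].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u_stat / (n_pos * n_neg))

def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index of `current` against `reference`.

    Assumes a continuous feature with at least two distinct quantile edges.
    """
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Quantile bins from the reference window; drop duplicate edges.
    edges = np.unique(np.quantile(reference, np.linspace(0.0, 1.0, bins + 1)))
    ref_counts = np.histogram(reference, bins=edges)[0]
    # Clip so out-of-range values fall into the first or last bin.
    cur_counts = np.histogram(np.clip(current, edges[0], edges[-1]), bins=edges)[0]
    ref_frac = np.clip(ref_counts / ref_counts.sum(), eps, None)
    cur_frac = np.clip(cur_counts / cur_counts.sum(), eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

# Synthetic demo (hypothetical data, not real ad logs).
rng = np.random.default_rng(0)
clicked = rng.integers(0, 2, size=5000)
informative = clicked + rng.normal(0.0, 1.0, size=5000)  # correlated with clicks
noise = rng.normal(0.0, 1.0, size=5000)                  # unrelated to clicks

print("informative AUC:", rank_auc(informative, clicked))
print("noise AUC:", rank_auc(noise, clicked))
print("PSI after a mean shift:", psi(noise, noise + 1.0))
```

A feature whose univariate AUC sits near 0.5 is a candidate for removal, though it may still matter through interactions, so a screen like this narrows the list before an offline ablation or a gated online experiment. A rising PSI on a feature flags drift worth investigating before trusting its offline lift.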