This question evaluates a data scientist's ability to design an end-to-end machine learning system for detecting fake accounts, covering problem framing, feature engineering across signals and time windows, modeling choices (supervised, unsupervised, semi/weakly supervised, graph-based), evaluation metrics and A/B testing, monitoring for data and concept drift, and quantifying business impact. It is commonly asked to assess how candidates balance precision against recall and weigh the business cost of false positives versus false negatives. It falls under the Machine Learning category and mixes conceptual understanding with practical application.

You are a data scientist at a social‑commerce platform responsible for trust and safety. You need to design a system to detect and mitigate fake accounts (bots, spam, fraud, coordinated inauthentic behavior) while minimizing friction for real users.
Hints: Discuss feature engineering, supervised vs. unsupervised methods, precision–recall trade-offs, and A/B testing for business impact.
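A minimal sketch of the time-windowed feature engineering the hints point to, assuming a per-event log with user_id, timestamp, event_type, and ip columns; the column names, window lengths, and signal choices are illustrative assumptions, not a prescribed design.

```python
# Illustrative sketch only: aggregate per-account behavioral signals over
# short and long time windows. Schema and windows are assumptions.
import pandas as pd

def build_account_features(events: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Aggregate per-account signals over 1-day and 7-day lookback windows."""
    feats = []
    for window in ("1D", "7D"):
        cutoff = as_of - pd.Timedelta(window)
        recent = events[events["timestamp"] >= cutoff]
        agg = recent.groupby("user_id").agg(
            n_events=("event_type", "size"),                              # overall activity volume
            n_distinct_ips=("ip", "nunique"),                             # IP diversity signal
            n_messages=("event_type", lambda s: (s == "message").sum()),  # outbound messaging rate
        )
        feats.append(agg.add_suffix(f"_{window}"))
    return pd.concat(feats, axis=1).fillna(0)

# Toy usage example
events = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u2", "u2"],
    "timestamp": pd.to_datetime(
        ["2024-01-01", "2024-01-06", "2024-01-06", "2024-01-06", "2024-01-07"]
    ),
    "event_type": ["signup", "message", "message", "message", "message"],
    "ip": ["1.1.1.1", "1.1.1.1", "2.2.2.2", "3.3.3.3", "4.4.4.4"],
})
print(build_account_features(events, as_of=pd.Timestamp("2024-01-07")))
```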
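And a rough sketch of the precision–recall trade-off under asymmetric business costs: a supervised classifier is trained on synthetic data and the decision threshold is chosen to minimize a weighted cost of false positives (friction for real users) versus false negatives (missed fake accounts). The cost values and data are assumed purely for illustration.

```python
# Illustrative sketch only: cost-weighted threshold selection on classifier scores.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for labeled account data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.5, size=5000) > 2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# Candidate thresholds from the precision-recall curve.
_, _, thresholds = precision_recall_curve(y_te, scores)

# Assumed costs: blocking a real user (FP) costs 5 units; leaving a fake
# account active (FN) costs 1 unit.
COST_FP, COST_FN = 5.0, 1.0

def expected_cost(threshold: float) -> float:
    pred = scores >= threshold
    fp = int(np.sum(pred & (y_te == 0)))   # real users wrongly flagged
    fn = int(np.sum(~pred & (y_te == 1)))  # fake accounts missed
    return COST_FP * fp + COST_FN * fn

best_t = min(thresholds, key=expected_cost)
best_pred = scores >= best_t
precision_at_t = np.sum(best_pred & (y_te == 1)) / max(best_pred.sum(), 1)
recall_at_t = np.sum(best_pred & (y_te == 1)) / y_te.sum()
print(f"threshold={best_t:.3f} precision={precision_at_t:.3f} recall={recall_at_t:.3f}")
```

In an A/B test, the same cost framing can be carried forward by comparing realized false-positive friction and missed-abuse rates between the control and treatment policies rather than model metrics alone.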