PracHub

Design a hybrid marketplace fraud system

Last updated: Mar 29, 2026

Quick Overview

This question evaluates competency in designing hybrid fraud detection systems, covering feature engineering across text, image, graph and telemetry signals, hybrid modeling approaches, cost-sensitive thresholding, and operational concerns such as online evaluation, monitoring, and red‑team exercises.

Company: Other

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Onsite

Build a fraud detection system for (a) fake marketplace listings and (b) fabricated education credentials. 1) List at least 10 features spanning unique identifiers (device ID, IP, payment instruments), text/image signals (low‑res, duplicate embeddings), behavioral patterns (burst account creation, connection request acceptance rates), and network features (triadic closure among 'alumni'). 2) Combine supervised learning (to catch known patterns) with anomaly detection (to flag novel attacks); specify algorithms and how to resolve class imbalance and noisy labels. 3) Choose an operating threshold using explicit costs of false negatives vs. false positives and show how that maps onto ROC/PR curves. 4) Propose a risk‑based step‑up authentication policy (e.g., <10% auto‑pass, 10–50% 2FA, >50% auto‑block) and an online evaluation plan that limits collateral damage to legitimate users. 5) Define concept‑drift monitoring, retraining cadence, and red‑team simulation to uncover adaptive fraud.
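
The example bands in item 4 can be written as a small routing function. This is a minimal sketch, assuming the model emits a calibrated fraud probability in [0, 1]. The name `route_by_risk`, the action labels, and the treatment of the exact boundaries (a score of exactly 0.10 or 0.50 goes to 2FA) are illustrative assumptions that the question leaves open.

```python
def route_by_risk(score: float, low: float = 0.10, high: float = 0.50) -> str:
    """Map a fraud probability to an action using the example bands.

    Hypothetical helper: `low` and `high` default to the 10% and 50%
    cut-offs quoted in the question; the action names are placeholders.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if score < low:
        return "auto_pass"      # below 10%: no friction
    if score <= high:
        return "step_up_2fa"    # 10% to 50% inclusive: ask for a second factor
    return "auto_block"         # above 50%: block or quarantine


if __name__ == "__main__":
    for s in (0.03, 0.10, 0.35, 0.50, 0.72):
        print(s, route_by_risk(s))
```

Keeping the cut-offs as parameters makes it straightforward to tune them from the cost analysis in item 3 rather than hard-coding the illustrative 10% and 50% values.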

Oct 13, 2025, 9:49 PM

Design a Fraud Detection System for a Marketplace and Profile Credentials

Context

You are a data scientist at a two‑sided marketplace where users can post listings (goods/services) and maintain profiles that may include education credentials. You must design detection for two abuse types:

  • (a) Fake marketplace listings
  • (b) Fabricated education credentials on user profiles

Assume you have event logs, listing content (text/images), profile data, device/network telemetry, payment metadata, and graph data of interactions (messages, transactions, connections).

Tasks

  1. Feature Engineering
    • Propose at least 10 concrete features spanning: unique identifiers (e.g., device ID, IP, payment instruments), text/image signals (e.g., low‑res, duplicate embeddings), behavioral patterns (e.g., burst account creation, connection request acceptance rates), and network features (e.g., triadic closure among claimed "alumni"). Include features tailored to both (a) listings and (b) credentials.
  2. Modeling Approach
    • Combine supervised learning (to capture known patterns) with anomaly detection (to flag novel attacks). Specify algorithms for tabular, text, image, and graph data. Explain how you will handle class imbalance and noisy/delayed labels.
  3. Thresholding with Costs
    • Choose an operating threshold using explicit costs of false negatives vs. false positives. Show how this choice maps onto ROC and PR curves, and provide a small numeric example.
  4. Risk‑Based Step‑Up and Online Evaluation
    • Propose a risk‑based step‑up policy (e.g., <10% auto‑pass, 10–50% 2FA, >50% auto‑block/quarantine). Design an online evaluation plan that limits collateral damage to legitimate users while gathering the labels you need.
  5. Monitoring, Retraining, and Red‑Team
    • Define concept‑drift monitoring (metrics and alerting), retraining cadence and promotion strategy, and a red‑team simulation program to uncover adaptive fraud.
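
For task 3, one way to arrive at the small numeric example is sketched below. It assumes calibrated probabilities and hypothetical costs of 10 per false positive and 200 per false negative at 1% fraud prevalence; none of these figures come from the question. Flagging an item is worthwhile when p · C_FN > (1 − p) · C_FP, which gives the threshold t* = C_FP / (C_FP + C_FN), about 0.048 here. On the ROC curve, the matching operating point is where the slope equals C_FP · (1 − π) / (C_FN · π), which is 4.95 for these numbers. The function names and the toy scores are illustrative.

```python
def bayes_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability above which flagging has lower expected cost than passing."""
    return cost_fp / (cost_fp + cost_fn)


def roc_slope_at_optimum(cost_fp: float, cost_fn: float, prevalence: float) -> float:
    """Slope of the iso-cost line that touches the ROC curve at the optimum."""
    return (cost_fp * (1.0 - prevalence)) / (cost_fn * prevalence)


def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Cost of flagging every item whose score is strictly above `threshold`."""
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    return fp * cost_fp + fn * cost_fn


def best_empirical_threshold(scores, labels, cost_fp, cost_fn):
    """Scan observed scores (plus 0.0) and return (threshold, cost) with minimum cost."""
    candidates = sorted(set(scores) | {0.0})
    best = min(candidates, key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))
    return best, total_cost(scores, labels, best, cost_fp, cost_fn)


if __name__ == "__main__":
    C_FP, C_FN, PREVALENCE = 10, 200, 0.01           # hypothetical costs and base rate
    toy_scores = [0.02, 0.03, 0.08, 0.20, 0.60, 0.90]  # made-up validation scores
    toy_labels = [0, 0, 1, 0, 1, 1]                    # 1 = fraud
    print(bayes_threshold(C_FP, C_FN))
    print(roc_slope_at_optimum(C_FP, C_FN, PREVALENCE))
    print(best_empirical_threshold(toy_scores, toy_labels, C_FP, C_FN))
```

The closed-form threshold only holds when scores are calibrated; the empirical scan is a fallback when they are not. On a PR curve the same total cost at a point with precision P and recall R is N_pos · [C_FN · (1 − R) + C_FP · R · (1 − P) / P], so the optimum can be located there as well.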
