Detect fraud events and extract PII

Q: Detect fraud events and extract PII

This question evaluates skills in parsing nested event data, extracting and deduplicating PII fields, and performing key-based correlation to detect related fraud events.

Question

You are given a list of event objects (dictionaries/JSON). Each event has:

event_type: either "underwriting" or "fraud_flag"
customer_details: a nested object that may contain PII fields such as address, email, phone, and ssn (and may contain non-PII fields like credit_score)
other event-specific fields (e.g., loan_amount)

Example input:

[
  {
    "customer_details": {
      "address": "8941 Curry St",
      "credit_score": 505,
      "email": "cash.olson@yahoo.com",
      "phone": "309-144-1261",
      "ssn": "329340719"
    },
    "event_type": "underwriting",
    "loan_amount": 256
  },
  {
    "customer_details": {
      "address": "2143 Scarborough Ave",
      "credit_score": 774,
      "email": "hinson.parrott@hotmail.com",
      "phone": "117-570-8961",
      "ssn": "634841077"
    },
    "event_type": "fraud_flag"
  }
]

Implement a function that processes the events and returns:

A set of PII values seen across all events. Treat these fields as PII: address, email, phone, ssn. (Ignore non-PII fields like credit_score.)
For each "underwriting" event, determine whether it should be labeled as fraudulent. Use this rule:
- An underwriting event is fraud if there exists at least one "fraud_flag" event for the same customer, where customer identity is determined by matching customer_details.ssn.
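
One way to read the fraud rule above is as a two-pass lookup: first gather the ssn of every "fraud_flag" event into a set, then test each "underwriting" event against that set. The sketch below is only an illustration under stated assumptions. The function name label_underwriting_events is hypothetical, the output is assumed to be a list of booleans aligned to the underwriting events in input order, and an underwriting event with no ssn is assumed to be not fraud because it cannot be matched.

```python
from typing import Any


def label_underwriting_events(events: list[dict[str, Any]]) -> list[bool]:
    """Return one boolean per underwriting event, in input order (assumed format)."""
    # Pass 1: collect the SSNs of all fraud_flag events.
    flagged_ssns = set()
    for event in events:
        if event.get("event_type") != "fraud_flag":
            continue
        ssn = (event.get("customer_details") or {}).get("ssn")
        if ssn is not None:  # a flag without an SSN cannot match anyone
            flagged_ssns.add(ssn)

    # Pass 2: an underwriting event is fraud if its SSN was flagged anywhere,
    # regardless of whether the flag appears before or after it in the list.
    labels = []
    for event in events:
        if event.get("event_type") != "underwriting":
            continue
        ssn = (event.get("customer_details") or {}).get("ssn")
        # Assumption: a missing SSN is treated as "not fraud".
        labels.append(ssn is not None and ssn in flagged_ssns)
    return labels
```

Because set membership is O(1) on average, both passes together are O(n) in the number of events, which avoids comparing every underwriting event against every fraud flag.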

Additional requirements/constraints:

Events are given as a list (can be assumed to fit in memory).
The number of events, n, can be large (e.g., up to 100k), so aim for an efficient solution.
Some events may be missing some PII keys; only add present PII values to the set.
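
The missing-key requirement can be handled with a single pass that only looks up the four PII fields and skips any that are absent. This is a minimal sketch, not a definitive implementation: the names PII_FIELDS and collect_pii are hypothetical, the set is assumed to hold bare values (rather than field/value pairs), and a key whose value is null is assumed to count as missing.

```python
from typing import Any

# Fields the problem statement treats as PII; everything else is ignored.
PII_FIELDS = ("address", "email", "phone", "ssn")


def collect_pii(events: list[dict[str, Any]]) -> set[Any]:
    """Return the set of PII values present across all events."""
    pii_values = set()
    for event in events:
        # Tolerate events with no customer_details at all.
        details = event.get("customer_details") or {}
        for field in PII_FIELDS:
            value = details.get(field)
            if value is not None:  # only add values that are actually present
                pii_values.add(value)
    return pii_values
```

Using a set deduplicates repeated values automatically, and the work per event is bounded by the four PII fields, so the whole pass is linear in the number of events.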

Clearly specify your output format (e.g., list of booleans aligned to underwriting events, or a list of underwriting events augmented with is_fraud).
