PracHub

Quick Overview

This question evaluates skills in CSV parsing, string normalization, field-level validation, and rule-based text matching, emphasizing attention to edge cases such as trimming whitespace, substring checks, and word-overlap logic.

  • medium
  • Stripe
  • Coding & Algorithms
  • Software Engineer

Validate KYC CSV Records

Company: Stripe

Role: Software Engineer

Category: Coding & Algorithms

Difficulty: medium

Interview Round: Technical Screen

Implement a function that validates business-verification records from a CSV-formatted string.

  • The first line is a header with exactly 6 comma-separated columns: `col1,col2,col3,col4,col5,col6`.
  • Each remaining line is a data row with exactly 6 comma-separated fields.
  • You may assume fields do not contain embedded commas or quoted newlines.

For each data row, return `VERIFIED` if all of the following rules hold; otherwise return `NOT VERIFIED`.

  1. **No empty fields**
     • All 6 fields must be present.
     • After trimming leading and trailing whitespace, every field must be non-empty.
  2. **Column 5 length constraint**
     • After trimming whitespace, the length of `col5` must be between `5` and `31`, inclusive.
  3. **Forbidden terms in `col2`**
     • Case-insensitively, `col2` must not contain any of these terms as substrings: `company`, `firm`, `co.`, `corporation`, `group`.
  4. **Word overlap between `col2` and `col4` or `col5`**
     • Convert `col2`, `col4`, and `col5` to lowercase and split each one by whitespace into words.
     • Ignore the words `llc` and `inc` in all three columns.
     • Let `W2` be the remaining words from `col2`.
     • The row passes this rule if at least 50% of the words in `W2` also appear in the words from `col4`, or at least 50% of the words in `W2` also appear in the words from `col5`.
     • Matching is exact string equality after lowercasing.
     • You may assume `W2` is non-empty after removing ignored words.

Return the verification result for each data row in the original order.
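The forbidden-terms rule for `col2` is a substring test rather than a whole-word test, which is easy to overlook. A minimal sketch of that single check is below; the helper name `has_forbidden_term` is hypothetical, and the sample business names other than `North GROUP Holdings` are invented for illustration.

```python
# Terms taken from the problem statement; matching is case-insensitive.
FORBIDDEN_TERMS = ("company", "firm", "co.", "corporation", "group")


def has_forbidden_term(col2: str) -> bool:
    """Return True if any forbidden term occurs anywhere inside col2."""
    lowered = col2.lower()
    return any(term in lowered for term in FORBIDDEN_TERMS)


print(has_forbidden_term("North GROUP Holdings"))  # True: contains "group"
print(has_forbidden_term("Confirmed Logistics"))   # True: "firm" inside "confirmed"
print(has_forbidden_term("Acme Co."))              # True: contains "co."
print(has_forbidden_term("Acme Co"))               # False: no trailing dot
print(has_forbidden_term("Blue Ocean LLC"))        # False
```

Read literally, the rule rejects a name such as `Confirmed Logistics` because `firm` appears inside another word. Whether that is intended is worth confirming with the interviewer.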


Implement a function that validates business-verification records from a CSV-formatted string. The input always starts with the header `col1,col2,col3,col4,col5,col6`, followed by zero or more data rows. Return one result for each data row in order: `VERIFIED` if the row satisfies every rule below, otherwise `NOT VERIFIED`.

Validation rules for each data row:

  1. The row must contain exactly 6 comma-separated fields.
  2. After trimming leading and trailing whitespace, all 6 fields must be non-empty.
  3. After trimming whitespace, the length of `col5` must be between 5 and 31 inclusive.
  4. Case-insensitively, `col2` must not contain any of these forbidden substrings: `company`, `firm`, `co.`, `corporation`, `group`.
  5. Let `W2` be the words in `col2` after lowercasing, splitting by whitespace, and removing the words `llc` and `inc`. Do the same for `col4` and `col5`.
     • The row passes this rule if at least 50% of the words in `W2` also appear in the words from `col4`, or at least 50% of the words in `W2` also appear in the words from `col5`.
     • Word matching is exact string equality after lowercasing.
     • You may assume `W2` is non-empty for valid test data.
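The structural rules (field count, non-empty after trimming, and the `col5` length bounds) can be checked before any text matching. The following is a rough sketch of only those checks; `passes_field_checks` is a hypothetical helper name, and the sample rows are taken from the examples in this question.

```python
def passes_field_checks(line: str) -> bool:
    """Check field count, non-empty-after-trim, and col5 length only."""
    # Fields never contain embedded commas, so a plain split is enough.
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 6:
        return False
    if any(field == "" for field in fields):
        return False
    # col5 is the fifth field (index 4); both bounds are inclusive.
    return 5 <= len(fields[4]) <= 31


print(passes_field_checks("1,Alpha Beta,x,Alpha Beta,abcde,z"))  # True: length 5
print(passes_field_checks("2,Alpha Beta,x,Alpha Beta, ,z"))      # False: blank col5
print(passes_field_checks("3,Alpha Beta,x,Alpha,abcd,z"))        # False: length 4
print(passes_field_checks("2,Fresh Market,x,Fresh,12345"))       # False: 5 fields
```

Trimming before measuring matters: the length bound applies to the trimmed value, so surrounding spaces in `col5` do not count toward the 5 to 31 range.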

Constraints

  • 1 <= total length of `csv_data` <= 10^6
  • The first line is the header `col1,col2,col3,col4,col5,col6`
  • Fields do not contain embedded commas or quoted newlines
  • For overlap checking, split words only by whitespace
  • `W2` can be assumed non-empty after removing `llc` and `inc`

Examples

Input: ("col1,col2,col3,col4,col5,col6",)

Expected Output: []

Explanation: There are no data rows, so the result is an empty list.

Input: ("col1,col2,col3,col4,col5,col6\n1,Blue Ocean LLC,x,Ocean Blue,Blue LLC,z\n2,North GROUP Holdings,x,North Holdings,North Holdings,z\n3,Sunrise Inc Bakery,x,Random Name, Sunrise Bakery ,z\n4,Red Apple Bakery Cafe,x,Apple Cafe,ValidFive,z\n5,Green Field Market,x,Green Shop,Field,z",)

Expected Output: ["VERIFIED", "NOT VERIFIED", "VERIFIED", "VERIFIED", "NOT VERIFIED"]

Explanation: Row 1 matches fully with col4. Row 2 contains the forbidden term 'group'. Row 3 matches fully with col5 after trimming and ignoring 'inc'. Row 4 has exactly 50% overlap with col4 (2 of 4 words). Row 5 has only 1 of 3 words overlapping with either col4 or col5.

Input: ("col1,col2,col3,col4,col5,col6\n1,Alpha Beta,x,Alpha Beta,abcde,z\n2,Alpha Beta,x,Alpha Beta, ,z\n3,Alpha Beta,x,Alpha,abcd,z",)

Expected Output: ["VERIFIED", "NOT VERIFIED", "NOT VERIFIED"]

Explanation: Row 1 is valid and uses the lower bound length 5 for col5. Row 2 fails because col5 is empty after trimming. Row 3 fails because col5 has length 4.

Input: ("col1,col2,col3,col4,col5,col6\n1,North Star LLC,x,North Star,1234567890123456789012345678901,z\n2,Fresh Market,x,Fresh,12345",)

Expected Output: ["VERIFIED", "NOT VERIFIED"]

Explanation: Row 1 is valid and uses the upper bound length 31 for col5. Row 2 has only 5 fields instead of 6, so it is not verified.
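The four examples above can be reproduced end to end with a short sketch like the one below. It is one possible approach, not a reference solution: the function name `validate_kyc_csv` and the helpers are hypothetical (only the parameter name `csv_data` comes from the Constraints), rows are separated with `str.splitlines()`, and an empty `W2` is treated as a failure even though the statement says this case need not be handled.

```python
FORBIDDEN_TERMS = ("company", "firm", "co.", "corporation", "group")
IGNORED_WORDS = {"llc", "inc"}


def _words(field: str) -> list[str]:
    """Lowercase, split on whitespace, and drop the ignored words."""
    return [w for w in field.lower().split() if w not in IGNORED_WORDS]


def _row_is_verified(line: str) -> bool:
    fields = [field.strip() for field in line.split(",")]
    # Field count and non-empty checks.
    if len(fields) != 6 or any(not field for field in fields):
        return False
    col2, col4, col5 = fields[1], fields[3], fields[4]
    # col5 length bounds are inclusive.
    if not 5 <= len(col5) <= 31:
        return False
    # Forbidden terms are matched as case-insensitive substrings.
    lowered = col2.lower()
    if any(term in lowered for term in FORBIDDEN_TERMS):
        return False
    w2 = _words(col2)
    if not w2:
        return False  # Assumption: the statement says W2 is never empty.
    # Pass if col4 OR col5 covers at least half of W2.
    for other in (col4, col5):
        other_words = set(_words(other))
        matches = sum(1 for w in w2 if w in other_words)
        if 2 * matches >= len(w2):
            return True
    return False


def validate_kyc_csv(csv_data: str) -> list[str]:
    rows = csv_data.splitlines()[1:]  # skip the header line
    return ["VERIFIED" if _row_is_verified(r) else "NOT VERIFIED" for r in rows]


example = (
    "col1,col2,col3,col4,col5,col6\n"
    "1,Blue Ocean LLC,x,Ocean Blue,Blue LLC,z\n"
    "2,North GROUP Holdings,x,North Holdings,North Holdings,z\n"
    "3,Sunrise Inc Bakery,x,Random Name, Sunrise Bakery ,z\n"
    "4,Red Apple Bakery Cafe,x,Apple Cafe,ValidFive,z\n"
    "5,Green Field Market,x,Green Shop,Field,z"
)
print(validate_kyc_csv(example))
# ['VERIFIED', 'NOT VERIFIED', 'VERIFIED', 'VERIFIED', 'NOT VERIFIED']
```

Each row is scanned a constant number of times, so the work is linear in the length of `csv_data`, which fits the 10^6 input bound.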

Hints

  1. Process one row at a time: split by commas, trim each field, and fail fast as soon as any rule is broken.
  2. For the 50% overlap rule, keep `col2` as a list of words, but convert the words from `col4` and `col5` to sets for fast membership checks. Use `2 * matches >= len(W2)` to test the threshold without floating-point math.
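The second hint can be sketched in isolation as follows. The names `words` and `meets_half` are hypothetical. Keeping `W2` as a list means a repeated word counts once per occurrence, which is one reading of the hint; the statement itself does not say how duplicates should be treated.

```python
IGNORED_WORDS = {"llc", "inc"}


def words(field: str) -> list[str]:
    """Lowercase, split on whitespace, and drop the ignored words."""
    return [w for w in field.lower().split() if w not in IGNORED_WORDS]


def meets_half(col2: str, other: str) -> bool:
    """True if at least half of the col2 words appear in the other column."""
    w2 = words(col2)                 # list: duplicates are kept
    other_words = set(words(other))  # set: constant-time membership checks
    matches = sum(1 for w in w2 if w in other_words)
    # Integer form of matches / len(w2) >= 0.5; no division is needed.
    # Assumes w2 is non-empty, as the problem statement allows.
    return 2 * matches >= len(w2)


print(meets_half("Red Apple Bakery Cafe", "Apple Cafe"))  # True: 2 of 4
print(meets_half("Green Field Market", "Green Shop"))     # False: 1 of 3
print(meets_half("Blue Ocean Ocean", "Blue"))             # False: 1 of 3 as a list
```

In the last call, deduplicating `W2` into a set would give 1 of 2 and pass, so the choice between list and set changes the result when `col2` repeats a word.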
Last updated: Jun 2, 2026


Related Coding Questions

  • Assign Reviewers from Changed Files - Stripe (medium)
  • Generate Account Email Notifications - Stripe (medium)
  • Calculate Transaction Fees - Stripe (medium)
  • Build an Account Transfer Ledger - Stripe (medium)
  • Implement Validation and String Compression - Stripe (hard)