Compute and Rank by Jaccard Similarity
Company: Yelp
Role: Data Scientist
Category: Coding & Algorithms
Difficulty: Medium
Interview Round: Onsite
Quick Answer: The prompt evaluates set-based similarity metrics (Jaccard), robust tokenization and normalization for text data including Unicode and punctuation edge cases, streaming/iterator processing for very long inputs, and algorithmic efficiency for top-k retrieval and stable ranking.