PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches
|Home/Coding & Algorithms/Shopify

Identify Pirate Themes Using Similarity Score Algorithm

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of similarity scoring, set-based comparison of structured records, and practical data deduplication techniques commonly used in data science.

  • Medium
  • Shopify
  • Coding & Algorithms
  • Data Scientist

Identify Pirate Themes Using Similarity Score Algorithm

Company: Shopify

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

##### Scenario Engineering wants an automated way to spot custom themes that are probably just pirate themes in disguise. ##### Question Write Python that takes two lists (A and B) and returns their similarity score defined as len(intersection) / len(union). Given pirate_themes (list of dicts) and custom_themes (list of dicts), identify which custom themes are likely pirates using the similarity score and explain your threshold choice. ##### Hints Implement a Jaccard similarity; iterate over dictionaries by a chosen key set; threshold of 0.5 is typical.

Quick Answer: This question evaluates understanding of similarity scoring, set-based comparison of structured records, and practical data deduplication techniques commonly used in data science.

Related Interview Questions

  • Compute Jaccard Similarity for Lists - Shopify (medium)
  • Implement URL Shortening Codec - Shopify (medium)
  • Simulate a rover fleet - Shopify (medium)
  • Simulate robot moves on a grid - Shopify (medium)
  • Implement a Capacity-Bounded Cache - Shopify (medium)
Shopify logo
Shopify
Aug 4, 2025, 10:55 AM
Data Scientist
Technical Screen
Coding & Algorithms
64
0
Scenario

Engineering wants an automated way to spot custom themes that are probably just pirate themes in disguise.

Question

Write Python that takes two lists (A and B) and returns their similarity score defined as len(intersection) / len(union). Given pirate_themes (list of dicts) and custom_themes (list of dicts), identify which custom themes are likely pirates using the similarity score and explain your threshold choice.

Hints

Implement a Jaccard similarity; iterate over dictionaries by a chosen key set; threshold of 0.5 is typical.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Coding & Algorithms•More Shopify•More Data Scientist•Shopify Data Scientist•Shopify Coding & Algorithms•Data Scientist Coding & Algorithms
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.