PracHub

Identify Pirate Themes Using Similarity Score Algorithm

Last updated: Mar 29, 2026

Quick Overview

This question evaluates understanding of similarity scoring, set-based comparison of structured records, and practical data deduplication techniques commonly used in data science.

Company: Shopify

Role: Data Scientist

Category: Coding & Algorithms

Difficulty: Medium

Interview Round: Technical Screen

Aug 4, 2025, 10:55 AM
Scenario

Engineering wants an automated way to spot custom themes that are probably just pirate themes in disguise.

Question

Write a Python function that takes two lists, A and B, and returns their similarity score, defined as len(intersection) / len(union). Then, given pirate_themes (a list of dicts) and custom_themes (a list of dicts), use the similarity score to identify which custom themes are likely pirates, and explain your threshold choice.
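
The first part of the question could be sketched as follows. This is a minimal sketch, not a reference solution: the function name `jaccard_similarity` is an assumption, duplicates within a list are ignored because each list is converted to a set, and two empty lists are scored 0.0 by convention to avoid dividing by zero.

```python
def jaccard_similarity(a, b):
    """Return len(intersection) / len(union) of two lists, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty lists score 0.0 instead of raising ZeroDivisionError.
        return 0.0
    return len(set_a & set_b) / len(union)


# Two of the four distinct items are shared, so the score is 2 / 4.
score = jaccard_similarity(["a", "b", "c"], ["b", "c", "d"])  # 0.5
```

The list elements must be hashable for the set conversion to work.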

Hints

Implement Jaccard similarity; compare the dictionaries over a chosen set of keys; a threshold of 0.5 is a common starting point.
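
One way to apply these hints is sketched below. The question does not specify the structure of the theme dicts, so everything about the data here is hypothetical: the keys `id`, `font`, `layout`, and `colors`, the helper names `theme_features` and `find_likely_pirates`, and the choice to turn each theme into a set of (key, value) pairs are all assumptions made for illustration.

```python
def jaccard_similarity(a, b):
    """Return len(intersection) / len(union), or 0.0 when both inputs are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def theme_features(theme, keys):
    """Flatten the chosen keys of a theme dict into a set of (key, value) pairs."""
    features = set()
    for key in keys:
        if key not in theme:
            continue
        value = theme[key]
        # List-like values contribute one feature per element.
        items = value if isinstance(value, (list, tuple, set)) else [value]
        features.update((key, item) for item in items)
    return features


def find_likely_pirates(pirate_themes, custom_themes, keys, threshold=0.5):
    """Return (custom_id, pirate_id, score) for each custom theme whose
    best-matching pirate theme scores at or above the threshold."""
    flagged = []
    for custom in custom_themes:
        custom_features = theme_features(custom, keys)
        best_score, best_pirate = 0.0, None
        for pirate in pirate_themes:
            score = jaccard_similarity(custom_features, theme_features(pirate, keys))
            if score > best_score:
                best_score, best_pirate = score, pirate
        if best_pirate is not None and best_score >= threshold:
            flagged.append((custom.get("id"), best_pirate.get("id"), best_score))
    return flagged


# Hypothetical sample data; real theme records will have different fields.
pirate_themes = [
    {"id": "p1", "font": "Arial", "layout": "grid", "colors": ["red", "black", "white"]},
]
custom_themes = [
    {"id": "c1", "font": "Arial", "layout": "grid", "colors": ["red", "black", "gold"]},
    {"id": "c2", "font": "Georgia", "layout": "list", "colors": ["blue", "white"]},
]

flagged = find_likely_pirates(pirate_themes, custom_themes, keys=["font", "layout", "colors"])
# c1 shares 4 of 6 distinct features with p1 (score 4/6) and is flagged;
# c2 shares 1 of 8 (score 0.125) and is not.
```

On the threshold: a score of 0.5 means at least half of the combined features are shared. For two feature sets of equal size, that corresponds to each theme sharing two thirds of its features with the other. Lowering the threshold catches more heavily disguised copies at the cost of more false positives, so if labeled examples of known pirates are available, the cutoff is better tuned against them than fixed in advance.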
