PracHub
QuestionsPremiumCoachesLearningGuidesInterview Prep

Quick Overview

This question evaluates a candidate's ability to perform data manipulation and analytics with SQL and Python to detect license compliance, construct month-level time windows, and quantify revenue impact from pirated theme usage.

  • medium
  • Shopify
  • Data Manipulation (SQL/Python)
  • Data Scientist

Analyze Pirated Theme Usage Impact

Company: Shopify

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: medium

Interview Round: Technical Screen

An e-commerce platform lets shops install marketplace themes. Some themes are paid, and a shop should have a valid purchase/license before using a paid theme. Some shops may be using pirated copies of paid themes. Write SQL to analyze pirated theme usage, monthly trends, and revenue impact. Assume all dates and timestamps are in UTC. Analyze the window from `2023-01-01` through `2023-12-31`, inclusive. Monthly fields use the first day of the month. ### Tables #### `shops` | Column | Type | Description | |---|---:|---| | `shop_id` | BIGINT | Primary key for a shop. | | `created_at` | TIMESTAMP | Shop creation timestamp. | | `country` | VARCHAR | Shop country. | | `plan` | VARCHAR | Subscription plan, such as `basic`, `standard`, or `plus`. | | `vertical` | VARCHAR | Merchant business vertical. | #### `themes` | Column | Type | Description | |---|---:|---| | `theme_id` | BIGINT | Primary key for a theme. | | `theme_name` | VARCHAR | Theme name. | | `developer_id` | BIGINT | Theme developer identifier. | | `is_paid` | BOOLEAN | Whether the theme requires a paid license. | | `list_price_usd` | NUMERIC(10,2) | Official theme price in USD. | | `launched_at` | TIMESTAMP | Theme launch timestamp. | #### `shop_theme_usage` | Column | Type | Description | |---|---:|---| | `shop_id` | BIGINT | Foreign key to `shops.shop_id`. | | `theme_id` | BIGINT | Foreign key to `themes.theme_id`. | | `start_month` | DATE | First month the shop used the theme. Always the first day of a month. | | `end_month` | DATE | Last month the shop used the theme. Always the first day of a month. `NULL` means still active. | A shop-theme usage row is considered active in month `m` if `m >= start_month` and `m <= COALESCE(end_month, DATE '9999-12-01')`. #### `theme_purchases` | Column | Type | Description | |---|---:|---| | `order_id` | BIGINT | Primary key for a theme purchase. | | `shop_id` | BIGINT | Foreign key to `shops.shop_id`. | | `theme_id` | BIGINT | Foreign key to `themes.theme_id`. | | `purchased_at` | TIMESTAMP | Purchase timestamp. | | `amount_usd` | NUMERIC(10,2) | Amount paid in USD. | | `refunded_at` | TIMESTAMP | Refund timestamp. `NULL` means not refunded. | A shop has a valid license for a paid theme in month `m` if it purchased that theme before the first day of the next month and the purchase was not refunded before the first day of the next month. #### `shop_monthly_revenue` | Column | Type | Description | |---|---:|---| | `shop_id` | BIGINT | Foreign key to `shops.shop_id`. | | `month` | DATE | Revenue month, first day of month. | | `gross_merchandise_volume_usd` | NUMERIC(18,2) | Shop GMV in that month. | | `platform_revenue_usd` | NUMERIC(18,2) | Revenue earned by the platform from that shop in that month. | ### Definition of pirated usage A pirated usage is a shop-theme-month where: 1. The theme is paid: `themes.is_paid = TRUE`. 2. The shop is actively using the theme in that month. 3. The shop does not have a valid license for that theme in that month. ### Tasks 1. Create a monthly usage trend table with one row per month and the following columns: - `month` - `total_active_paid_theme_usages` - `pirated_active_theme_usages` - `pirating_shops` - `piracy_rate`, defined as `pirated_active_theme_usages / total_active_paid_theme_usages` 2. Create a monthly revenue impact table with one row per month and the following columns: - `month` - `potential_theme_revenue_lost_usd`, defined as the sum of `themes.list_price_usd` across pirated shop-theme-month rows - `platform_revenue_from_pirating_shops_usd`, counting each pirating shop's monthly platform revenue only once per month even if it uses multiple pirated themes - `avg_platform_revenue_per_pirating_shop_usd` - `pct_platform_revenue_from_pirating_shops`, defined as revenue from pirating shops divided by total platform revenue that month 3. Create a prioritization table to inform product strategy. Return the top 10 combinations of `theme_id`, `theme_name`, `plan`, and `vertical` by `potential_theme_revenue_lost_usd`, with these columns: - `theme_id` - `theme_name` - `plan` - `vertical` - `pirated_active_theme_usages` - `pirating_shops` - `potential_theme_revenue_lost_usd` - `platform_revenue_from_pirating_shops_usd` Be careful to avoid incorrect joins that duplicate revenue when a shop has multiple theme usage rows or multiple purchase rows.

Quick Answer: This question evaluates a candidate's ability to perform data manipulation and analytics with SQL and Python to detect license compliance, construct month-level time windows, and quantify revenue impact from pirated theme usage.

Last updated: May 7, 2026

Loading coding console...

PracHub

Master your tech interviews with 8,000+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Analyze Pirated Theme Usage - Shopify (medium)
  • Compute pirated-theme usage and revenue loss - Shopify (easy)
  • Calculate Pirated Usage and Revenue Loss - Shopify (hard)
  • Justify and harden your analytics and BI stack - Shopify (Medium)
  • Determine Growth of Pirated Theme Installations Over Years - Shopify (Medium)