PracHub
QuestionsPremiumLearningGuidesCheatsheetNEW

Quick Overview

This question evaluates SQL data manipulation and time-series analytics skills, focusing on interval overlap logic, aggregation of distinct entities, and cumulative revenue calculations.

  • easy
  • Shopify
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compute pirated-theme usage and revenue loss

Company: Shopify

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: easy

Interview Round: Technical Screen

You work on a theme marketplace. Some shops install **pirated themes** instead of paying for official themes. Assume all timestamps are in **UTC**. ## Tables ### `shops` - `shop_id` BIGINT PRIMARY KEY - `created_at` TIMESTAMP ### `theme_installs` Represents a shop having a theme active for a time range. - `shop_id` BIGINT (FK → `shops.shop_id`) - `theme_id` BIGINT - `valid_from` TIMESTAMP - `valid_to` TIMESTAMP NULL — NULL means the install is still active ### `pirated_themes` List of themes known to be pirated. - `theme_id` BIGINT PRIMARY KEY ### `theme_prices` Monthly price for an official theme. - `theme_id` BIGINT PRIMARY KEY - `price_usd_per_month` NUMERIC(10,2) ### `calendar_months` A helper table with one row per month. - `month_start` DATE PRIMARY KEY — e.g., 2024-01-01 ## Q1 — Is pirated usage increasing? For each calendar month, compute the percent of shops that have **at least one active install of a pirated theme at any time during that month**. - Define “active during a month” as the install interval overlapping the month: `valid_from < next_month_start AND coalesce(valid_to, <as_of_timestamp>) >= month_start` - Use `COALESCE(valid_to, <as_of_timestamp>)` to handle NULL `valid_to`. ### Output - `month_start` - `pirated_shops` (count of distinct `shop_id` with pirated theme active during the month) - `total_shops` (count of distinct `shop_id` with any theme active during the month) - `pirated_shop_pct` (`pirated_shops / total_shops`) ## Q2 — Estimate revenue loss “over time” (cumulative) Assume each pirated theme would have generated revenue equal to its official theme’s monthly price in `theme_prices` while it is active. Compute: 1) `monthly_revenue_loss_usd`: for each month, sum `price_usd_per_month` across distinct `(shop_id, theme_id)` that are pirated and active during the month. 2) `cumulative_revenue_loss_usd`: the rolling cumulative sum up through each month (i.e., “as of January”, “as of February”, …). ### Output - `month_start` - `monthly_revenue_loss_usd` - `cumulative_revenue_loss_usd` Notes: - You may use `calendar_months` and a join to expand installs into months. - Clearly handle NULL `valid_to` using `COALESCE`.

Quick Answer: This question evaluates SQL data manipulation and time-series analytics skills, focusing on interval overlap logic, aggregation of distinct entities, and cumulative revenue calculations.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Analyze Pirated Theme Usage Impact - Shopify (medium)
  • Calculate Pirated Usage and Revenue Loss - Shopify (hard)
  • Justify and harden your analytics and BI stack - Shopify (Medium)
  • Determine Growth of Pirated Theme Installations Over Years - Shopify (Medium)
  • Analyze Pirate Theme Usage Growth Over Time - Shopify (Medium)