PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches
|Home/Data Manipulation (SQL/Python)/Meta

Define and compute shop visibility in SQL

Last updated: Mar 29, 2026

Quick Overview

This question evaluates the ability to define and compute shop-level visibility KPIs in SQL, testing skills in metric design (visibility_rate and impression_share), deduplication, session/window filtering, and data-quality concerns such as late-arriving events and bot filtering.

  • Medium
  • Meta
  • Data Manipulation (SQL/Python)
  • Data Scientist

Define and compute shop visibility in SQL

Company: Meta

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Onsite

You own the 'shop visibility' KPI for a marketplace. Define a precise metric and write SQL to compute it over the last 7 days (use today = 2025-09-01, so the window is 2025-08-26 to 2025-09-01 inclusive). Provide two related definitions: (A) visibility_rate = unique US users who saw ≥1 active product from a shop ÷ active US users; (B) impression_share = impressions of a shop's active products ÷ total impressions of all active products. Exclude deactivated products, deduplicate multiple impressions of the same product by the same user within a session, and restrict the denominator to 'active users' (users with ≥1 session in the window). Return: shop_id, visibility_rate, impression_share, unique_viewers, active_users, window_start, window_end. Then: (1) show how you'd parameterize this per-surface (feed vs shoppage), (2) handle late-arriving impressions and bot traffic, and (3) outline indexes or clustering to make the query efficient at 1B+ rows. Schema and small ASCII samples you must use: users(user_id INT, country STRING) +---------+---------+ | user_id | country | +---------+---------+ | 1 | US | | 2 | US | | 3 | CA | | 4 | US | +---------+---------+ shops(shop_id INT, name STRING) +---------+----------+ | shop_id | name | +---------+----------+ | 10 | Alpha | | 20 | Beta | +---------+----------+ products(product_id INT, shop_id INT, is_active BOOLEAN, created_at DATE) +------------+---------+-----------+------------+ | product_id | shop_id | is_active | created_at | +------------+---------+-----------+------------+ | 100 | 10 | true | 2025-08-01 | | 101 | 10 | true | 2025-08-20 | | 102 | 20 | true | 2025-07-15 | | 103 | 20 | false | 2025-06-10 | +------------+---------+-----------+------------+ impressions(impression_id BIGINT, ts TIMESTAMP, user_id INT, product_id INT, surface STRING) +---------------+---------------------+---------+------------+----------+ | impression_id | ts | user_id | product_id | surface | +---------------+---------------------+---------+------------+----------+ | 1 | 2025-08-26 10:00:00 | 1 | 100 | feed | | 2 | 2025-08-27 11:00:00 | 2 | 100 | feed | | 3 | 2025-08-27 12:00:00 | 2 | 101 | shoppage | | 4 | 2025-08-28 09:00:00 | 4 | 102 | feed | | 5 | 2025-09-01 13:00:00 | 1 | 102 | feed | +---------------+---------------------+---------+------------+----------+ sessions(session_id BIGINT, user_id INT, session_start TIMESTAMP) +------------+---------+---------------------+ | session_id | user_id | session_start | +------------+---------+---------------------+ | 500 | 1 | 2025-08-28 08:00:00 | | 501 | 2 | 2025-08-29 09:00:00 | | 502 | 3 | 2025-08-30 10:00:00 | | 503 | 4 | 2025-08-31 12:00:00 | +------------+---------+---------------------+ Assume a SQL dialect with CTEs and DATE/TIMESTAMP functions. Write the full query for (A) and (B) and briefly explain how you'd adapt it for incremental daily jobs and backfills.

Quick Answer: This question evaluates the ability to define and compute shop-level visibility KPIs in SQL, testing skills in metric design (visibility_rate and impression_share), deduplication, session/window filtering, and data-quality concerns such as late-arriving events and bot filtering.

Related Interview Questions

  • Compute ad impression conversion rates - Meta (medium)
  • Count unconnected posts and reactions - Meta (medium)
  • Count heavy callers in 7 days - Meta (medium)
  • Write SQL for call metrics - Meta (medium)
  • Write SQL for multi-account metrics - Meta (medium)
Meta logo
Meta
Oct 13, 2025, 9:49 PM
Data Scientist
Onsite
Data Manipulation (SQL/Python)
4
0

You own the 'shop visibility' KPI for a marketplace. Define a precise metric and write SQL to compute it over the last 7 days (use today = 2025-09-01, so the window is 2025-08-26 to 2025-09-01 inclusive). Provide two related definitions: (A) visibility_rate = unique US users who saw ≥1 active product from a shop ÷ active US users; (B) impression_share = impressions of a shop's active products ÷ total impressions of all active products. Exclude deactivated products, deduplicate multiple impressions of the same product by the same user within a session, and restrict the denominator to 'active users' (users with ≥1 session in the window). Return: shop_id, visibility_rate, impression_share, unique_viewers, active_users, window_start, window_end. Then: (1) show how you'd parameterize this per-surface (feed vs shoppage), (2) handle late-arriving impressions and bot traffic, and (3) outline indexes or clustering to make the query efficient at 1B+ rows.

Schema and small ASCII samples you must use: users(user_id INT, country STRING) +---------+---------+ | user_id | country | +---------+---------+ | 1 | US | | 2 | US | | 3 | CA | | 4 | US | +---------+---------+

shops(shop_id INT, name STRING) +---------+----------+ | shop_id | name | +---------+----------+ | 10 | Alpha | | 20 | Beta | +---------+----------+

products(product_id INT, shop_id INT, is_active BOOLEAN, created_at DATE) +------------+---------+-----------+------------+ | product_id | shop_id | is_active | created_at | +------------+---------+-----------+------------+ | 100 | 10 | true | 2025-08-01 | | 101 | 10 | true | 2025-08-20 | | 102 | 20 | true | 2025-07-15 | | 103 | 20 | false | 2025-06-10 | +------------+---------+-----------+------------+

impressions(impression_id BIGINT, ts TIMESTAMP, user_id INT, product_id INT, surface STRING) +---------------+---------------------+---------+------------+----------+ | impression_id | ts | user_id | product_id | surface | +---------------+---------------------+---------+------------+----------+ | 1 | 2025-08-26 10:00:00 | 1 | 100 | feed | | 2 | 2025-08-27 11:00:00 | 2 | 100 | feed | | 3 | 2025-08-27 12:00:00 | 2 | 101 | shoppage | | 4 | 2025-08-28 09:00:00 | 4 | 102 | feed | | 5 | 2025-09-01 13:00:00 | 1 | 102 | feed | +---------------+---------------------+---------+------------+----------+

sessions(session_id BIGINT, user_id INT, session_start TIMESTAMP) +------------+---------+---------------------+ | session_id | user_id | session_start | +------------+---------+---------------------+ | 500 | 1 | 2025-08-28 08:00:00 | | 501 | 2 | 2025-08-29 09:00:00 | | 502 | 3 | 2025-08-30 10:00:00 | | 503 | 4 | 2025-08-31 12:00:00 | +------------+---------+---------------------+

Assume a SQL dialect with CTEs and DATE/TIMESTAMP functions. Write the full query for (A) and (B) and briefly explain how you'd adapt it for incremental daily jobs and backfills.

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More Meta•More Data Scientist•Meta Data Scientist•Meta Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.