How do I practice SQL interview questions?

PracHub provides an interactive SQL console where you can write and test queries against real database schemas. Get instant feedback and compare your solution with the expected output.

What difficulty level is this coding question?

This is a easy difficulty Data Manipulation (SQL/Python) question, commonly asked during Technical Screen rounds at LinkedIn.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at LinkedIn during technical interviews.

Write SQL for rankings, state, and aggregations

Company: LinkedIn

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: easy

Interview Round: Technical Screen

You are given several independent datasets. For each dataset, write the required SQL query (ANSI SQL is fine). Assume dates are in UTC and `DATE` columns are calendar dates (no time component) unless otherwise specified. --- ## Dataset A — Continent populations ### Table: `world_population` | column | type | notes | |---|---|---| | continent | STRING | e.g., `North America` | | country | STRING | country name | | population | BIGINT | population for that country | **Task A1.** For each `continent`, return the single country with the **largest** `population`. **Output:** `continent, country, population` **Task A1 follow-up.** For each `continent`, return the **top 2** countries by `population`. **Output:** `continent, country, population` **Task A2.** For each `continent`, compute the **population share** of each of the **top 2** countries, where: - `total_population` = sum of `population` within the continent - `population_share_pct` = `100 * population / total_population` **Output:** `continent, country, population, total_population, population_share_pct` --- ## Dataset B — Push notification status as of a cutoff date You have a baseline snapshot as of **2020-01-31** and an action log for February. ### Table: `status` "Contains all members’ latest push notification setting status as of 2020-01-31." | column | type | notes | |---|---|---| | member_id | BIGINT | primary key | | status | STRING | one of `on`, `off` | ### Table: `actions` "All actions members made in Feb 2020 (after 2020-01-31). Assume at most one action per member per day." | column | type | notes | |---|---|---| | member_id | BIGINT | may include new members not in `status` | | date_sk | DATE | action date in Feb 2020 | | action | STRING | one of `turn_on`, `turn_off` | **Task B1.** Return each member’s **current notification status as of 2020-02-29**. Requirements: - If a member has one or more rows in `actions`, their latest action by `date_sk` determines the current status. - Members who never acted in Feb retain their `status` from the snapshot. - The `actions` table may include **new members** not present in `status`; they must still appear in the output. **Output:** `member_id, current_status` **Task B1 follow-up.** If the `actions` table did **not** contain `date_sk` (only action events without timestamps, or only aggregated action counts), explain what additional assumptions/data you would need, and under a reasonable assumption, compute the final status. --- ## Dataset C — Video posts and members ### Table: `video_posts` | column | type | notes | |---|---|---| | post_date | DATE | date the video was posted | | memberid | BIGINT | member who posted | | video_length | INT | length in seconds | ### Table: `members` | column | type | notes | |---|---|---| | memberid | BIGINT | primary key | | country | STRING | e.g., `usa`, `uk` | | join_date | DATE | date the member joined | **Task C1.** How many members posted their **first-ever** video on the **same day** they joined? **Output:** single row, `members_first_video_same_day` **Task C2.** For each `country`, how many **distinct members** have posted at least one video longer than **60 seconds**? **Output:** `country, num_members` --- ## Dataset D — Article views and types ### Table: `article_views` | column | type | notes | |---|---|---| | user_id | BIGINT | viewer | | article_id | BIGINT | viewed article | | date | DATE | view date | ### Table: `articles` | column | type | notes | |---|---|---| | article_id | BIGINT | primary key | | article_type | STRING | e.g., `sports`, `tech` | **Task D1.** For date **2019-01-01**, compute the number of **distinct article types** each user viewed. **Output:** `user_id, num_article_types` **Task D2.** Build a histogram of `num_article_types` across users (i.e., how many users viewed 1 type, 2 types, etc.). **Output:** `num_article_types, num_users`

Quick Answer: This question evaluates a candidate's competency in SQL data manipulation, covering ranking/top-N queries, aggregations and percentage calculations, temporal state reconstruction from action logs, and detection of first-occurrence events.

Quick Overview

This question evaluates a candidate's competency in SQL data manipulation, covering ranking/top-N queries, aggregations and percentage calculations, temporal state reconstruction from action logs, and detection of first-occurrence events.