Calculate survey response and quality metrics in SQL

Q: How do I practice SQL interview questions?

PracHub provides an interactive SQL console where you can write and test queries against real database schemas. Get instant feedback and compare your solution with the expected output.

Q: What difficulty level is this coding question?

This is a medium difficulty Data Manipulation (SQL/Python) question, commonly asked during Technical Screen rounds at Meta.

Q: What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Meta during technical interviews.

Question

Compute survey response-rate and quality metrics from event data. Assume "today" = 2025-09-01, and compute over the last 7 days (2025-08-26 to 2025-09-01, inclusive).

Schema (invented):
- impressions(impression_id BIGINT, user_id BIGINT, survey_id INT, ts TIMESTAMP)
- clicks(click_id BIGINT, impression_id BIGINT, ts TIMESTAMP)
- surveys(survey_id INT, version VARCHAR, active_from DATE)
- survey_responses(response_id BIGINT, user_id BIGINT, survey_id INT, score INT, ts TIMESTAMP)

Sample data (minimal):
impressions
| impression_id | user_id | survey_id | ts |
| 1 | 101 | 10 | 2025-08-30 10:00:00 |
| 2 | 102 | 10 | 2025-08-30 10:05:00 |
| 3 | 101 | 11 | 2025-08-31 12:00:00 |
| 4 | 103 | 10 | 2025-09-01 09:00:00 |
| 5 | 104 | 10 | 2025-09-01 09:05:00 |

clicks
| click_id | impression_id | ts |
| 1 | 1 | 2025-08-30 10:01:00 |
| 2 | 1 | 2025-08-30 10:02:00 |
| 3 | 3 | 2025-08-31 12:05:00 |
| 4 | 5 | 2025-09-01 09:06:00 |

surveys
| survey_id | version | active_from |
| 10 | v1 | 2025-08-01 |
| 11 | v2 | 2025-08-25 |

survey_responses
| response_id | user_id | survey_id | score | ts |
| 1 | 101 | 10 | 4 | 2025-08-30 10:03:00 |
| 2 | 101 | 10 | 5 | 2025-08-30 10:10:00 |
| 3 | 101 | 11 | 3 | 2025-08-31 12:06:00 |
| 4 | 104 | 10 | 2 | 2025-09-01 09:07:00 |

Tasks (write precise SQL; assume ANSI SQL syntax):
1) Daily survey response rate by survey version: response_rate = unique_clicked_impressions / total_impressions per calendar date in the 7-day window. Count at most 1 click per impression. Join to surveys to report by version.
2) Overall response rate for the full 7-day window (single number) and the same metric broken out by survey version.
3) Quality metrics by survey version for the same window, using the provided score column in two ways:
 a) First-score-only: for each user+survey_id, take their chronologically first score in the window; average across those user+survey pairs.
 b) All-scores: average across all scores in the window (i.e., multiple scores from the same user count multiple times).
4) Ensure that scores are associated with impressions/clicks from the same window. If a response has no corresponding impression within the window, exclude it. If multiple impressions exist for the same user+survey_id, treat responses as eligible if they occur at or after the first click for that user+survey_id on that day.
5) Return: date, survey_version, total_impressions, unique_clicked_impressions, response_rate, avg_score_first_only, avg_score_all.
State any assumptions you make explicitly (e.g., timezone, handling of missing joins).

PracHub · Accepted Answer

This question evaluates SQL proficiency in computing time-windowed metrics, including deduplication, joins between event tables, aggregation, timestamp alignment, and data-quality handling for survey response and score calculations.

Quick Overview

Quick Overview