PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches

Quick Overview

This question evaluates SQL-based data manipulation skills focused on event-time analytics, time-windowed aggregations, temporal joins, handling missing or multiple event rows, and computing deliverability metrics; it is categorized under Data Manipulation (SQL/Python) for a Data Scientist role.

  • Medium
  • Microsoft
  • Data Manipulation (SQL/Python)
  • Data Scientist

Query email logs for deliverability insights

Company: Microsoft

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Using the schema and sample data below, write SQL to answer parts a–c. Assume timestamps are UTC. Today is 2025-09-01. Schema: - Emails(sent_id INT, message_id TEXT, provider TEXT, from_addr TEXT, to_domain TEXT, sent_ts TIMESTAMP, subject TEXT) - Deliveries(message_id TEXT, status TEXT, event_ts TIMESTAMP, detail TEXT) Sample data: Emails | sent_id | message_id | provider | from_addr | to_domain | sent_ts | subject | | 1 | m1 | Gmail | candidate@gmail.com | ms.com | 2025-08-30 09:00 | avail slots | | 2 | m2 | Gmail | candidate@gmail.com | ms.com | 2025-08-31 10:00 | follow-up | | 3 | m3 | Outlook | candidate@outlook.com | ms.com | 2025-08-31 10:05 | follow-up alt | | 4 | m4 | Gmail | candidate@gmail.com | ms.com | 2025-09-01 08:00 | confirmation | | 5 | m5 | Outlook | candidate@outlook.com | ms.com | 2025-08-26 20:00 | initial reach | | 6 | m6 | Gmail | candidate@gmail.com | otherco.com| 2025-08-27 09:00 | sanity check | Deliveries | message_id | status | event_ts | detail | | m1 | bounced | 2025-08-30 09:01 | 550 5.7.1 | | m1 | delivered | 2025-08-30 09:20 | | | m3 | delivered | 2025-08-31 10:06 | | | m4 | spam_folder | 2025-09-01 08:00 | filtered | | m5 | delivered | 2025-08-26 20:01 | | | m6 | delivered | 2025-08-27 09:01 | | Note: message_id m2 has no rows in Deliveries (treat as undelivered). Tasks: (a) For emails sent in the last 7 days relative to 2025-09-01 (window 2025-08-26 through 2025-09-01 inclusive), compute by provider and to_domain the proportion delivered within 5 minutes of sent_ts. Rules: use the earliest event with status = 'delivered'; ignore 'spam_folder'; treat messages with no delivery rows as undelivered; if a message bounces then later delivers, count it as delivered only if the delivered event occurs within 5 minutes of sent_ts. (b) Return all message_ids where the first observed event was a bounce and a later delivered event occurred more than 5 minutes after sent_ts; include the lag in minutes between sent_ts and the first delivered event. (c) For domain ms.com over the same window, determine the earliest calendar date on which switching from Gmail to Outlook would have strictly improved deliverability, based on same-day cumulative delivered-within-5-minutes rates up to each send. Output that date and show the cumulative rates used to justify the switch.

Quick Answer: This question evaluates SQL-based data manipulation skills focused on event-time analytics, time-windowed aggregations, temporal joins, handling missing or multiple event rows, and computing deliverability metrics; it is categorized under Data Manipulation (SQL/Python) for a Data Scientist role.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Query departments and top earners - Microsoft (easy)
  • Find common friends from directed edges - Microsoft (Medium)
  • Compute most popular location with weights - Microsoft (Medium)
  • Count words in a document robustly - Microsoft (Medium)