PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates proficiency in event-level data aggregation, sessionization, windowing, and stateful stream processing across SQL and Python, including computing aggregated metrics such as total watch duration and peak screen coverage.

  • Medium
  • Meta
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compute Effective Reads with SQL and Python Streaming

Company: Meta

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Onsite

post_read_events +----------+---------+-----------+---------+--------------------+---------------------+---------------------+ | event_id | user_id | session_id| post_id | watch_duration_sec | screen_coverage_pct | event_timestamp | +----------+---------+-----------+---------+--------------------+---------------------+---------------------+ | 1 | 101 | s1 | 501 | 2.5 | 75 | 2023-06-01 10:00:01 | | 2 | 101 | s1 | 501 | 3.0 | 80 | 2023-06-01 10:00:05 | | 3 | 102 | s2 | 502 | 6.0 | 65 | 2023-06-01 10:01:09 | | 4 | 101 | s3 | 503 | 1.5 | 90 | 2023-06-01 11:15:21 | | 5 | 102 | s2 | 502 | 4.0 | 70 | 2023-06-01 10:02:15 | +----------+---------+-----------+---------+--------------------+---------------------+---------------------+ ##### Scenario Event-level log of post impressions for a newsfeed; need to compute effective reads in batch (SQL) and in near-real-time (Python streaming). ##### Question Write a SQL query that returns every post_id whose total watch_duration_sec across all views exceeds X seconds AND whose maximum screen_coverage_pct across those views exceeds Y. Using the same data consumed as a stream, write Python code that groups events into sessions (end a session if no event arrives for ≥30 s), then emits, for each session, the posts that satisfy the same effective-read criteria along with their total watch time and max screen coverage. ##### Hints Window functions or GROUP BY for SQL; for Python, think generators, stateful dictionaries, and a session timeout watermark.

Quick Answer: This question evaluates proficiency in event-level data aggregation, sessionization, windowing, and stateful stream processing across SQL and Python, including computing aggregated metrics such as total watch duration and peak screen coverage.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Compute ad impression conversion rates - Meta (medium)
  • Count unconnected posts and reactions - Meta (medium)
  • Count heavy callers in 7 days - Meta (medium)
  • Write SQL for call metrics - Meta (medium)
  • Write SQL for multi-account metrics - Meta (medium)