PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches

Quick Overview

This question evaluates a candidate's ability to perform time-based user labeling and aggregation in SQL, testing skills such as date arithmetic, joins, deduplication of user-day records, and efficient generation of per-user-per-day labels.

  • Medium
  • Meta
  • Data Manipulation (SQL/Python)
  • Data Scientist

Label new vs old users over time in SQL

Company: Meta

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Define users as “new” during the first 30 days inclusive after their signup_date, and “old” thereafter. Produce per-user, per-day labels over a window, acknowledging that a single user naturally has different labels on different days. Use today = 2025-09-01. Schema: users(user_id INT, signup_date DATE) activity(user_id INT, activity_date DATE) -- any daily activity counts as presence Sample data: users +---------+-------------+ | user_id | signup_date | +---------+-------------+ | 1 | 2025-07-25 | | 2 | 2025-08-20 | | 3 | 2025-08-01 | | 4 | 2025-09-01 | +---------+-------------+ activity +---------+---------------+ | user_id | activity_date | +---------+---------------+ | 1 | 2025-08-02 | | 1 | 2025-08-26 | | 2 | 2025-08-25 | | 2 | 2025-09-01 | | 3 | 2025-08-15 | | 3 | 2025-09-01 | | 4 | 2025-09-01 | +---------+---------------+ Tasks: A) Write SQL that outputs (user_id, d, label) for every day d in [2025-08-02, 2025-09-01], labeling “new” if d between signup_date and signup_date + 29 days inclusive, else “old”. Avoid assigning multiple labels for the same user-day. Prefer generating dates only where activity exists (join to activity) to reduce scan cost. B) Using your labels, compute two aggregates for the last 30 days relative to today (window = [2025-08-03, 2025-09-01]): (i) active_new_users and active_old_users per day; (ii) a single summary row with total distinct users by label over the window. Explain any assumptions about time zones and inclusive/exclusive bounds.

Quick Answer: This question evaluates a candidate's ability to perform time-based user labeling and aggregation in SQL, testing skills such as date arithmetic, joins, deduplication of user-day records, and efficient generation of per-user-per-day labels.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Compute ad impression conversion rates - Meta (medium)
  • Count unconnected posts and reactions - Meta (medium)
  • Count heavy callers in 7 days - Meta (medium)
  • Write SQL for call metrics - Meta (medium)
  • Write SQL for multi-account metrics - Meta (medium)