PracHub
QuestionsPremiumLearningGuidesInterview PrepNEWCoaches
|Home/Data Manipulation (SQL/Python)/Meta

Define and analyze new-vs-existing activity

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's competency in cohort definition, time-series event aggregation, use of window functions, cohort-level weighting and bias-aware analytics using SQL and Python.

  • Medium
  • Meta
  • Data Manipulation (SQL/Python)
  • Data Scientist

Define and analyze new-vs-existing activity

Company: Meta

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Onsite

Ambiguous product question: Are existing users more active than new users over the last 28 days (ending today = 2025-09-01)? 1) Propose two reasonable, mutually exclusive definitions for existing vs new (e.g., by signup date or by prior activity), and two defensible definitions of active (e.g., DAU, sessions/week). Briefly state pros/cons and pick one pair to implement. 2) Using the schema and sample data below, write SQL that: a) labels users as new or existing; b) computes each cohort's 7-day rolling active rate and average daily events/user over the last 28 days; c) adjusts for partial observation windows for users who signed up within the window; d) produces a final table with date, cohort, active_users, total_users_observed, active_rate_7d, avg_events_per_user. Use window functions (e.g., partition by user, rolling windows) and avoid double-counting users across cohorts. 3) Extend your query to stratify by country and then produce a cohort-level weighted average controlling for country mix. 4) Briefly note two bias risks (e.g., survivorship, seasonality) and one SQL-side mitigation you implemented. Schema (you may add a small date calendar CTE if needed): users(user_id INT, signup_date DATE, country STRING) events(user_id INT, event_date DATE, event_type STRING) Sample rows: users user_id | signup_date | country 1 | 2025-08-15 | US 2 | 2025-06-10 | US 3 | 2025-08-30 | CA 4 | 2025-07-01 | IN 5 | 2025-08-20 | US events user_id | event_date | event_type 1 | 2025-08-29 | view 1 | 2025-09-01 | message 2 | 2025-08-25 | like 2 | 2025-08-31 | view 3 | 2025-09-01 | view 4 | 2025-08-28 | view 5 | 2025-08-31 | comment

Quick Answer: This question evaluates a candidate's competency in cohort definition, time-series event aggregation, use of window functions, cohort-level weighting and bias-aware analytics using SQL and Python.

Related Interview Questions

  • Compute ad impression conversion rates - Meta (medium)
  • Count unconnected posts and reactions - Meta (medium)
  • Count heavy callers in 7 days - Meta (medium)
  • Write SQL for call metrics - Meta (medium)
  • Write SQL for multi-account metrics - Meta (medium)
Meta logo
Meta
Oct 13, 2025, 9:49 PM
Data Scientist
Onsite
Data Manipulation (SQL/Python)
2
0

Ambiguous product question: Are existing users more active than new users over the last 28 days (ending today = 2025-09-01)? 1) Propose two reasonable, mutually exclusive definitions for existing vs new (e.g., by signup date or by prior activity), and two defensible definitions of active (e.g., DAU, sessions/week). Briefly state pros/cons and pick one pair to implement. 2) Using the schema and sample data below, write SQL that: a) labels users as new or existing; b) computes each cohort's 7-day rolling active rate and average daily events/user over the last 28 days; c) adjusts for partial observation windows for users who signed up within the window; d) produces a final table with date, cohort, active_users, total_users_observed, active_rate_7d, avg_events_per_user. Use window functions (e.g., partition by user, rolling windows) and avoid double-counting users across cohorts. 3) Extend your query to stratify by country and then produce a cohort-level weighted average controlling for country mix. 4) Briefly note two bias risks (e.g., survivorship, seasonality) and one SQL-side mitigation you implemented. Schema (you may add a small date calendar CTE if needed): users(user_id INT, signup_date DATE, country STRING) events(user_id INT, event_date DATE, event_type STRING) Sample rows: users user_id | signup_date | country 1 | 2025-08-15 | US 2 | 2025-06-10 | US 3 | 2025-08-30 | CA 4 | 2025-07-01 | IN 5 | 2025-08-20 | US

events user_id | event_date | event_type 1 | 2025-08-29 | view 1 | 2025-09-01 | message 2 | 2025-08-25 | like 2 | 2025-08-31 | view 3 | 2025-09-01 | view 4 | 2025-08-28 | view 5 | 2025-08-31 | comment

Comments (0)

Sign in to leave a comment

Loading comments...

Browse More Questions

More Data Manipulation (SQL/Python)•More Meta•More Data Scientist•Meta Data Scientist•Meta Data Manipulation (SQL/Python)•Data Scientist Data Manipulation (SQL/Python)
PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.