PracHub
QuestionsPremiumLearningGuidesCheatsheetNEWCoaches

Quick Overview

This question evaluates the ability to manipulate event-level logs to compute cohort metrics—specifically signup rate and D30 retention—testing proficiency in SQL/Python, event-time windowing, joins, aggregation, and understanding of upstream logging semantics.

  • easy
  • OpenAI
  • Data Manipulation (SQL/Python)
  • Data Scientist

Compute signup rate and retention from raw logs

Company: OpenAI

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: easy

Interview Round: Technical Screen

## Scenario You are analyzing an A/B test for a marketing campaign offering **a free 1-month trial**. You are given raw “upstream” tables that resemble production event logs rather than pre-aggregated metrics. Assume **UTC timestamps** and that each user has **at most one assignment**. ## Tables (upstream) ### `experiment_assignment` - `user_id` (STRING, PK) - `experiment_id` (STRING) - `variant` (STRING; 'control' or 'treatment') - `assigned_at` (TIMESTAMP) ### `offer_exposure_events` - `user_id` (STRING) - `exposed_at` (TIMESTAMP) - `campaign_id` (STRING) ### `subscription_events` - `user_id` (STRING) - `event_time` (TIMESTAMP) - `event_type` (STRING; one of 'trial_start', 'paid_start', 'cancel') ### `app_sessions` - `user_id` (STRING) - `session_start` (TIMESTAMP) ## Metric definitions 1) **Signup rate (trial start within 7 days)** - Denominator: users assigned to the experiment. - Numerator: users with a `trial_start` event where `event_time` is in `[assigned_at, assigned_at + 7 days)`. 2) **D30 activity retention** Among users who started a trial within 7 days, the user is retained if they have **≥ 1 app session** with `session_start` in `[assigned_at + 30 days, assigned_at + 37 days)`. ## Task Write SQL to output one row per variant with: - `variant` - `assigned_users` - `signup_users_7d` - `signup_rate_7d` - `retained_users_d30` - `retention_rate_d30` Also briefly describe what an “upstream” dataset/logging pipeline must contain to compute these metrics reliably (e.g., assignment logs, exposure logs, subscription lifecycle events, identity consistency).

Quick Answer: This question evaluates the ability to manipulate event-level logs to compute cohort metrics—specifically signup rate and D30 retention—testing proficiency in SQL/Python, event-time windowing, joins, aggregation, and understanding of upstream logging semantics.

Last updated: Mar 29, 2026

Loading coding console...

PracHub

Master your tech interviews with 7,500+ real questions from top companies.

Product

  • Questions
  • Learning Tracks
  • Interview Guides
  • Resources
  • Premium
  • For Universities
  • Student Access

Browse

  • By Company
  • By Role
  • By Category
  • Topic Hubs
  • SQL Questions
  • Compare Platforms
  • Discord Community

Support

  • support@prachub.com
  • (916) 541-4762

Legal

  • Privacy Policy
  • Terms of Service
  • About Us

© 2026 PracHub. All rights reserved.

Related Coding Questions

  • Write SQL for repeat churn - OpenAI (hard)
  • Handle repeated churn in SQL - OpenAI (hard)
  • Compute churn with re-subscriptions - OpenAI (hard)
  • Debug and harden trial-assignment Python code - OpenAI (Medium)
  • Write SQL for post-trial conversion cohorts - OpenAI (Medium)