Write SQL for influence score and follower growth
Company: Roblox
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: easy
Interview Round: Technical Screen
You are working on a social product with these tables:
### Tables / Schemas
**users**
- `user_id` BIGINT (PK)
- `created_at` TIMESTAMP
**posts**
- `post_id` BIGINT (PK)
- `author_id` BIGINT (FK → users.user_id)
- `created_at` TIMESTAMP
**engagement** (one row per engagement event)
- `engagement_id` BIGINT (PK)
- `post_id` BIGINT (FK → posts.post_id)
- `viewer_id` BIGINT (FK → users.user_id)
- `event_type` STRING (e.g., `like`, `comment`, `share`)
- `created_at` TIMESTAMP
**follows** (directed edge)
- `follower_id` BIGINT (FK → users.user_id)
- `followee_id` BIGINT (FK → users.user_id)
- `created_at` TIMESTAMP (when the follow happened)
Assume timestamps are in UTC, and “month” means `DATE_TRUNC('month', created_at)`.
---
## 1) Influence score
Define a user’s **influence score** as:
\[
\text{influence_score}(u) = \#\text{engagement events received on u's posts} + \#\text{followers of u}
\]
Write a SQL query returning:
- `user_id`
- `influence_score`
Notes:
- Engagement “received” means engagement rows on posts where `posts.author_id = user_id`.
- Followers count is the number of distinct `follower_id` where `followee_id = user_id`.
---
## 2) Monthly follower growth
For each user (as the followee), compute monthly follower growth metrics.
Write a SQL query that outputs, for each `followee_id` and month:
- `month`
- `followee_id`
- `new_followers` (distinct new followers gained that month)
- `prev_month_new_followers` (previous month’s `new_followers` using `LAG`)
- `mom_growth` (month-over-month growth rate: `(new_followers - prev_month_new_followers) / prev_month_new_followers`, handling divide-by-zero)
- `cumulative_followers` (running total across months using a window `SUM`)
Assume unfollows are not tracked (so follower counts only increase).
Quick Answer: This question evaluates SQL data manipulation competencies such as joins, aggregations, distinct counts, window functions, and time-based grouping for computing an influence score (engagements on a user's posts plus follower count) and monthly follower growth, and it falls under Data Manipulation (SQL/Python).