This question evaluates the ability to compute cohort-based early engagement metrics and Day-30 retention using time-windowed joins, aggregation, bucketing, and handling edge cases such as users with zero or pre-signup transactions.
You are given two tables that track signups and user transactions.
usersuser_id
(STRING,
PK
) — unique user
signup_ts
(TIMESTAMP) — time the user signed up (assume UTC)
transactionstransaction_id
(STRING,
PK
)
user_id
(STRING,
FK → users.user_id
)
transaction_ts
(TIMESTAMP) — time of transaction (UTC)
amount
(NUMERIC) — transaction amount
You want an early signal that predicts Day-30 retention.
early_txn_cnt
= count of
transactions
where
transaction_ts >= signup_ts
and
transaction_ts < signup_ts + 7 days
.
is_retained_d30
∈ {0,1}
early_txn_cnt
into:
0
,
1
,
2
,
3+
Return one row per bucket with:
early_txn_bucket
users_in_bucket
retained_users
retention_rate
(=
retained_users
/
users_in_bucket
, as a decimal)
signup_ts
, ignore them.