Today is 2025-09-01. We need the 14-day conversion rate (CVR14) for impressions served between 2025-08-18 and 2025-09-01. Conversions arrive with an unknown delay of up to 14 days, so impressions whose 14-day window has not yet closed are right-censored. Do not assume any parametric delay distribution.
Tasks:
- Propose a nonparametric estimator for CVR14 that uses historical cohorts to learn the time-to-convert survival function and applies it to the current, partially observed cohort (e.g., Kaplan–Meier for conversion delay with right-censoring, then inverse-probability weighting to debias the observed-to-date converts). Write formulas for the estimator and indicate the data each term uses.
- Construct a 95% confidence interval using Greenwood’s formula for the KM variance and the delta method for the transformed CVR, stating assumptions. Explain how you would widen intervals if you suspect non-stationarity of delays.
- Provide a distribution-free conservative bound for CVR14 that makes minimal assumptions (e.g., DKW inequality on the empirical CDF of delays or Clopper–Pearson on observed conversions plus a worst-case bound for yet-unfinished impressions). Show how to compute it from raw counts available today.
- Describe diagnostics to check whether the historical delay distribution is applicable now (e.g., test for covariate shift with PSI or KS tests on traffic mix, day-of-week effects, or device splits) and how to stratify or reweight if shift is detected.
- If you can observe only aggregated daily counts of impressions and same-day conversions (no user-level data), outline an identifiable approach and the additional assumptions required to estimate or bound CVR14.
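
As a starting point for the first two tasks, here is a minimal sketch of a Kaplan–Meier estimate of CVR14 with a Greenwood standard error and a plain normal-approximation interval. The function names (`km_cvr14`, `normal_ci`) and the input layout are hypothetical: it assumes user-level data with one observed time per impression (days to conversion if converted, otherwise days of follow-up so far) and that censoring is independent of the conversion delay. It is not a full answer: the inverse-probability weighting step, the delta-method transform, and any widening for non-stationarity are left out.

```python
import math
from collections import Counter

def km_cvr14(times, converted, horizon=14.0):
    """Return (cvr_hat, std_err): 1 - KM survival at `horizon` and its
    Greenwood standard error.

    times[i]     : days to conversion if converted[i], else days of follow-up
    converted[i] : True if impression i converted (event), False if censored
    """
    events = Counter(t for t, c in zip(times, converted) if c and t <= horizon)
    exits = Counter(times)      # every impression leaves the risk set at its time
    at_risk = len(times)        # number still under observation just before t
    surv, gw_sum = 1.0, 0.0
    for t in sorted(exits):
        if t > horizon:
            break
        d = events.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            if at_risk > d:     # Greenwood term is undefined when all at risk convert
                gw_sum += d / (at_risk * (at_risk - d))
        at_risk -= exits[t]     # ties: censored units count as at risk at t
    return 1.0 - surv, surv * math.sqrt(gw_sum)

def normal_ci(estimate, std_err, z=1.96):
    """Symmetric normal-approximation interval, clipped to [0, 1]."""
    return max(0.0, estimate - z * std_err), min(1.0, estimate + z * std_err)

# Toy data (made up for illustration): 3 converters, 7 non-converters
# followed for the full 14 days.
cvr, se = km_cvr14([1, 3, 3] + [14] * 7, [True] * 3 + [False] * 7)
low, high = normal_ci(cvr, se)
```

With no censoring before the horizon, as in the toy data, the estimate reduces to the raw conversion fraction and the Greenwood standard error reduces to the binomial one, which is a convenient sanity check before applying it to partially observed cohorts.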