PracHub

Design a customer LTV prediction system

Last updated: Jun 18, 2026

Quick Overview

This question tests end-to-end ML system design: business-driven label definition, feature engineering with point-in-time correctness, cold-start strategies, modeling and uncertainty estimation, temporal training and validation, evaluation metrics, and production serving and monitoring. Interviewers use it to assess whether an engineer can translate business LTV requirements into a robust, production-ready ML solution that handles censoring, non-stationarity, and operational constraints, covering both the trade-offs involved and the engineering patterns for training, validation, and serving.

Company: Airbnb

Role: Software Engineer

Category: ML System Design

Difficulty: hard

Interview Round: Onsite

Design an end-to-end machine learning system to estimate customer lifetime value (LTV) on a platform. Define what LTV means for the business (e.g., revenue, margin, or contribution after costs) and the prediction horizon. Describe data sources, feature pipelines, handling cold-start users, label construction, modeling approach (e.g., survival analysis, churn/retention modeling, purchase frequency and monetary value), training/validation splits, and evaluation metrics. Propose the offline/online architecture for batch scoring and near-real-time updates, including data freshness, backfills, monitoring, and alerting. Explain how scores feed downstream decisions (marketing, incentives, recommendations), how you would run experiments to measure impact, and how you would address bias, privacy, and regulatory concerns. If time is limited, you may skip detailed online serving.

Aug 12, 2025, 12:00 AM
System Design: End-to-End ML for Customer Lifetime Value (LTV)

Context

You are designing an end-to-end machine learning system to estimate customer lifetime value (LTV) for a large two-sided marketplace platform. Assume we are focusing on the demand side (guest/customer LTV) unless you prefer to discuss both sides; state your scope explicitly.

Requirements

Define and design the full stack from business definition and labels through modeling, evaluation, and serving. Cover the following:

  1. Business Definition
  • Precisely define LTV for this business (e.g., revenue, gross margin, contribution after variable costs). Specify which costs are included/excluded.
  • Specify the prediction horizon (e.g., 6, 12, or 24 months) and whether to discount future cash flows. State the discount rate if used.
  • Clarify scope (e.g., guest LTV only) and any exclusions (e.g., fraudulent activity, chargebacks).
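One way to make the business definition concrete is a discounted sum of monthly contribution margin over the prediction horizon. The sketch below is illustrative only: the function name `discounted_ltv`, the monthly granularity, and the conversion of an annual discount rate to a monthly one are assumptions, not part of the question.

```python
from typing import Sequence


def discounted_ltv(monthly_contribution: Sequence[float],
                   annual_discount_rate: float = 0.0) -> float:
    """Sum contribution margin over the horizon, discounting month t by (1 + r_m)^t.

    monthly_contribution[0] is month 1 after the snapshot date (hypothetical layout).
    """
    # Convert the annual rate to an equivalent compounding monthly rate.
    monthly_rate = (1.0 + annual_discount_rate) ** (1.0 / 12.0) - 1.0
    return sum(
        cm / (1.0 + monthly_rate) ** t
        for t, cm in enumerate(monthly_contribution, start=1)
    )


# A 12-month horizon with a single cash flow in month 12, discounted at 10% per year.
value = discounted_ltv([0.0] * 11 + [110.0], annual_discount_rate=0.10)
```

With a zero discount rate the function reduces to a plain sum, which makes it easy to compare discounted and undiscounted definitions side by side.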
  2. Data and Features
  • Enumerate data sources: bookings/transactions, cancellations/refunds, payments/fees, marketing touchpoints, user profiles/consents, search/browse events, messaging/funnel, support interactions, risk decisions, incentives, and cost tables.
  • Describe feature pipelines: aggregation windows (e.g., 7/30/90/365 days), RFM-style features, recency of activity, seasonality, geo/device, marketing channel, quality signals, and marketplace context (e.g., supply-demand).
  • Ensure point-in-time correctness and prevent leakage (e.g., event-time joins, freeze windows). Address identity resolution and PII handling.
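Point-in-time correctness for windowed features can be sketched as follows: every aggregate uses only events whose event time is strictly before the snapshot date. The `Booking` record, the `window_features` function, and the feature names are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta
from typing import Dict, NamedTuple, Optional, Sequence


class Booking(NamedTuple):
    user_id: str
    event_date: date   # when the booking happened (event time, not load time)
    amount: float


def window_features(bookings: Sequence[Booking], user_id: str, as_of: date,
                    windows: Sequence[int] = (7, 30, 90, 365)) -> Dict[str, Optional[float]]:
    """Trailing-window counts and spend using only events strictly before `as_of`."""
    feats: Dict[str, Optional[float]] = {}
    mine = [b for b in bookings if b.user_id == user_id and b.event_date < as_of]
    for days in windows:
        start = as_of - timedelta(days=days)
        in_window = [b for b in mine if b.event_date >= start]
        feats[f"bookings_{days}d"] = len(in_window)
        feats[f"spend_{days}d"] = sum(b.amount for b in in_window)
    # Recency: days since the most recent booking visible at the snapshot.
    feats["days_since_last_booking"] = (
        (as_of - max(b.event_date for b in mine)).days if mine else None
    )
    return feats


history = [
    Booking("u1", date(2024, 1, 1), 100.0),
    Booking("u1", date(2024, 3, 1), 50.0),
    Booking("u1", date(2024, 3, 15), 999.0),  # after the snapshot: must not leak
    Booking("u2", date(2024, 3, 5), 70.0),
]
features = window_features(history, "u1", as_of=date(2024, 3, 10))
```

The same `as_of` date should drive both feature computation and label construction so that training rows match what would have been known at scoring time.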
  3. Cold-Start Strategy
  • Explain how to score new or nearly new users (no bookings or very sparse history). Consider priors, hierarchical grouping, and context-based features.
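A simple way to illustrate the "priors" idea is shrinkage: blend a user's own average with a segment-level average, trusting the user's data more as observations accumulate. This is a minimal sketch, not a prescribed method; `shrunk_estimate`, the segment definition, and `prior_strength` are assumptions made for illustration.

```python
def shrunk_estimate(user_total: float, user_n: int, segment_mean: float,
                    prior_strength: float = 5.0) -> float:
    """Blend the user's mean value per booking with the segment mean.

    The user's own mean gets weight n / (n + prior_strength); a user with no
    history falls back entirely to the segment prior (e.g., channel x market).
    """
    if user_n == 0:
        return segment_mean
    weight = user_n / (user_n + prior_strength)
    user_mean = user_total / user_n
    return weight * user_mean + (1.0 - weight) * segment_mean


brand_new = shrunk_estimate(user_total=0.0, user_n=0, segment_mean=100.0)
sparse = shrunk_estimate(user_total=500.0, user_n=5, segment_mean=50.0)
```

In practice the prior strength would be tuned or estimated from data, and the segment hierarchy could have several levels.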
  4. Label Construction
  • Define the target formula precisely, including how to handle cancellations, refunds, incentives, and payment processing costs.
  • Discuss horizon alignment, censoring (users without full observation windows), and maturity/freeze windows for late-arriving data.
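The label rules above can be sketched as a function that returns a net-contribution label only when the full horizon plus a maturity buffer has been observed, and marks the row as censored otherwise. The ledger fields, the formula (fee minus refunds, incentives, and processing cost), and the 30-day maturity window are hypothetical choices for illustration; a real definition would come from the business.

```python
from datetime import date, timedelta
from typing import NamedTuple, Optional, Sequence


class LedgerEntry(NamedTuple):
    event_date: date
    service_fee: float       # platform revenue from the booking
    refund: float            # portion of the fee refunded to the guest
    incentive: float         # coupons/credits funded by the platform
    processing_cost: float   # payment processing fees


def ltv_label(entries: Sequence[LedgerEntry], snapshot: date, horizon_days: int,
              data_cutoff: date, maturity_days: int = 30) -> Optional[float]:
    """Net contribution in [snapshot, snapshot + horizon), or None if censored."""
    window_end = snapshot + timedelta(days=horizon_days)
    # Require the horizon plus a maturity buffer for late refunds/chargebacks.
    if window_end + timedelta(days=maturity_days) > data_cutoff:
        return None
    return sum(
        e.service_fee - e.refund - e.incentive - e.processing_cost
        for e in entries
        if snapshot <= e.event_date < window_end
    )


ledger = [
    LedgerEntry(date(2022, 12, 31), 50.0, 0.0, 0.0, 1.0),   # before snapshot
    LedgerEntry(date(2023, 2, 1), 30.0, 0.0, 5.0, 1.0),
    LedgerEntry(date(2023, 6, 1), 20.0, 20.0, 0.0, 0.5),    # fully refunded
    LedgerEntry(date(2024, 2, 1), 100.0, 0.0, 0.0, 2.0),    # after horizon
]
label = ltv_label(ledger, date(2023, 1, 1), 365, data_cutoff=date(2024, 3, 1))
```

Rows that return `None` can either be dropped from training or handled with a censoring-aware model, depending on the modeling approach chosen.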
  5. Modeling Approach
  • Propose and justify a modeling strategy (e.g., survival/retention modeling, purchase frequency and monetary value decomposition, count models, direct regression, or a mixture of these).
  • Describe uncertainty estimation and calibration, if applicable.
  6. Training/Validation
  • Specify temporal train/validation/test splits (rolling windows/backtesting). Address class/label imbalance and non-stationarity.
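Rolling-window backtesting can be sketched as a generator of folds in which every training snapshot's label window closes on or before the test snapshot, so no label information from the test period leaks into training. The function name `rolling_folds` and its parameters are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple


def rolling_folds(first_test: date, last_test: date, horizon_days: int,
                  train_span_days: int, step_days: int) -> List[Tuple[date, date, date]]:
    """Return (train_start, train_end, test_snapshot) tuples.

    Training snapshots lie in [train_start, train_end]; train_end is one full
    horizon before the test snapshot, so training labels are fully observed.
    """
    folds = []
    test_snapshot = first_test
    while test_snapshot <= last_test:
        train_end = test_snapshot - timedelta(days=horizon_days)
        train_start = train_end - timedelta(days=train_span_days)
        folds.append((train_start, train_end, test_snapshot))
        test_snapshot += timedelta(days=step_days)
    return folds


folds = rolling_folds(date(2024, 1, 1), date(2024, 3, 1),
                      horizon_days=365, train_span_days=180, step_days=30)
```

Comparing metrics across folds also gives a rough view of non-stationarity: a steady decline from older to newer folds suggests the model needs more frequent retraining or recency weighting.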
  7. Evaluation Metrics
  • Include regression error (e.g., MAE/RMSE/sMAPE), ranking/segment metrics (e.g., decile lift, top-k capture), calibration, and business metrics (profit at policy).
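Two of the metrics named above, MAE and top-k capture (with the lift derived from it), can be sketched in a few lines. The function names and the tie-breaking behavior (Python's stable sort) are illustrative assumptions.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between realized and predicted LTV."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def top_k_capture(y_true: Sequence[float], y_pred: Sequence[float],
                  fraction: float = 0.1) -> float:
    """Share of total realized value held by the top `fraction` of users by predicted LTV."""
    n_top = max(1, int(len(y_true) * fraction))
    order = sorted(range(len(y_pred)), key=lambda i: y_pred[i], reverse=True)
    total = sum(y_true)
    captured = sum(y_true[i] for i in order[:n_top])
    return captured / total if total else 0.0


def lift(y_true: Sequence[float], y_pred: Sequence[float], fraction: float = 0.1) -> float:
    """Capture relative to a random ranking, which would capture `fraction` on average."""
    return top_k_capture(y_true, y_pred, fraction) / fraction


actual = [0, 0, 0, 0, 0, 0, 0, 0, 50, 50]
predicted = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
capture_top_decile = top_k_capture(actual, predicted, fraction=0.1)
```

Ranking metrics like these are often more decision-relevant than pointwise error when the score is used to select a top segment for targeting.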
  8. Serving Architecture
  • Propose offline/online architecture for batch scoring and near-real-time updates.
  • Cover data freshness SLAs, snapshotting/backfills, point-in-time correctness, and monitoring/alerting (data quality, drift, performance, business KPIs).
  • If time is limited, you may skip detailed online serving.
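For the drift-monitoring requirement, one commonly used statistic is the Population Stability Index (PSI), which compares the binned distribution of a feature or score between a reference period and the current period. The sketch below assumes non-empty inputs and fixed, pre-chosen bin edges; the function name, the epsilon floor, and any alert threshold are assumptions to be tuned.

```python
import math
from typing import List, Sequence


def population_stability_index(expected: Sequence[float], actual: Sequence[float],
                               bin_edges: Sequence[float], eps: float = 1e-6) -> float:
    """PSI = sum over bins of (a - e) * ln(a / e), with e and a the bin shares."""

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            # Bucket index = number of edges the value meets or exceeds.
            counts[sum(v >= edge for edge in bin_edges)] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference_scores = [1.0, 2.0, 3.0, 4.0]
todays_scores = [3.0, 4.0, 5.0, 6.0]
psi = population_stability_index(reference_scores, todays_scores, bin_edges=[2.5])
```

A monitoring job could compute this per feature and per score on each batch run and page the owning team when the value crosses an agreed threshold.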
  9. Downstream Use Cases and Experimentation
  • Explain how scores feed decisions (e.g., marketing budget/CPA bidding, incentives, recommendations/ranking, CRM).
  • Outline experimentation to measure impact, including interference/marketplace considerations.
  10. Risk, Bias, Privacy, and Compliance
  • Discuss how you would address model bias/fairness, privacy (consent, minimization, deletion), and regulatory requirements (e.g., GDPR/CCPA).
