PracHub

Design RL-based spending limit policy

Last updated: Mar 29, 2026

Quick Overview

This question evaluates proficiency in designing reinforcement-learning-based decision systems for per-user spending limits, examining MDP formulation, state/action/reward specification, safety constraints, off-policy evaluation, and deployment considerations within the payments and risk domain.

  • Difficulty: hard
  • Company: PayPal
  • Category: ML System Design
  • Role: Machine Learning Engineer

Interview Round: Onsite

Set per-user spending limits using reinforcement learning. Define the environment: the state representation (customer, risk, and context features), the action space (limit adjustments), the transition dynamics, and the reward signal (e.g., profit, credit losses, user satisfaction, regulatory penalties). Explain the training approach (offline RL or contextual bandits trained on logged data), the exploration strategy under risk constraints, off-policy evaluation, safety guardrails, and how to handle cold start and non-stationarity.
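One possible way to make the action space concrete is a small discrete set of limit adjustments with a per-user action mask. The sketch below is illustrative only: the multipliers in `ADJUSTMENTS`, the `UserState` fields, and the thresholds `high_risk` and `new_account_days` are all assumptions, not part of the question.

```python
from dataclasses import dataclass

# Hypothetical discrete action set: multiplicative adjustments to the current limit.
ADJUSTMENTS = (0.5, 0.8, 1.0, 1.2, 1.5)


@dataclass(frozen=True)
class UserState:
    current_limit: float    # current spending limit, in account currency
    risk_score: float       # assumed model output in [0, 1]; higher is riskier
    account_age_days: int   # context feature used for cold-start caution
    regulatory_cap: float   # hard upper bound on the limit for this user


def action_mask(state, high_risk=0.7, new_account_days=30):
    """Return one boolean per entry in ADJUSTMENTS; True means the action is allowed."""
    mask = []
    for mult in ADJUSTMENTS:
        proposed = state.current_limit * mult
        # Hard constraint: never exceed the regulatory cap.
        allowed = proposed <= state.regulatory_cap
        is_increase = mult > 1.0
        # Assumed business rules: no increases for high-risk or very new accounts.
        if is_increase and state.risk_score >= high_risk:
            allowed = False
        if is_increase and state.account_age_days < new_account_days:
            allowed = False
        mask.append(allowed)
    return mask


def apply_action(state, action_index):
    """Return the new limit, refusing any masked action."""
    if not action_mask(state)[action_index]:
        raise ValueError("action is masked for this state")
    return state.current_limit * ADJUSTMENTS[action_index]
```

Masking at the action level keeps hard constraints outside the learned policy, so a poorly trained policy still cannot propose a disallowed limit.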

Sep 6, 2025, 12:00 AM

RL System Design: Per‑User Spending Limits

You are designing a reinforcement learning (RL) system to set per-user spending limits in a payments/risk context. The goal is to balance revenue and user experience against fraud/credit losses and regulatory compliance.
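The trade-off described above can be expressed as a scalarized per-step reward, shown here as a minimal sketch. The weights `loss_weight`, `friction_cost`, and `violation_penalty`, and the choice of a simple linear combination, are assumptions for illustration; a real system might instead treat losses or compliance as constraints rather than penalty terms.

```python
def step_reward(revenue, expected_loss, friction_events, compliance_violation,
                loss_weight=1.0, friction_cost=0.5, violation_penalty=1000.0):
    """Combine the reward components into a single scalar for one decision period."""
    reward = revenue - loss_weight * expected_loss - friction_cost * friction_events
    if compliance_violation:
        # Large fixed penalty so that violations dominate any revenue gain.
        reward -= violation_penalty
    return reward


def discounted_return(rewards, gamma=0.99):
    """Sum of gamma**t * rewards[t], computed backwards for numerical simplicity."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

With `gamma` close to 1 the policy is credited for long-run effects such as delayed defaults; setting `gamma` to 0 reduces the problem to a contextual bandit.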

Task

Define and justify the RL formulation and training/deployment approach:

  1. Environment/MDP
    • State representation: What customer, risk, and context features are included? How are they featurized and updated over time?
    • Action space: How are spending limit decisions represented (e.g., absolute limit vs. incremental adjustments; discrete vs. continuous)? Include any action masks.
    • Transition dynamics: What drives state evolution and partial observability? How does the policy influence future states and outcomes?
    • Reward signal: Specify the components (e.g., profit, expected credit/fraud losses, user satisfaction/friction, regulatory penalties) and how you aggregate/discount them.
  2. Training approach
    • Describe how to use logged historical decisions to train: offline RL vs. contextual bandits. When would you pick each?
  3. Exploration under risk constraints
    • Propose an exploration strategy that respects hard safety constraints while still learning.
  4. Off‑policy evaluation (OPE)
    • How will you evaluate candidate policies before online deployment, including sequential and bandit cases?
  5. Safety guardrails
    • Define policy- and system‑level controls that prevent harmful actions and enable safe rollout.
  6. Cold start
    • How will you handle new users or merchants with little or no history?
  7. Non‑stationarity
    • How will you detect and adapt to distribution shifts (seasonality, new fraud patterns, macro shocks)?
  8. Deployment
    • Outline a cautious rollout plan and real‑time monitoring for this RL system.
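For the bandit case of item 4, two standard off-policy estimators are inverse propensity scoring (IPS) and its self-normalized variant (SNIPS). The sketch below assumes logs of the form `(context, action, reward, logging_prob)` and a hypothetical `target_prob(context, action)` function for the candidate policy; the clipping value `clip` is also an assumption, trading some bias for lower variance.

```python
def ips_estimates(logs, target_prob, clip=10.0):
    """Estimate the candidate policy's average reward from logged bandit data.

    logs: iterable of (context, action, reward, logging_prob) tuples.
    target_prob: callable giving the candidate policy's probability of `action`.
    Returns (ips, snips).
    """
    weighted_sum = 0.0
    weight_sum = 0.0
    n = 0
    for context, action, reward, logging_prob in logs:
        if logging_prob <= 0.0:
            raise ValueError("logging propensity must be positive")
        # Importance weight, clipped to limit the variance from rare logged actions.
        w = min(target_prob(context, action) / logging_prob, clip)
        weighted_sum += w * reward
        weight_sum += w
        n += 1
    if n == 0:
        raise ValueError("no logged data")
    ips = weighted_sum / n
    # SNIPS divides by the total weight instead of the sample count.
    snips = weighted_sum / weight_sum if weight_sum > 0 else 0.0
    return ips, snips
```

Both estimators require that the logging policy recorded its action probabilities and gave nonzero probability to every action the candidate policy might take; sequential settings need per-trajectory weights or model-based estimators instead.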
