You are interviewing for a Data Scientist role at a fintech payments company similar to Stripe.
The company is launching a product called Capital, which offers pre-qualified loans to existing merchants on the platform. Eligible merchants are selected using the company's internal merchant data. Repayment is collected automatically as 12% of the merchant's daily processed revenue until the contractual repayment amount has been collected in full.
Assume you have access to merchant transaction history, loan offers, acceptance data, repayment outcomes, disputes and refunds, merchant attributes, and loan-level financial data.
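To make the repayment mechanic concrete, here is a minimal sketch of how collection could play out under the rule described above (12% of daily processed revenue until the repayment amount is reached). The function name `simulate_repayment`, the use of integer cents and basis points, the rounding-down of each day's collection, and all dollar amounts are illustrative assumptions, not details given in the scenario.

```python
from typing import Dict, List


def simulate_repayment(
    daily_revenue_cents: List[int],
    repayment_amount_cents: int,
    holdback_bps: int = 1200,  # 12% expressed in basis points (from the scenario)
) -> Dict[str, object]:
    """Collect a fixed share of each day's revenue until the balance is cleared.

    Amounts are integer cents to avoid floating-point drift. Rounding each
    day's collection down to the nearest cent is an assumption.
    """
    remaining = repayment_amount_cents
    days_elapsed = 0
    for revenue in daily_revenue_cents:
        if remaining == 0:
            break
        # Days with zero (or negative, e.g. refund-heavy) revenue collect nothing.
        due = max(revenue, 0) * holdback_bps // 10_000
        payment = min(due, remaining)
        remaining -= payment
        days_elapsed += 1
    return {
        "days_elapsed": days_elapsed,
        "total_collected_cents": repayment_amount_cents - remaining,
        "remaining_cents": remaining,
        "fully_repaid": remaining == 0,
    }


# Hypothetical merchants owing the same $1,200 repayment amount.
steady = simulate_repayment([100_000] * 30, 120_000)   # $1,000/day
slower = simulate_repayment([50_000] * 30, 120_000)    # $500/day
stalled = simulate_repayment([50_000] * 15, 120_000)   # revenue history ends early
```

Under these assumptions the same contractual amount takes twice as long to collect from the merchant with half the daily revenue, and a merchant whose revenue stops is neither delinquent on a schedule nor fully repaid. This is why the definition of a bad outcome and the role of repayment duration both need explicit thought in the questions that follow.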
Answer the following:
- If you were building a dashboard for Capital, what metrics would you include?
  - Be explicit about primary success metrics, guardrail metrics, and how you would segment the dashboard.
  - Consider tradeoffs between growth, repayment performance, portfolio risk, merchant health, and long-term unit economics.
- How would you decide that a merchant should not receive a pre-qualified loan offer?
  - What early warning signals or predictive features would you gather?
  - How would you define a bad outcome in this setting, given that repayment is tied to daily revenue rather than a fixed installment schedule?
  - Discuss issues such as model calibration, selection bias, and policy rules versus predictive models.
- Should the company offer multiple loan options to each merchant, or a single loan amount?
  - Discuss the pros and cons from the perspectives of conversion, merchant experience, self-selection, adverse selection, risk management, and operational complexity.
  - If you wanted to test this, how would you design the experiment and choose the evaluation metrics?
- Suppose Capital profit starts to decrease. How would you diagnose the problem?
  - Provide a structured framework to decompose profit changes.
  - Consider changes in funnel conversion, merchant mix, credit quality, repayment duration, funding cost, pricing, collections, macro conditions, and data quality.
  - Explain how cohort analysis or segmentation could help avoid misleading aggregate conclusions.