PracHub

Answer basic probability and statistics questions

Last updated: Mar 29, 2026

Quick Overview

This question evaluates core probability and statistics competencies, including interpretation of Poisson rate parameters from PMF shapes, identification of study bias types (such as survivorship or sampling bias), and properties of covariance and Pearson correlation, framed for a Machine Learning / Data Scientist context.

Company: Morgan Stanley

Role: Data Scientist

Category: Machine Learning

Difficulty: medium

Interview Round: Take-home Project


Aug 23, 2025, 12:00 AM

You are given several short, independent probability and statistics questions similar to those in a data / ML screening test. Answer all sub-questions.

1. Ordering Poisson distributions by their rate parameter

You are told that three different Poisson-distributed random variables \(X_A, X_B, X_C\) have their probability mass functions (PMFs) plotted on the same graph (support on non-negative integers). The plots are described as follows:

  • Distribution A: Most of its probability mass is concentrated on values 0, 1, and 2. The mode (highest bar) is at 1. The probability at 0 is high, and probabilities drop off quickly after 3.
  • Distribution B: The mode is at 3. The distribution is more spread out than A: there is still noticeable probability up to around 6 or 7, but very little after that.
  • Distribution C: The mode is at 5. The distribution is the most spread out of the three, with noticeable probability from around 2 up to 10 or more.

All three are Poisson distributions with parameters \(\lambda_A\), \(\lambda_B\), and \(\lambda_C\), respectively.

Question 1: Based on the qualitative description of the plots, order the three rate parameters from smallest to largest:

  • (a) \(\lambda_A\), \(\lambda_B\), \(\lambda_C\)
  • (b) \(\lambda_C\), \(\lambda_B\), \(\lambda_A\)
  • (c) \(\lambda_A\), \(\lambda_C\), \(\lambda_B\)
  • (d) \(\lambda_B\), \(\lambda_A\), \(\lambda_C\)

Pick the correct ordering and briefly justify your choice.
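One way to check an ordering is to recall that a Poisson distribution has mean and variance both equal to \(\lambda\), and that for non-integer \(\lambda\) its mode is \(\lfloor \lambda \rfloor\). A larger mode and a wider spread therefore both point to a larger rate. The sketch below is only an illustration: the rates 1.5, 3.5, and 5.5 are assumed values chosen to reproduce the described modes (1, 3, 5), not values given in the question, and `poisson_pmf` and `poisson_mode` are hypothetical helper names.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_mode(lam: float, k_max: int = 50) -> int:
    """Smallest k in [0, k_max] that maximizes the PMF."""
    return max(range(k_max + 1), key=lambda k: poisson_pmf(k, lam))

# Assumed rates, picked only so the modes match the description of A, B, C.
illustrative_rates = {"A": 1.5, "B": 3.5, "C": 5.5}

modes = {name: poisson_mode(lam) for name, lam in illustrative_rates.items()}
spreads = {name: math.sqrt(lam) for name, lam in illustrative_rates.items()}

for name, lam in illustrative_rates.items():
    print(f"{name}: lambda={lam}, mode={modes[name]}, std={spreads[name]:.2f}")
```

Under these assumed rates, the mode and the standard deviation both increase from A to B to C, which is consistent with ordering (a).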

2. Identifying type of bias in a study

A startup incubator wants to understand “what makes startups successful.” They collect data only from companies that have already raised Series C or later funding and are still operating. They analyze features such as team size, prior founder experience, average age of founders, and industry, and then publish a report claiming: “These are the characteristics that make startups successful.”

You are asked: What is the primary type of bias in this study? Choose the best option and briefly explain.

  • (a) Survivor (survivorship) bias
  • (b) Sampling bias (non-representative sample)
  • (c) Recall bias
  • (d) No bias; the study design is appropriate

(You may mention more than one kind of bias if relevant, but identify the primary/statistically standard name.)
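The design described above keeps only companies that survived to Series C, which is usually called survivorship bias, a special case of sampling (selection) bias. A small simulation can show why this matters. In the sketch below, every number is a hypothetical assumption: the feature "founder has prior experience" is generated independently of survival, so it has no effect on success at all.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

N_STARTUPS = 100_000   # hypothetical population size
P_EXPERIENCED = 0.7    # hypothetical share of founders with prior experience
P_SURVIVE = 0.05       # hypothetical survival rate, independent of the feature

# Each startup: (has_experienced_founder, survived_to_series_c)
startups = [
    (rng.random() < P_EXPERIENCED, rng.random() < P_SURVIVE)
    for _ in range(N_STARTUPS)
]

def share_experienced(rows):
    """Fraction of rows whose founder has prior experience."""
    return sum(1 for experienced, _ in rows if experienced) / len(rows)

survivors = [row for row in startups if row[1]]
failures = [row for row in startups if not row[1]]

share_among_survivors = share_experienced(survivors)
share_among_failures = share_experienced(failures)

print(f"experienced founders among survivors: {share_among_survivors:.2f}")
print(f"experienced founders among failures:  {share_among_failures:.2f}")
```

A report built only from `survivors` would see that most successful startups have experienced founders and might call that a success factor, even though the same share appears among `failures`. Without the failed companies as a comparison group, the study cannot separate traits of successful startups from traits of startups in general.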

3. True/False questions on covariance and correlation

For each of the following statements about covariance and correlation between two real-valued random variables \(X\) and \(Y\), answer True or False and provide a brief justification.

  1. Statement A: If \(\text{Cov}(X, Y) = 0\), then \(X\) and \(Y\) are independent.
  2. Statement B: The Pearson correlation coefficient \(\rho_{XY}\) is always between \(-1\) and \(1\), inclusive.
  3. Statement C: The Pearson correlation coefficient between \(X\) and \(Y\) is given by

\[ \rho_{XY} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}, \]

where \(\sigma_X\) and \(\sigma_Y\) are the standard deviations of \(X\) and \(Y\), respectively.

  4. Statement D: If we rescale \(X\) by a positive constant \(a > 0\), i.e., define \(X' = aX\), then the correlation between \(X'\) and \(Y\) is the same as the correlation between \(X\) and \(Y\).
  5. Statement E: A very high correlation between \(X\) and \(Y\) implies that \(X\) causes changes in \(Y\).

State True/False for each and justify in one or two sentences.
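Two of these statements can be checked numerically. The sketch below uses small made-up samples and hypothetical helper functions `covariance` and `correlation` (population versions, following the formula in Statement C). It shows a pair with \(Y = X^2\) that is fully dependent yet has zero covariance, which is a counterexample to Statement A, and it checks that rescaling by an assumed factor \(a = 3\) leaves the correlation unchanged, as Statement D claims.

```python
import math

def covariance(xs, ys):
    """Population covariance of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Pearson correlation: Cov(X, Y) / (sigma_X * sigma_Y)."""
    sigma_x = math.sqrt(covariance(xs, xs))
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

# Statement A: Y is a deterministic function of X, yet the covariance is zero.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
cov_xy = covariance(x, y)

# Statement D: rescaling by a > 0 leaves the correlation unchanged.
u = [1, 2, 3, 4, 5]
w = [2, 1, 4, 3, 5]
a = 3.0  # hypothetical positive scale factor
rho_original = correlation(u, w)
rho_rescaled = correlation([a * v for v in u], w)

print(f"Cov(X, X^2) = {cov_xy}")
print(f"rho(U, W) = {rho_original:.4f}, rho(aU, W) = {rho_rescaled:.4f}")
```

The invariance holds because the factor \(a\) multiplies both \(\text{Cov}(X, Y)\) and \(\sigma_X\), so it cancels in the ratio. Note that the formula is only defined when both standard deviations are nonzero.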
