Compare queueing systems and common distributions
Company: LinkedIn
Role: Data Scientist
Category: Statistics & Math
Difficulty: medium
Interview Round: Technical Screen
##### Question
You are asked a series of statistics fundamentals questions in a data science technical screen.
1. **Queueing.** A bank with 5 tellers can be organized two ways:
- **System A:** all 5 tellers share **one common queue**.
- **System B:** each teller has **their own separate queue**.
Assume customer arrivals are random, tellers are similarly (but not necessarily identically) skilled, customers are served first-come-first-served within a line, and some customers take much longer than others. As a customer, which system would you rather join, and why? Compare the two in terms of **expected waiting time, variance of waiting time, and fairness.**
2. **Height distributions.** Sketch or describe the distribution of **adult male heights in the United States** and, separately, the distribution of **adult female heights in the United States.**
3. **Pooled distribution.** If you pool men and women together and ignore sex, what does the **combined height distribution** look like?
4. **Network degree.** On a social network such as LinkedIn, describe the distribution of the **number of connections per user.** Is it symmetric, left-skewed, or right-skewed?
5. **Summary statistics.** For that connections distribution, how do the **mean, median, and mode** compare, and why? If asked for the likely scale of the mean, what factors would determine it?
6. **Regularization bias.** Are the **L1-regularized (lasso)** and **L2-regularized (ridge)** estimators unbiased? Why or why not?
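
For the regularization question, a worked special case makes the source of the bias visible. The sketch below assumes a linear model $y = X\beta + \varepsilon$ with $E[\varepsilon] = 0$ and an orthonormal design ($X^\top X = I$). The penalty weight $\lambda > 0$ and the particular scaling of the two objectives are illustrative assumptions, not part of the question.

$$
\begin{aligned}
\hat\beta^{\text{ols}} &= X^\top y, \qquad E\big[\hat\beta^{\text{ols}}\big] = \beta \\
\hat\beta^{\text{ridge}} &= \arg\min_\beta \; \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2
  = \frac{\hat\beta^{\text{ols}}}{1 + \lambda}, \qquad
  E\big[\hat\beta^{\text{ridge}}\big] = \frac{\beta}{1 + \lambda} \neq \beta \\
\hat\beta^{\text{lasso}}_j &= \arg\min_\beta \; \tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda \|\beta\|_1
  \;\Rightarrow\;
  \hat\beta^{\text{lasso}}_j = \operatorname{sign}\big(\hat\beta^{\text{ols}}_j\big)\,\big(|\hat\beta^{\text{ols}}_j| - \lambda\big)_+
\end{aligned}
$$

In this setting ridge rescales every coefficient by $1/(1+\lambda)$ and lasso soft-thresholds each coefficient, so both pull estimates toward zero and are biased whenever $\beta \neq 0$ and $\lambda > 0$. Outside the orthonormal case the ridge expectation becomes $(X^\top X + \lambda I)^{-1} X^\top X \beta$, which still differs from $\beta$ in general.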
Quick Answer: This LinkedIn data scientist technical-screen question tests statistics fundamentals: queue pooling (one shared queue vs. separate queues, compared on expected wait, variance of wait, and fairness); approximately normal U.S. height distributions within each sex vs. the two-component mixture that results from pooling them; the right-skewed, heavy-tailed distribution of social-network connections, where typically mode < median < mean; and why L1 (lasso) and L2 (ridge) estimators are biased: the penalty shrinks coefficients toward zero, trading some bias for lower variance.
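
The pooling comparison for the bank-teller question can be checked with a small simulation. The sketch below is one way to set it up, under assumptions that are not in the question: identical tellers, Poisson arrivals at rate 4.0, a service-time mix of 90% short and 10% long jobs (overall mean 1.0), and customers in System B joining the line with the fewest people and never switching. The function names and the "overtaken" fairness count are hypothetical choices made for illustration.

```python
import heapq
import random
import statistics
from collections import deque


def make_customers(n, arrival_rate, rng):
    """Return (arrival_time, service_time) pairs, reused for both systems."""
    customers, t = [], 0.0
    for _ in range(n):
        t += rng.expovariate(arrival_rate)      # Poisson arrivals
        # Assumed mix: 90% short jobs (mean 0.5), 10% long jobs (mean 5.5).
        mean = 0.5 if rng.random() < 0.9 else 5.5
        customers.append((t, rng.expovariate(1.0 / mean)))
    return customers


def shared_queue(customers, tellers):
    """System A: one FIFO line; the next customer takes the first free teller."""
    free_at = [0.0] * tellers                   # min-heap of teller free times
    starts = []
    for arrival, service in customers:
        start = max(arrival, heapq.heappop(free_at))
        heapq.heappush(free_at, start + service)
        starts.append(start)
    return starts


def separate_queues(customers, tellers):
    """System B: join the line with the fewest people, no switching later."""
    lines = [deque() for _ in range(tellers)]   # departure times per line
    starts = []
    for arrival, service in customers:
        for line in lines:                      # drop customers who have left
            while line and line[0] <= arrival:
                line.popleft()
        line = min(lines, key=len)              # ties go to the lowest index
        start = max(arrival, line[-1]) if line else arrival
        line.append(start + service)
        starts.append(start)
    return starts


def summarize(customers, starts):
    """Mean wait, variance of wait, and how often a later arrival starts first."""
    waits = [s - a for s, (a, _) in zip(starts, customers)]
    overtaken = sum(1 for i in range(len(starts) - 1) if starts[i] > starts[i + 1])
    return statistics.mean(waits), statistics.pvariance(waits), overtaken


rng = random.Random(0)
customers = make_customers(20_000, arrival_rate=4.0, rng=rng)
result_a = summarize(customers, shared_queue(customers, tellers=5))
result_b = summarize(customers, separate_queues(customers, tellers=5))
print("shared   mean=%.2f var=%.2f overtaken=%d" % result_a)
print("separate mean=%.2f var=%.2f overtaken=%d" % result_b)
```

Both systems see the same arrivals and service times, so any difference comes from the queue layout alone. The shared queue serves customers strictly in arrival order, so its overtaken count is zero by construction, while separate lines let a customer stuck behind a long job be passed by later arrivals. Changing the arrival rate or the service mix is a quick way to see how the gap depends on load and on service-time variability.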