You are interviewing for a data role and are asked several probability/distribution questions.
1) Bank queue choice (queueing intuition)
A bank has 5 tellers.
- Option A: 1 single line feeding the next available teller (1 queue → 5 servers).
- Option B: 5 separate lines, one per teller (5 queues → 5 servers).
Question: Which option would you choose to minimize your waiting time? Explain in terms of expected waiting time and variability, and state any assumptions you need (arrival process, service-time distribution, customer behavior like line switching).
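One way to check intuition for this question is a small simulation. The sketch below is illustrative only and rests on assumptions that are not part of the prompt: Poisson arrivals, exponential service times, identical tellers, and, for Option B, customers who pick a line uniformly at random and never switch. The function name `simulate_waits` and the rates used are hypothetical choices.

```python
import heapq
import random
import statistics


def simulate_waits(n_customers, arrival_rate, service_rate, n_tellers,
                   shared_line, seed=0):
    """Return each customer's time spent in line before service starts.

    Assumptions (not from the prompt): Poisson arrivals, exponential
    service, first-come-first-served within every line, no line switching.
    """
    rng = random.Random(seed)
    now = 0.0
    # free_at[k] is the time at which teller k finishes their current work.
    free_at = [0.0] * n_tellers  # all zeros, so it is already a valid heap
    waits = []
    for _ in range(n_customers):
        now += rng.expovariate(arrival_rate)    # next arrival time
        service = rng.expovariate(service_rate)  # this customer's service time
        if shared_line:
            # Option A: the head of the single line takes the teller
            # that becomes free earliest.
            earliest = heapq.heappop(free_at)
            start = max(now, earliest)
            heapq.heappush(free_at, start + service)
        else:
            # Option B: commit to one randomly chosen line on arrival.
            k = rng.randrange(n_tellers)
            start = max(now, free_at[k])
            free_at[k] = start + service
        waits.append(start - now)
    return waits


if __name__ == "__main__":
    # Illustrative load: 4 arrivals per minute, each teller serves 1 per minute.
    for label, shared in (("A (shared line)", True), ("B (separate lines)", False)):
        w = simulate_waits(50_000, 4.0, 1.0, 5, shared, seed=1)
        print(label, "mean wait:", statistics.mean(w),
              "variance:", statistics.pvariance(w))
```

Under these assumptions Option A corresponds to an M/M/5 queue and Option B to five independent M/M/1 queues, so the comparison covers both the mean and the variability of the wait. Allowing customers to join the shortest line or to switch lines would narrow the gap, which is why the behavioral assumption matters.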
2) Sketch common real-world distributions
(a) Heights
Sketch the distribution of:
- Adult men’s heights in the US
- Adult women’s heights in the US
- The combined distribution when you mix men and women
Describe the likely shape (e.g., unimodal/bimodal), approximate symmetry/skew, and what assumptions justify that.
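To reason about whether the combined curve is unimodal or bimodal, it can help to model each group as a normal distribution and count the peaks of the mixture. The sketch below assumes equal group sizes and a common standard deviation; the means and standard deviation in the usage lines are illustrative placeholders, not measured US values, and the helper names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, mean_a, mean_b, sd, weight_a=0.5):
    """Density of a two-group mixture with a shared standard deviation."""
    return (weight_a * normal_pdf(x, mean_a, sd)
            + (1.0 - weight_a) * normal_pdf(x, mean_b, sd))


def count_modes(mean_a, mean_b, sd, lo=120.0, hi=220.0, steps=2000):
    """Count strict local maxima of the mixture density on a grid (in cm)."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [mixture_pdf(x, mean_a, mean_b, sd) for x in xs]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])


if __name__ == "__main__":
    # Placeholder parameters in centimetres (assumed, not sourced).
    print("gap 13 cm, sd 7 cm:", count_modes(162.0, 175.0, 7.0), "peak(s)")
    print("gap 20 cm, sd 7 cm:", count_modes(160.0, 180.0, 7.0), "peak(s)")
```

For an equal-weight mixture of two normals with the same standard deviation, two peaks appear only when the means differ by more than two standard deviations. A combined height distribution can therefore look like one broad hump rather than two distinct bumps, depending on the parameters assumed.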
(b) LinkedIn connections
Let each user have a number of connections (e.g., 100, 200, 500, …).
- Sketch the distribution of number of connections per user.
- Give a plausible range for the mean (order-of-magnitude is fine) and explain what data properties drive it.
- For this distribution, compare mean vs. median vs. mode (which is largest?) and explain why.
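The mean, median, and mode comparison can be explored with a right-skewed stand-in distribution. The sketch below uses a lognormal purely as an assumption for illustration; it is not a claim about LinkedIn's actual data, and the parameters `mu` and `sigma` as well as the function names are hypothetical.

```python
import math
import random
import statistics


def lognormal_summary(mu, sigma):
    """Closed-form mode, median, and mean of a lognormal(mu, sigma)."""
    return {
        "mode": math.exp(mu - sigma ** 2),
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma ** 2 / 2.0),
    }


def sample_connections(n_users, mu, sigma, seed=0):
    """Draw hypothetical per-user connection counts, rounded to integers."""
    rng = random.Random(seed)
    return [round(rng.lognormvariate(mu, sigma)) for _ in range(n_users)]


if __name__ == "__main__":
    # Illustrative parameters only: median of 200 connections, heavy right tail.
    mu, sigma = math.log(200.0), 1.2
    print(lognormal_summary(mu, sigma))
    counts = sample_connections(20_000, mu, sigma, seed=1)
    print("sample mean:", statistics.mean(counts),
          "sample median:", statistics.median(counts))
```

In any lognormal the ordering is mode < median < mean, because a small number of very large values pulls the mean upward while leaving the median and mode largely unaffected. Real connection counts may also be capped or censored in ways this sketch ignores.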