Market-Making Estimation Game: Optimal Confidence-Interval Strategy Over 5 Rounds
Company: Optiver
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Technical Screen
You are interviewing with a proprietary trading firm, and one of the rounds is a **market-making game** played over 5 rounds. In each round, the interviewer asks you a numerical estimation question — a positive quantity whose exact value you do not know (for example, "How many windows does this office building have?"). You must respond with a closed interval $[L, U]$ (with $0 < L \le U$) that you believe contains the true answer.
Each round is scored as follows:
- If the true answer is **outside** your interval, you score **0** for that round.
- If the true answer is **inside** your interval, you score $L/U$ for that round.
A tighter interval (larger $L$ relative to $U$) earns more when you are right, but is more likely to miss entirely. For example, suppose the true answer is 16. If you quote $[20, 40]$, the answer is outside your interval and you score 0. If you quote $[10, 30]$, the answer is inside and you score $10/30 \approx 0.33$.
Your total score is the sum over the 5 rounds, and you **pass the game if your total is at least 2.0**.
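The scoring rule and pass condition can be written out directly. This is a minimal sketch; the names `round_score` and `passes` are illustrative and not part of the original problem statement.

```python
def round_score(low: float, high: float, truth: float) -> float:
    """Score one round: L/U if the truth lies in [L, U], otherwise 0."""
    if not (0 < low <= high):
        raise ValueError("interval must satisfy 0 < L <= U")
    return low / high if low <= truth <= high else 0.0


def passes(quotes, truths, target: float = 2.0) -> bool:
    """Sum the round scores and compare the total with the pass threshold."""
    total = sum(round_score(lo, hi, x) for (lo, hi), x in zip(quotes, truths))
    return total >= target


# The worked example from the text: the true answer is 16.
miss = round_score(20, 40, 16)  # interval excludes 16
hit = round_score(10, 30, 16)   # interval contains 16, scores 10/30
```

Totals that land exactly on the threshold are sensitive to floating-point rounding, so exact arithmetic (for example `fractions.Fraction`) is the safer choice for boundary cases.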
Analyze this game and explain how you would play it.
### Constraints & Assumptions
- There are exactly 5 rounds; the pass condition is on the **sum** of round scores (total $\ge 2.0$), with no per-round minimum.
- All quantities being estimated are positive, so intervals with $L > 0$ are always meaningful and the score on a hit, $L/U$, lies in $(0, 1]$.
- You do not know any answer exactly; model your uncertainty about each answer as a subjective probability distribution (your belief).
- Assume the 5 questions are of comparable difficulty and your belief quality is similar across rounds, unless you choose to relax this.
- For the adaptive-play discussion, assume you learn whether your interval contained the answer (and hence your running total) after each round.
- You cannot skip a round; you must quote an interval every round.
### Clarifying Questions to Ask
- Is the target a total of 2.0 across all rounds, or is there also a per-round requirement?
- Do I find out whether my interval contained the answer (and my score) after each round, or only at the end?
- Are the answers always strictly positive quantities, so that the $L/U$ score is well defined?
- May I quote a degenerate interval $L = U$ (a point guess scoring 1.0 if exactly right), and are non-integer endpoints allowed?
- Are the 5 questions of similar difficulty, or should I expect some where my uncertainty spans orders of magnitude?
### Part 1: The scoring rule and the required pace
Verify the scoring rule on the example above. Then work out what the 2.0 target actually demands: what average score per round do you need? Notice that when you are right, your score depends only on the **ratio** $U/L$ — so translate the required score into a required interval tightness, and show how that requirement changes depending on how many of the 5 rounds you expect to hit.
```hint Work backwards from the target
$2.0$ over 5 rounds is an average of $0.4$ per round if you never miss. A score of $0.4$ means $L/U = 0.4$, i.e. $U = 2.5\,L$. Now redo the calculation assuming you miss one round, then two — what ratio must the remaining hits achieve?
```
#### What This Part Should Cover
- Correct application of the rule: 0 on a miss, $L/U$ on a hit, as in the worked example.
- The observation that the payoff depends only on the multiplicative ratio $U/L$, not the absolute width or location of the interval.
- A concrete translation of the 2.0 target into per-hit score and maximum interval ratio under different hit counts (5, 4, 3 hits).
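The pace arithmetic for each hit count can be tabulated as follows. This is a sketch that assumes every hit is quoted at the same ratio; `required_pace` is a hypothetical helper name, and exact fractions are used so that boundary values such as $2/3$ are not blurred by rounding.

```python
from fractions import Fraction

TARGET = Fraction(2)  # pass threshold from the problem statement


def required_pace(hits: int, target: Fraction = TARGET):
    """Return (min score per hit, max ratio U/L) when exactly `hits` rounds
    hit and all hits use the same ratio; None if the target is unreachable."""
    per_hit = target / hits
    if per_hit > 1:
        return None  # a single hit can score at most 1
    return per_hit, 1 / per_hit


for hits in range(5, 0, -1):
    pace = required_pace(hits)
    if pace is None:
        print(f"{hits} hit(s): target unreachable")
    else:
        score, ratio = pace
        print(f"{hits} hits: score per hit >= {score}, ratio U/L <= {ratio}")
```

With only two hits, each must score 1, which requires a point quote $L = U$; a single hit can never reach the target.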
### Part 2: Choosing the best interval for a single round
Now fix a single round. Model your uncertainty about the true answer $X$ as a belief distribution. Write down the expected score of quoting $[L, U]$, and show that the choice decomposes into two sub-problems: **where to place** the interval and **how wide** to make it. Characterize the optimal placement and the optimal width, and explain what role your calibration (how well you know your own uncertainty) plays.
```hint Expected score
$\mathbb{E}[\text{score}] = \frac{L}{U} \cdot P(L \le X \le U)$. Substitute $\ell = \ln L$, $u = \ln U$: the payoff factor becomes $e^{-(u-\ell)}$, a function of the **log-width** alone. This game is multiplicative — work in log space.
```
```hint Fix the width first
For a fixed log-width $w$, placement only affects the coverage term, so the best placement is the highest-probability window of width $w$ for $\ln X$. Then you are left with a one-dimensional trade-off: maximize $e^{-w} \, p(w)$ over $w$, where $p(w)$ is the best achievable coverage at width $w$.
```
#### What This Part Should Cover
- A correct expected-score expression and the log-space reformulation (payoff $e^{-w}$ vs coverage $p(w)$).
- Optimal placement as the highest-probability window of log-width $w$ under the belief; for beliefs that are symmetric and unimodal in log space, this is the interval geometrically centered on the median $m$, i.e. $[m e^{-w/2},\, m e^{w/2}]$.
- The width trade-off: tighter quotes raise the payoff on a hit but lower coverage, with the optimum balancing the two.
- The insight that the achievable score is capped by how uncertain you actually are — the game rewards calibration, not bravado.
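To make the width trade-off concrete, here is a minimal sketch under one specific assumption that the problem itself does not impose: the belief about $\ln X$ is normal with standard deviation `sigma`, so the best window of log-width $w$ is centered on the median and has coverage $\operatorname{erf}\!\big(w / (2\sqrt{2}\,\sigma)\big)$. The function names and the grid search are illustrative choices.

```python
import math


def coverage(w: float, sigma: float) -> float:
    """P(ln X falls in a centered window of log-width w), ASSUMING ln X is normal."""
    return math.erf(w / (2.0 * math.sqrt(2.0) * sigma))


def expected_score(w: float, sigma: float) -> float:
    """Payoff on a hit, e^{-w}, times the probability of a hit."""
    return math.exp(-w) * coverage(w, sigma)


def best_width(sigma: float, grid: int = 2000, w_max: float = 10.0) -> float:
    """Grid-search the log-width that maximizes the expected score."""
    widths = [w_max * i / grid for i in range(1, grid + 1)]
    return max(widths, key=lambda w: expected_score(w, sigma))


def quote(median: float, sigma: float):
    """Geometrically centered interval [m e^{-w/2}, m e^{w/2}] at the best width."""
    w = best_width(sigma)
    return median * math.exp(-w / 2.0), median * math.exp(w / 2.0)


for sigma in (0.2, 0.5, 1.0):
    w = best_width(sigma)
    print(f"sigma={sigma}: log-width {w:.3f}, expected score {expected_score(w, sigma):.3f}")
```

Under this assumption the best attainable expected score falls as `sigma` grows, which is the sense in which the score is capped by how uncertain you actually are. Note that this maximizes the expected score of one round, which is not the same objective as passing the 5-round threshold.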
### Part 3: Playing the full 5-round game to reach 2.0
The pass condition is a threshold: what matters is $P(\text{total} \ge 2.0)$, not the expected total. First analyze **fixed strategies**: if you quote the same ratio $r = U/L$ every round (scoring $1/r$ per hit; $r = e^{w}$ in the log-width notation of Part 2) with per-round hit probability $p(r)$, how many hits do you need and what is your pass probability? Compare a few tightness levels. Then describe **adaptive play**: given that you learn your score after each round, how should your tightness respond to being ahead of or behind the required pace? Sketch a dynamic-programming formulation of the optimal adaptive strategy.
```hint Threshold vs expectation
Maximizing the expected score is not the same objective as maximizing the probability of passing. With per-hit score $s = 1/r$ you need at least $\lceil 2/s \rceil$ hits out of 5, so the pass probability is a binomial tail $P(\mathrm{Bin}(5, p) \ge \lceil 2/s \rceil)$. Compare $s = 0.4$ (must hit 5/5), $s = 0.5$ (4/5), and $s = 2/3$ (3/5).
```
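The fixed-strategy comparison can be sketched as below. The binomial tail follows the hint directly; the mapping from tightness to hit probability in `calibrated_hit_prob` is an added assumption (a normal belief on $\ln X$ with standard deviation `sigma`), and all function names are illustrative. Exact fractions keep $\lceil 2/s \rceil$ from being pushed up by rounding when $s = 2/3$.

```python
import math
from fractions import Fraction


def hits_needed(per_hit: Fraction, target: Fraction = Fraction(2)) -> int:
    """Smallest number of hits whose scores sum to at least the target."""
    return math.ceil(target / per_hit)


def pass_probability(per_hit: Fraction, hit_prob: float, rounds: int = 5) -> float:
    """Binomial tail P(Bin(rounds, hit_prob) >= hits_needed(per_hit))."""
    k = hits_needed(per_hit)
    if k > rounds:
        return 0.0
    return sum(
        math.comb(rounds, j) * hit_prob**j * (1.0 - hit_prob) ** (rounds - j)
        for j in range(k, rounds + 1)
    )


def calibrated_hit_prob(per_hit: Fraction, sigma: float) -> float:
    """ASSUMPTION: ln X is normal with std `sigma`; the log-width is -ln(score)."""
    w = -math.log(float(per_hit))
    return math.erf(w / (2.0 * math.sqrt(2.0) * sigma))


sigma = 0.5  # assumed belief spread in log space, for illustration only
for s in (Fraction(2, 5), Fraction(1, 2), Fraction(2, 3)):
    p = calibrated_hit_prob(s, sigma)
    print(f"score {s}: need {hits_needed(s)}/5 hits, "
          f"hit prob {p:.3f}, pass prob {pass_probability(s, p):.3f}")
```

Which tightness level wins depends on `sigma`, so the loop is worth rerunning for a few values rather than trusting a single setting.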
```hint State-based play
Define $V(k, t)$ = best achievable pass probability with $k$ rounds left and $t$ score still needed, with terminal condition $V(0, t) = 1$ if $t \le 0$ and $V(0, t) = 0$ otherwise. Each round you pick a log-width $w$: $V(k, t) = \max_w \left[ p(w)\, V(k-1,\, t - e^{-w}) + (1 - p(w))\, V(k-1,\, t) \right]$. Think about what the step-shaped terminal condition implies: is there ever value in quoting tighter than the state requires?
```
#### What This Part Should Cover
- Clear separation of the two objectives (expected score vs probability of clearing 2.0) and why variance matters for a threshold.
- A quantitative fixed-strategy comparison: per-hit score, required hit count, binomial pass probability.
- Adaptive logic: widen and lock in safe points when ahead; tighten and accept more risk when behind; no benefit to over-scoring beyond what the remaining target requires.
- A correct DP/state formulation with sensible boundary conditions.
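The recursion for $V(k, t)$ can be sketched with memoization. Several choices here are assumptions made only for illustration: per-hit scores are restricted to a grid of multiples of $1/20$, the hit probability comes from a normal belief on $\ln X$ with standard deviation `SIGMA`, and the names `value` and `best_per_hit_score` are hypothetical.

```python
import math
from fractions import Fraction
from functools import lru_cache

SIGMA = 0.5  # ASSUMED belief spread in log space
SCORES = [Fraction(i, 20) for i in range(1, 21)]  # ASSUMED action grid for L/U


def hit_prob(score: Fraction) -> float:
    """Coverage of the best window with log-width -ln(score), normal belief assumed."""
    w = -math.log(float(score))
    return math.erf(w / (2.0 * math.sqrt(2.0) * SIGMA))


def _continuation(rounds_left: int, needed: Fraction, s: Fraction) -> float:
    """Pass probability if we quote per-hit score s now and play optimally after."""
    p = hit_prob(s)
    return (p * value(rounds_left - 1, needed - s)
            + (1.0 - p) * value(rounds_left - 1, needed))


@lru_cache(maxsize=None)
def value(rounds_left: int, needed: Fraction) -> float:
    """V(k, t): best achievable pass probability from this state."""
    if needed <= 0:
        return 1.0  # target already reached; scores never decrease
    if rounds_left == 0 or needed > rounds_left:
        return 0.0  # out of rounds, or unreachable since each hit scores <= 1
    return max(_continuation(rounds_left, needed, s) for s in SCORES)


def best_per_hit_score(rounds_left: int, needed: Fraction) -> Fraction:
    """Grid action that attains V(k, t); requires rounds_left >= 1."""
    return max(SCORES, key=lambda s: _continuation(rounds_left, needed, s))


print(value(5, Fraction(2)), best_per_hit_score(5, Fraction(2)))
```

In the last round the step-shaped terminal condition makes the loosest quote that still clears the remaining target the best choice, which can be checked with `best_per_hit_score(1, Fraction(1, 2))`.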
### What a Strong Answer Covers
Across all parts, the interviewer is evaluating quantitative judgment under uncertainty, not just algebra:
- Connecting the game to market making: quoting $[L, U]$ is quoting a two-sided market around fair value — a tight spread earns more but gets run over when you are wrong, a wide spread is safe but earns little.
- Treating your own uncertainty as an explicit distribution and reasoning multiplicatively (log space, geometric centering) rather than anchoring on a single point guess.
- Risk management against a pass/fail threshold: trading expected value against variance, and adapting risk to the running score.
- Fast, explicit mental arithmetic — ratios, required paces, and simple binomial tails computed live under interview pressure.
### Follow-up Questions
- How does your strategy change if the game is 10 rounds with a target of 4.0? Does a longer horizon favor tighter or wider quotes, and why?
- Suppose for one question your uncertainty spans several orders of magnitude (you genuinely don't know if the answer is 1,000 or 1,000,000). What does the $L/U$ rule imply about your attainable score, and what should you quote?
- If you were shown all 5 questions up front before quoting any intervals, how would you allocate risk across them?
- Adaptive play says to tighten when behind. What real-world trading failure mode does "tightening your quotes to catch up" correspond to, and how do desks guard against it?